Open cpis7 opened 5 months ago
Firstly, we perform pre-training at a resolution of 512x512. Then, we employ a multi-scale strategy for fine-tuning.
Thank you for answering! Can you please tell me which scales you used? Did you use the multi-scale strategy described in the SDXL paper?
3 stages:
(1) 512x512
(2) buckets = [ [768, 768], [960, 640], [640, 960], [768, 896], [896, 768], [768, 832], [832, 768], [768, 960], [960, 768], [768, 1024], [1024, 768], [704, 1024], [1024, 704], [1024, 640], [640, 1024] ]
(3) buckets = [ [1024, 1024], [768, 1280], [1280, 768], [832, 1216], [1126, 832], [832, 1152], [896, 1152], [1152, 896], [1152, 832], [960, 1088], [1088, 960], [896, 1088], [1088, 896], [960, 1024], [1024, 960] ]
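For readers unfamiliar with bucketed multi-scale training, here is a minimal sketch of how an image might be assigned to one of the stage (2) buckets by nearest aspect ratio. This is not the authors' data loader: the function name `nearest_bucket` is hypothetical, and reading each pair as [width, height] is an assumption, since the reply does not say which order is used.

```python
import math

# Stage (2) buckets copied from the reply above.
# ASSUMPTION: each pair is (width, height); the thread does not state the order.
STAGE2_BUCKETS = [
    (768, 768), (960, 640), (640, 960), (768, 896), (896, 768),
    (768, 832), (832, 768), (768, 960), (960, 768), (768, 1024),
    (1024, 768), (704, 1024), (1024, 704), (1024, 640), (640, 1024),
]


def nearest_bucket(width, height, buckets=STAGE2_BUCKETS):
    """Return the bucket whose aspect ratio is closest to the image's.

    Ratios are compared in log space so that "twice as wide" and
    "twice as tall" count as equally far from square.
    Hypothetical helper, not part of any library.
    """
    target = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - target))


# A 3:2 landscape photo maps to the 960x640 bucket.
chosen = nearest_bucket(1920, 1280)
```

In a real pipeline the image would then be resized and cropped to the chosen bucket, and each batch would be drawn from a single bucket so that all tensors in it share one shape.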
How much memory do you use when training with SDXL?
Using 8x A100 GPUs (40GB each), the batch size can be 8*8 at 512x512.
If I train at 256, will it be much less effective?
I haven't tested it, but I think you should train at 512 or higher.
Why doesn't the memory footprint drop significantly when I reduce the image size? Even with --resolution=8 and --train_batch_size=1, it still fails with:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB (GPU 0; 23.70 GiB total capacity; 21.79 GiB already allocated; 7.56 MiB free; 22.43 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
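The error message points at `max_split_size_mb`, so here is a minimal sketch of setting it through `PYTORCH_CUDA_ALLOC_CONF`. The helper name `configure_cuda_allocator` and the value 128 are illustrative assumptions, not recommendations from this thread. The variable has to be set before PyTorch makes its first CUDA allocation, for example at the very top of the training script or in the shell that launches it.

```python
import os


def configure_cuda_allocator(max_split_size_mb=128):
    """Set PYTORCH_CUDA_ALLOC_CONF for the current process.

    Hypothetical helper; 128 is an arbitrary example value.
    Call this before anything allocates CUDA memory.
    """
    value = f"max_split_size_mb:{max_split_size_mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


configure_cuda_allocator(128)
```

This setting only addresses fragmentation. In the log above, reserved memory (22.43 GiB) is barely larger than allocated memory (21.79 GiB), so fragmentation is probably not the main cause here. Memory that stays nearly constant as the resolution shrinks is more likely taken up by model weights, gradients, and optimizer state, which do not depend on image size.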
SDXL was trained at 1024, so how did you train on 512 images, which are out of distribution? Did you add a LoRA to the UNet to support low-res images? If so, did you train the IP-Adapter and the LoRA together?
Pretrain at 512x512 and then fine-tune at 1024x1024.
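As a rough illustration of why pretraining at 512x512 is cheaper per sample than training at 1024x1024, the sketch below computes the latent grid size the UNet sees. It assumes the usual 8x spatial downscale of the SD/SDXL VAE; the helper names are hypothetical, and the ratio only counts latent positions, it is not a measured speedup.

```python
VAE_DOWNSCALE = 8  # spatial downscale factor of the SD/SDXL VAE


def latent_shape(width, height, factor=VAE_DOWNSCALE):
    """Return the (width, height) of the latent grid for an image size."""
    if width % factor or height % factor:
        raise ValueError("image size must be divisible by the VAE factor")
    return (width // factor, height // factor)


def latent_positions(width, height):
    """Number of spatial positions the UNet processes per sample."""
    w, h = latent_shape(width, height)
    return w * h


low = latent_positions(512, 512)     # 64 x 64 latent grid
high = latent_positions(1024, 1024)  # 128 x 128 latent grid
ratio = high / low                   # 4x more positions at 1024x1024
```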
Thank you for your great contribution! I have a question about training on the SDXL model. Did you use 1024x1024 images to train the SDXL IP-Adapter model, since the pretrained SDXL model was trained on 1024x1024 images? Or did you use 512x512 images? Thank you.