Open bschroedr opened 10 months ago
I am trying to run the inference code (sampler), and the loading of the pretrained models fails. I tried some of the suggestions from another submitted issue, but ran into a different error with the vq-f8 model:
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 395.77 M params.
Keeping EMAs of 630.
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
making attention of type 'vanilla' with 512 in_channels
Restored from pretrained/vq-f8-model.ckpt with 0 missing and 49 unexpected keys
_pickle.UnpicklingError: invalid load key, '<'.
Can you suggest what to do? I am using the code available as of today.
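For what it's worth, pickle's "invalid load key, '<'" almost always means the file being loaded starts with '<', i.e. it is an HTML page (for example a download error or login page) rather than a real checkpoint. A minimal sketch (not part of the repo) to check a downloaded file:

```python
# Sketch: detect whether a "checkpoint" is actually a saved HTML page.
# pickle raises "invalid load key, '<'" when the first byte it reads is '<',
# which is typical of an HTML error/login page saved by the downloader.
def looks_like_html(path: str) -> bool:
    with open(path, "rb") as f:
        head = f.read(64).lstrip()
    return head.startswith(b"<")

# Usage with the path from the log above:
# if looks_like_html("pretrained/vq-f8-model.ckpt"):
#     print("Corrupted download; re-fetch the checkpoint.")
```

If the check fires, re-downloading the file (and verifying its size against the release page) is usually the fix.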
Please train the model first, then you can run the inference code.
vq-f8 is used to embed images into the latent space; it is not the latent diffusion model itself. You need to train the latent diffusion model first, as described in README.MD.
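To illustrate the role of vq-f8: it only compresses images into latents at downsampling factor 8, and the diffusion model is then trained on those latents. The shape arithmetic below (plain Python, no model; input resolution assumed to be 256) matches the log line above:

```python
# vq-f8 is a first-stage autoencoder: it downsamples each spatial dimension
# by a factor of 8 and maps images into a 4-channel latent space.
f = 8                                # downsampling factor of vq-f8
h = w = 256                          # assumed input resolution
z_channels = 4                       # latent channels, per the log above
z_shape = (1, z_channels, h // f, w // f)
print(z_shape)                       # (1, 4, 32, 32)
assert z_channels * (h // f) * (w // f) == 4096  # "= 4096 dimensions"
```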
The previous error, "with 0 missing and 49 unexpected keys _pickle.UnpicklingError: invalid load key, '<'.", was due to a corrupted download of the sip_vg.pt model. Once I downloaded it again, the problem went away; however, I get the following error when training with config_vg.yaml:
Traceback (most recent call last):
  File "trainer.py", line 382, in <module>
    model = instantiate_from_config(config.model)
AttributeError: 'int' object has no attribute 'strip'
Please check that the versions of all installed packages in your conda environment are consistent with the sgdiff.yaml we provide. We just re-trained the model on a 3090Ti (CUDA 11.4) and did not encounter similar errors.
I installed the project exactly as specified, so it wasn't a versioning issue. The issue was the following lines of code in trainer.py (line 518):
if not cpu:
    ngpu = len(lightning_config.trainer.gpus.strip(",").split(','))
else:
    ngpu = 1
This code fails with the error I mentioned earlier when only one GPU is specified, because the value is then parsed as an integer rather than a comma-separated string. The workaround is to append a trailing comma to the GPUs flag:
python trainer.py --base ./config_vg.yaml -t --gpus 1,
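To make the failure mode concrete, here is a minimal reproduction (outside the repo) of why the trailing comma matters, assuming the config parser turns `1` into an int but keeps `1,` as a string:

```python
# `--gpus 1` reaches trainer.py as the int 1; ints have no .strip(),
# hence "AttributeError: 'int' object has no attribute 'strip'".
# `--gpus 1,` stays a string, so the original parsing code works.
gpus_as_str = "1,"
ngpu = len(gpus_as_str.strip(",").split(","))
print(ngpu)  # 1

gpus_as_int = 1
try:
    gpus_as_int.strip(",")
except AttributeError as exc:
    print(exc)  # 'int' object has no attribute 'strip'
```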
That solved it for me. However, I can't train the model on my two 12 GB GPUs because I get CUDA out-of-memory errors:
RuntimeError: CUDA out of memory. Tried to allocate 252.00 MiB (GPU 0; 11.92 GiB total capacity; 10.66 GiB already allocated; 173.12 MiB free; 10.71 GiB reserved in total by PyTorch)
If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I've tried reducing the batch size to 4 images and still have this issue. I am not sure what to do - do you have any suggestions?
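Not an official fix, but one thing the error message itself points at is the allocator's `max_split_size_mb` setting; a sketch of how that could be combined with a two-GPU launch (flag syntax assumed from the trailing-comma workaround above, PyTorch >= 1.10 assumed):

```shell
# Reduce allocator fragmentation, as suggested by the OOM message
# (the value 128 is a guess; tune it for your workload):
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Then launch training across both 12 GB GPUs (note the trailing comma):
python trainer.py --base ./config_vg.yaml -t --gpus 0,1,
```

If that is not enough, lowering the image resolution or the model's channel counts in config_vg.yaml would be the next thing to try.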
Could you make a pretrained model available?