FoundationVision / LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
https://arxiv.org/abs/2406.06525
MIT License
1.19k stars 46 forks

KeyError: 'optimizer' #45

Open sugary199 opened 1 month ago

sugary199 commented 1 month ago

@PeizeSun Hi, I ran into the same problem when fine-tuning the t2i model:

bash scripts/tokenizer/train_vq_finetune_continue.sh --cloud-save-path /data/vjuicefs_ai_camera_llm/11170092/00proj/LlamaGen/cloud_save --data-path /data/vjuicefs_ai_camera_llm/public_data/vivo_internal_data/AIPortrait/crop_imgs_ffhqcrop_png/20230703_fourth_fix-1_625 --image-size 256 --vq-model VQ-16 --dataset coco --global-batch-size 32

Does vq_ds16_t2i.pt also need to be updated?

Traceback (most recent call last):
  File "/data/vjuicefs_ai_camera_llm/11170092/00proj/LlamaGen/tokenizer/tokenizer_image/vq_train.py", line 320, in <module>
    main(args)
  File "/data/vjuicefs_ai_camera_llm/11170092/00proj/LlamaGen/tokenizer/tokenizer_image/vq_train.py", line 150, in main
    optimizer.load_state_dict(checkpoint["optimizer"])
KeyError: 'optimizer'
sugary199 commented 1 month ago

Or did I make a mistake somewhere? I would be very grateful if you could give me some help!

PeizeSun commented 1 month ago

Hi~ This is not a mistake on your side. The error occurs because our released vq_ds16_t2i.pt doesn't include optimizer parameters, so the checkpoint has no "optimizer" key.
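
One possible workaround, offered as a minimal sketch and not as the repository's official fix: guard the optimizer restore so that a weights-only checkpoint falls back to a freshly initialised optimizer. The helper name `restore_optimizer_state` is hypothetical; the only thing taken from the traceback is the `optimizer.load_state_dict(checkpoint["optimizer"])` call and the "optimizer" key. It assumes the checkpoint is a plain dict, as returned by `torch.load`.

```python
def restore_optimizer_state(optimizer, checkpoint, key="optimizer"):
    """Restore optimizer state if the checkpoint carries it.

    Hypothetical helper (not part of LlamaGen). `optimizer` is any object
    with a `load_state_dict` method, e.g. a torch.optim optimizer, and
    `checkpoint` is the dict loaded from the .pt file.

    Returns True if state was restored, False if the key was missing.
    """
    state = checkpoint.get(key)
    if state is None:
        # Weights-only checkpoint: keep the freshly created optimizer
        # instead of raising KeyError.
        return False
    optimizer.load_state_dict(state)
    return True


# Assumed usage inside a training script:
#   checkpoint = torch.load(path_to_checkpoint, map_location="cpu")
#   restored = restore_optimizer_state(optimizer, checkpoint)
#   if not restored:
#       print("No optimizer state in checkpoint; starting optimizer from scratch.")
```

With a fresh optimizer, Adam-style moment estimates start from zero, so the first steps of fine-tuning may behave differently from a true resume; a lower initial learning rate or a short warmup may be worth trying.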

sugary199 commented 1 month ago

Thank you for the response! Are there any plans to release an updated checkpoint that includes the optimizer parameters? Alternatively, is it possible to train without these parameters? Any guidance would be extremely helpful for my work.