Issues in loading checkpoints of embeddings & spatial encoder

KenLUoHere commented 1 month ago

My training was interrupted and I wanted to resume it from the checkpoint, so I ran main.py with the command below:

CUDA_VISIBLE_DEVICES=0 python main.py --spatial_encoder_embedding ./logs/anomaly-checkpoints/checkpoints/spatial_encoder.pt --data_enhance --base configs/latent-diffusion/txt2img-1p4B-finetune-encoder+embedding.yaml -t --actual_resume models/ldm/text2img-large/model.ckpt -n test --gpus 0 --init_word anomaly --mvtec_path ./dataset/mvtec_anomaly_detection --embedding_manager_ckpt ./logs/anomaly-checkpoints/checkpoints/embeddings.pt

However, it keeps reporting that there are 2 devices holding tensors, at line 871, trainer.fit(model, data), which confuses me. The error is shown below:

KenLUoHere commented 1 month ago

I guess it's a checkpoint problem; maybe I didn't load it correctly. If you see anything wrong with the format of my command, or anything I missed, please help. Many thanks.

sjtuplayer commented 1 week ago

It seems that parts of your model or ckpt are on the CPU and others are on CUDA. You can check for that.
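
One way to run that check is to walk the loaded checkpoint object and count which device each tensor lives on. The helper below is only a sketch: `collect_devices` is a hypothetical name, not part of this repository, and it assumes the checkpoint is some nesting of dicts, lists, tuples, tensors, and modules. It relies only on the `.device` attribute that tensors expose and the `state_dict()` method that modules expose, so it does not import PyTorch itself.

```python
from collections import Counter


def collect_devices(obj, found=None):
    """Recursively count the devices of tensor-like values in a checkpoint object.

    Hypothetical helper for debugging; anything with a `.device` attribute is
    treated as a tensor, anything with a `state_dict()` method as a module.
    """
    if found is None:
        found = Counter()

    device = getattr(obj, "device", None)
    if device is not None:
        # Tensor-like value: record where it lives, e.g. "cpu" or "cuda:0".
        found[str(device)] += 1
    elif isinstance(obj, dict):
        for value in obj.values():
            collect_devices(value, found)
    elif isinstance(obj, (list, tuple)):
        for value in obj:
            collect_devices(value, found)
    elif callable(getattr(obj, "state_dict", None)):
        # Module-like value: inspect its parameters and buffers.
        collect_devices(obj.state_dict(), found)
    # Anything else (ints, strings, ...) carries no device and is skipped.
    return found
```

For example, passing the object returned by `torch.load` for embeddings.pt or spatial_encoder.pt to `collect_devices` shows whether the saved tensors sit on "cpu", on "cuda:0", or on a mix of both. If the files turn out to hold CUDA tensors from another device, loading them with `torch.load(path, map_location="cpu")` and moving the model to the target device afterwards is a common way to get everything onto one device. Whether that is the actual cause here is not confirmed in this thread.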

sjtuplayer / anomalydiffusion, issue #66