gy65896 / OneRestore

[ECCV 2024] OneRestore: A Universal Restoration Framework for Composite Degradation
https://gy65896.github.io/projects/ECCV2024_OneRestore/index.html

The GPUs for training Text/Visual Embedder #5

Closed Cyyyyyb closed 2 months ago

Cyyyyyb commented 3 months ago

Hi @gy65896

Thanks for your nice work.

I am trying to reproduce your method from scratch.

I would like to know which GPUs the text/visual embedder was trained on.

Best regards and many thanks,

gy65896 commented 3 months ago

Thanks for your attention!

Since it consumes much less GPU memory than the restoration model, we just used a single 3090 GPU to train the text and visual embedders.

Cyyyyyb commented 3 months ago

Got it, thanks for the quick response!

However, when I trained the text and visual embedders on a single 3090 GPU, I got an error and would like to ask you for help.

Best regards and many thanks,

I used this command to run the code: python train_Embedder.py --train-dir ./data/CDD-11_train --test-dir ./data/CDD-11_test --check-dir ./ckpts --batch 256 --num-workers 0 --epoch 200 --lr 1e-4 --lr-decay 50

[Screenshot of the error attached: Snipaste_2024-08-05_20-32-11]
gy65896 commented 2 months ago

Have you tried setting the batch size to 128? I may have used that set of parameters on another device.
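(For reference, and assuming the other arguments stay the same as in the command posted above, that would be: python train_Embedder.py --train-dir ./data/CDD-11_train --test-dir ./data/CDD-11_test --check-dir ./ckpts --batch 128 --num-workers 0 --epoch 200 --lr 1e-4 --lr-decay 50)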

Zhong-Polyu commented 2 months ago

Hi, I encountered the same problem. I found it was caused by the memory cost of the validation stage. Adding `with torch.no_grad():` after `model.eval()` solved it!
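For reference, a minimal sketch of what that fix looks like in a generic PyTorch validation loop; the names model, val_loader, and criterion are placeholders and not necessarily the identifiers used in train_Embedder.py:

```python
import torch

# Minimal sketch of the fix; model, val_loader, and criterion are placeholders.
model.eval()                        # switch to evaluation mode
with torch.no_grad():               # do not build the autograd graph during validation
    val_loss = 0.0
    for inputs, targets in val_loader:
        inputs, targets = inputs.cuda(), targets.cuda()
        outputs = model(inputs)
        val_loss += criterion(outputs, targets).item()
print(f"validation loss: {val_loss / len(val_loader):.4f}")
```

Without torch.no_grad(), each validation forward pass stores the intermediate activations needed for a backward pass that never happens, which raises peak GPU memory and can trigger the out-of-memory error at the validation stage.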

gy65896 commented 2 months ago

Thank you for pointing this out. I will update the code.