w1oves / Rein

[CVPR 2024] Official implementation of <Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation>
https://zxwei.site/rein
GNU General Public License v3.0
250 stars, 21 forks

Train frozen DINOv2 #15

Closed: seabearlmx closed this issue 7 months ago

seabearlmx commented 7 months ago

Hello, I used the command "python tools/train.py configs/dinov2/dinov2_mask2former_512x512_bs1x4.py" to train with a frozen DINOv2 backbone.

Because of GPU limitations, I set the batch size to 1 and trained the model on three 1080 Ti GPUs.

Then I tested the trained model with each of the following commands:

1. "python tools/test.py configs/dinov2/dinov2_mask2former_512x512_bs1x4.py work_dirs/dinov2_mask2former_512x512_bs1x4/iter_40000.pth --backbone checkpoints/dinov2_converted.pth"
2. "python tools/test.py configs/dinov2/dinov2_mask2former_512x512_bs1x4.py work_dirs/dinov2_mask2former_512x512_bs1x4/iter_40000.pth --backbone checkpoints/dinov2_vitl14_pretrain.pth"
3. "python tools/test.py configs/dinov2/dinov2_mask2former_512x512_bs1x4.py work_dirs/dinov2_mask2former_512x512_bs1x4/iter_40000.pth"

However, the results are poor. The first two commands do not work for testing the model at all: the mIoU is 0.0x on every dataset.

With the last command, the mIoU is 30.3%, 22.8%, and 35.1% on Cityscapes, BDD, and Mapillary, respectively.

So, could you please tell me whether I used the wrong test command, or whether I need to change the number of training iterations because the batch size is now smaller? If the number of training iterations needs to change, which other related parameters, such as the learning rate, should also be changed?

w1oves commented 7 months ago

I'll take on fixing this bug. However, I need some time, approximately two days, to resolve it.

w1oves commented 7 months ago

The logs from these test runs would help me.

seabearlmx commented 7 months ago

Here is the testing log for the first command: 20240327_230048.log

seabearlmx commented 7 months ago

Here is the testing log for the second command: 20240327_234516.log

seabearlmx commented 7 months ago

Here is the testing log for the last command: 20240327_233955.log

w1oves commented 7 months ago

I have uploaded a config for training with a frozen DINOv2-Large backbone here: configs/frozen_vfms/dinov2-L_mask2former.py. Use this config so that model.type is set to FrozenBackboneEncoderDecoder. With it, you should get correct results even with limited GPU resources.

seabearlmx commented 7 months ago

Thank you so much!

w1oves commented 7 months ago

In this case, you should run test.py without the '--backbone' argument.