dyabel / detpro

Apache License 2.0

train time for vild #19

Closed ghost closed 2 years ago

ghost commented 2 years ago

I used the following script and config to reproduce ViLD on 8 32 GB A100 GPUs with batch size 24 (8×3): ./tools/slurm_train.sh a100 vild configs/lvis/detpro_ens_20e.py workdirs/vild_ens_20e_fg_bg_5_10_end --cfg-options model.roi_head.load_feature=True

In the issue, you claim only 0.75 s per iteration, but for me it is 6 s. At that rate, 20 epochs would take about 30 days.
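For scale, here is a back-of-the-envelope sketch using only the figures quoted above (6 s vs. 0.75 s per iteration, roughly 30 days for 20 epochs). The iteration count is inferred from those rounded numbers rather than read from the training log, so treat it as an estimate; the function names are purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def implied_iterations(total_days, sec_per_iter):
    """Number of iterations implied by a wall-clock estimate and a per-iteration time."""
    return total_days * SECONDS_PER_DAY / sec_per_iter


def projected_days(total_iters, sec_per_iter):
    """Wall-clock days needed to run total_iters at the given per-iteration time."""
    return total_iters * sec_per_iter / SECONDS_PER_DAY


# Figures quoted in this thread (the 30 days is approximate).
observed_days = 30.0
observed_sec_per_iter = 6.0
reported_sec_per_iter = 0.75

total_iters = implied_iterations(observed_days, observed_sec_per_iter)
fast_days = projected_days(total_iters, reported_sec_per_iter)
slowdown = observed_sec_per_iter / reported_sec_per_iter

print(f"implied iterations for 20 epochs: {total_iters:.0f}")
print(f"projected time at 0.75 s/iter:    {fast_days:.2f} days")
print(f"slowdown factor:                  {slowdown:.1f}x")
```

In other words, the same schedule at the reported speed would finish in under four days, so the gap is a factor of eight rather than a small tuning difference.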

dyabel commented 2 years ago

I guess you have a path wrong somewhere. Can you check whether lvis_clip_image_embedding.zip has been loaded successfully?

ghost commented 2 years ago

I put lvis_clip_image_embedding.zip under ./data and unzipped it to /data/lvis_clip_image_embedding, so the files look like home/detpro/data/lvis_clip_image_embedding/train2017/000000000030.pth

@dyabel

dyabel commented 2 years ago

Then there should be no problem. The only reason I can think of for it taking so long is that the pre-extracted embeddings are not being loaded correctly, in which case the code falls back to running the CLIP forward pass online.
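A quick way to rule out a path problem is to check, from the directory the training job is launched in, that the embedding files are actually where they are expected. The sketch below is only a file-system check and does not use any detpro code; the `check_embeddings` helper is hypothetical, and the directory and file names are taken from the example path given earlier in this thread. How the training code itself resolves the embedding directory is not shown here, so the root passed in is an assumption.

```python
from pathlib import Path


def check_embeddings(root, split="train2017", sample_name=None):
    """Report whether <root>/<split> exists and how many .pth files it holds.

    Hypothetical helper: it inspects the file system only and does not
    load or validate the tensors stored in the files.
    """
    split_dir = Path(root) / split
    if not split_dir.is_dir():
        return {"path": str(split_dir), "exists": False, "num_files": 0, "sample_found": False}
    num_files = sum(1 for _ in split_dir.glob("*.pth"))
    sample_found = sample_name is not None and (split_dir / sample_name).is_file()
    return {
        "path": str(split_dir.resolve()),
        "exists": True,
        "num_files": num_files,
        "sample_found": sample_found,
    }


if __name__ == "__main__":
    # Relative root, resolved against the current working directory (assumed to be the repo root).
    report = check_embeddings("data/lvis_clip_image_embedding", sample_name="000000000030.pth")
    print(report)
```

Since both ./data and /data appear above, it may be worth running the check against each root to see which one actually contains the files.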