Closed by 4-0-4-notfound 2 years ago
Thanks for opening the issue. It seems I forgot to update the lr_drop schedule in the code, which causes the learning rate to drop too early. I've created a PR and will test it soon to make sure the result can be reproduced. You can also build on that git branch in the meantime if you need to.
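For context, a DETR-style lr_drop is a simple step schedule: the learning rate is multiplied by a decay factor (0.1 in DETR) once the epoch reaches the lr_drop threshold, so a stale lr_drop value makes the drop fire too early. A minimal sketch of the idea (function name and defaults are illustrative, not the repo's actual code):

```python
def step_lr(base_lr: float, epoch: int, lr_drop: int, gamma: float = 0.1) -> float:
    """DETR-style step schedule: multiply base_lr by `gamma` once epoch >= lr_drop."""
    return base_lr * (gamma if epoch >= lr_drop else 1.0)

# With a stale, too-small lr_drop the rate has already decayed mid-training;
# with the corrected, larger lr_drop it is still at the base value.
early = step_lr(2e-4, epoch=100, lr_drop=40)   # already dropped to base_lr * 0.1
late = step_lr(2e-4, epoch=100, lr_drop=400)   # still at base_lr
```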
Hi @amirbar, I still cannot reproduce the result even after updating lr_drop. Are there any other changes we need to make?
Thanks for the reminder @BIGBALLON. I'm running experiments now to reproduce it, both on my internal repo and on this repo. I'm keeping this issue open for now; I'll circle back once I've checked whether it reproduces on my end, and if it does, I'll apply a fix.
As a sanity check, I've partially trained from the MSCoco checkpoint for 800 epochs using 1% of the MSCoco data. After 800 epochs, this model achieves 25.4 AP. I assume that with a more careful choice of checkpoint and more training it would improve to 26 AP, matching the reported result.
Here's the command I used:
export GPUS_PER_NODE=8 && bash ./tools/run_dist_launch.sh 8 ./configs/DETReg_fine_tune_1pct_coco.sh --batch_size 1 --pretrain pretrained/checkpoint_coco.pth --output_dir exps/coco1pct_fromcoco --eval_every 200
Here's the log file in case it is helpful: log.txt
Edit: this result is based on this PR #40 which is now merged to main.
Hope this helps!
@amirbar, thanks for your helpful reply. It would be better to update the parameters in the README as well.
Excuse me, hello. Why do I get the error shown in the picture when I use the command to evaluate? Could you give me some tips?
When I use this checkpoint as the pretrained model
and use these scripts to reproduce the semi-supervised learning experiment,
the result turns out to be hugely different.
Please help me: am I missing anything in the reproduction?
By the way, I can reproduce the full COCO result at 45.5 AP, so the conda environment is probably right.