iyu-Fang opened this issue 4 years ago
Hi @iyu-Fang, at the 30k-th iteration we start applying the teacher loss (ramped up gradually): https://github.com/NVlabs/DG-Net/blob/c0ee2dff34662b10e904eb08249c14661f2306b1/trainer.py#L492
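For context, a minimal sketch of what such a gradual teacher-loss ramp can look like. This is not the repo's exact code: `warm_iter`, `ramp_iter`, and `max_w` are placeholder names, and the linear schedule is an assumption; see the linked line in trainer.py for the real logic.

```python
# Minimal sketch (placeholder names, assumed linear schedule), not DG-Net's exact code.
def teacher_weight(iteration, warm_iter=30000, ramp_iter=10000, max_w=1.0):
    """Teacher-loss weight: zero before warm_iter, then ramped linearly up to max_w."""
    if iteration < warm_iter:
        return 0.0
    return min(max_w, max_w * (iteration - warm_iter) / ramp_iter)

# total_loss = other_losses + teacher_weight(iteration) * teacher_loss
```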
Hi @layumi, thank you for your advice.
Since I did not modify the batch size, I tried checking the performance of the teacher model instead. Testing the best model you provide, I got 0.81 Rank@1 and 0.54 mAP. However, even when I retrain a new teacher model that works well, DG-Net still does not converge. By the way, I have also set `_max_teacherw` to 0.2, but it still performs badly so far.
I would appreciate it if you could give me some further suggestions.
Hi @iyu-Fang, that teacher model performance is not right. Please check your numpy version.
https://github.com/layumi/Person_reID_baseline_pytorch#prerequisites Some users reported that updating numpy restores the correct accuracy. If you only get 50~80% Top-1 accuracy, please try it. We have successfully run the code with numpy 1.12.1 and 1.13.1.
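A quick sanity check is to print the numpy version actually seen by the Python that runs the test script (just an illustrative snippet, not part of the repo):

```python
# Print the numpy version used by the interpreter that runs the evaluation.
# If Rank@1 comes out suspiciously low (50~80%), try a version known to work
# (e.g. 1.12.1 or 1.13.1, per the baseline repo's prerequisites).
import numpy as np
print("numpy version:", np.__version__)
```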
I don't think that's the problem. My numpy version is 1.19.1. Could you tell me the exact versions of your environment (numpy, pytorch, etc.) when you ran your experiments?
Hi @iyu-Fang, could you try running https://github.com/layumi/Person_reID_baseline_pytorch and check the result?
@layumi Actually, that's exactly how I tested. The best model you provide got 0.810 Rank@1 and 0.543 mAP, while my re-trained model (ResNet-50, all tricks) reached 0.914 Rank@1 and 0.778 mAP. But even when I use the re-trained model as the teacher model, DG-Net still does not converge.
@iyu-Fang Did you evaluate the model on Market-1501 or on another dataset? Did you load the model config correctly?
@layumi Thank you for your quick response. Yes, I ran the model on the Market-1501 dataset. As for the config, the best model you provide does not include the `use_NAS` parameter, so I added it to the config and set it to false. Nothing else was changed.
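For anyone hitting the same missing-key problem, a sketch of that config patch might look like the following; the `opts.yaml` file name and location next to the checkpoint are assumptions about how the downloaded model stores its options.

```python
# Minimal sketch: add a missing `use_NAS` flag to the saved options file.
# The file name `opts.yaml` is an assumption, not necessarily the repo's layout.
import yaml

with open('opts.yaml') as f:
    opts = yaml.safe_load(f)

opts.setdefault('use_NAS', False)   # add the flag only if it is absent

with open('opts.yaml', 'w') as f:
    yaml.safe_dump(opts, f)
```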
@iyu-Fang The teacher model should achieve about 89.6% Rank@1 and 74.5% mAP. I am not sure whether there are any other differences.
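To make sure both sides are comparing the same numbers, here is a minimal sketch of how Rank@1 and average precision are typically computed for a single query in Market-1501-style evaluation. It is a simplified version that ignores the junk/distractor handling in the repo's actual evaluation script.

```python
import numpy as np

def rank1_and_ap(dist, q_pid, q_cam, g_pids, g_cams):
    """Rank-1 hit and average precision for one query, given its gallery distance row."""
    order = np.argsort(dist)                       # nearest gallery images first
    g_pids, g_cams = g_pids[order], g_cams[order]
    # drop gallery images of the same identity taken by the same camera
    keep = ~((g_pids == q_pid) & (g_cams == q_cam))
    matches = (g_pids[keep] == q_pid).astype(np.float64)
    if matches.sum() == 0:
        return 0.0, 0.0                            # identity not present in gallery
    rank1 = matches[0]
    hit_positions = np.where(matches == 1)[0]
    # precision at each true-match rank, averaged over all true matches
    precisions = (np.arange(len(hit_positions)) + 1) / (hit_positions + 1)
    return rank1, precisions.mean()
```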
Hi, thank you for your work.
When I tried to train my model with the parameters you provide in configs.yaml, I found that I could not reproduce your results on the Market dataset. When I use the visual_tools to show the rainbow image, the generated images have wrong colors (the colors are even different from those of the input images). After that, I checked the losses in TensorBoard and found that the total loss, as well as the ID loss, surged to very high values at around 30k iterations and never converged afterwards. However, when I downloaded the best model you provide and tested it in the same way, it worked well.
Please give me some advice.