tyunit opened this issue 2 years ago
@yujheli Thank you for your response. I reduced the batch size and MIN_SIZE_TRAIN, but I didn't get the expected result. I reduced MIN_SIZE_TRAIN to (300,); does that affect the results? After almost 5 days of training, the mAP sticks around 16-18. How long does it take to reach the reported results?
I would suggest not changing MIN_SIZE_TRAIN. Also, you know we report AP@50, not COCO-style AP, in our paper, right?
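The distinction matters when comparing numbers: COCO-style "AP" averages over IoU thresholds 0.50 to 0.95, while AP@50 uses only the 0.50 threshold, so AP@50 is usually much higher. A minimal illustration (the per-threshold AP values below are made up, not from the paper):

```python
import numpy as np

# Hypothetical per-threshold APs for one class at IoU thresholds
# 0.50, 0.55, ..., 0.95 (illustrative values only).
ap_per_iou = np.array([0.48, 0.45, 0.41, 0.36, 0.30,
                       0.24, 0.17, 0.10, 0.05, 0.01])

ap50 = ap_per_iou[0]         # single-threshold AP@50 (what the paper reports)
coco_ap = ap_per_iou.mean()  # COCO-style AP, averaged over all 10 thresholds

print(f"AP@50 = {ap50:.2f}, COCO AP = {coco_ap:.3f}")
```

With these made-up numbers, AP@50 is 0.48 while COCO AP is only 0.257, so the two metrics are not directly comparable.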
I tried reducing the batch size to 1 without changing MIN_SIZE_TRAIN, but it still fails with RuntimeError: CUDA out of memory. Here is my GPU specification:
It works on my machine using one GPU, running the script with:
python train_net.py --num-gpus 1 --config configs/faster_rcnn_R101_cross_clipart.yaml SOLVER.IMG_PER_BATCH_LABEL 1 SOLVER.IMG_PER_BATCH_UNLABEL 1 OUTPUT_DIR output/exp_clipart_test
which only uses 7 GB on a 2080 Ti.
It works, but the AP values for each class are very low compared to the results stated in the paper. These are the final results after a long training run.
@tyunit Please see AP50 and the per-class AP50 (in TensorBoard). You already have 44.76, which outperforms all of the previous SOTA in my paper. I think you need to try different loss weights to get the best performance.
There is an AP result for each class, but I didn't find per-class AP50 in TensorBoard. Isn't it possible to print the AP50 results during evaluation on the command line, like the AP results?
@tyunit Please see the code that reports AP50 to TensorBoard at these links: for Pascal VOC format: https://github.com/facebookresearch/adaptive_teacher/blob/main/adapteacher/evaluation/pascal_voc_evaluation.py#L120 and for COCO format: https://github.com/facebookresearch/adaptive_teacher/blob/main/prod_lib/evaluation/coco_evaluation.py#L418
They are all in the code of this repo, but I haven't migrated them into this version of trainer.py yet. I will migrate them soon.
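For reference, the Pascal VOC evaluator linked above ultimately reduces to the standard VOC AP computation: area under the monotonically decreasing precision envelope of the precision-recall curve. A self-contained sketch of that standard function (not the repo's exact code):

```python
import numpy as np

def voc_ap(rec, prec):
    # Standard Pascal VOC "all points" AP: pad the PR curve,
    # make precision monotonically decreasing, then integrate
    # over the recall steps.
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1])
```

Running this per class with detections matched at IoU 0.5 gives the per-class AP50 values that the evaluator logs.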
Can you share your versions of Python, PyTorch, etc.? I'm having trouble debugging; I suspect it's a version problem.
Hi, I came across this issue, which seems to be about unfair data usage. Moreover, I noticed the recently published work https://arxiv.org/abs/2206.06293, which shows that this unfair usage gives around a 4-5 mAP difference on City to Foggy. Could the author provide the results with Foggy 0.02 only? Thanks so much! Otherwise, I will consider the reported results very misleading.
Hello, could you please share the configuration file you used? I also hit the following problem during the experiment (in the first iteration): FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged. After searching the issue tracker, changing the loss weights, and rebuilding the environment, the problem was still not resolved. What is your operating environment? I tried both single-GPU and multi-GPU (4) runs. My experimental environment is Python 3.8, torch 1.9, CUDA 11.6, detectron2 0.5.
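When chasing this kind of divergence, it can help to fail fast with the offending loss names rather than detectron2's generic Inf/NaN message. A minimal sketch of such a check, assuming the losses have already been reduced to plain Python floats (with torch tensors you would use torch.isfinite instead; check_losses is a hypothetical helper, not part of the repo):

```python
import math

def check_losses(loss_dict):
    # Report exactly which loss terms went non-finite before
    # they propagate into the box predictions.
    bad = {k: v for k, v in loss_dict.items() if not math.isfinite(v)}
    if bad:
        raise FloatingPointError(f"Non-finite losses: {bad}")
    return sum(loss_dict.values())
```

Seeing whether the classification, regression, or domain-adversarial term diverges first narrows down which loss weight to lower.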
Excuse me, I am a beginner; can I ask how the AP of each category is obtained? Thank you so much!
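In outline: for each class, detections are sorted by confidence and greedily matched to unmatched ground-truth boxes at IoU >= 0.5; each match is a true positive and each miss a false positive, which yields a precision-recall curve whose area is that class's AP. A minimal sketch of the matching step (simplified: single pass over all boxes of one class, no "difficult" flags as in the full VOC protocol):

```python
def box_iou(a, b):
    # a, b: [x1, y1, x2, y2]
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(dets, gts, iou_thr=0.5):
    # dets: list of (score, box) for one class; gts: list of boxes.
    # Greedy matching, highest-scoring detection first.
    dets = sorted(dets, key=lambda d: -d[0])
    matched, tp, fp = set(), 0, 0
    prec, rec = [], []
    for score, box in dets:
        best_iou, best_j = 0.0, -1
        for j, g in enumerate(gts):
            if j not in matched:
                ov = box_iou(box, g)
                if ov > best_iou:
                    best_iou, best_j = ov, j
        if best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
        else:
            fp += 1
        prec.append(tp / (tp + fp))
        rec.append(tp / len(gts))
    return prec, rec
```

The VOC-style AP for the class is then the area under this precision-recall curve; averaging the per-class APs gives the mAP.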
@tyunit You can set num_gpus to 1. Remember to set a suitably small batch size for both labeled and unlabeled images.