[Open] Retiina opened this issue 3 years ago
Which base training checkpoint (model_final.pth) are you using? Did you train it yourself or download it from FsDet?
Download it from FsDet: http://dl.yf.io/fs-det/models/voc/split1/base_model/model_final.pth
Maybe this yaml will help.
Because there are so few instances, and an early-stop strategy is used to prevent overfitting, unstable results are normal.
Sir, thanks for your reply. I will try this config and let you know when I get the result.
@Retiina @yhcao6
Hi guys, were you able to reproduce the nAP for 10-shot PASCAL VOC split1?
@bsun0802 I have not tried 10-shot, but I still can't reproduce the mAP for 1-shot and 3-shot of FSCE; I can reproduce the mAP of the improved TFA.
Those shots are unstable and can have large variance across runs.
Please check whether 5-shot and 10-shot can be reproduced; another thread finds they cannot. If so, we need to look into what's wrong.
Thanks.
Sure, I will try it now
What if I only use 1 GPU? Will this affect the result?
In addition, the config file shows that the backbone seems to be trained.
However, during training, it shows:
@bsun0802, this is the result for split1 10-shot of FSCE, still lower than the paper:
[03/29 16:10:55 fsdet.evaluation.pascal_voc_evaluation]: Evaluating voc_2007_test_all1 using 2007 metric. Note that results do not use the official Matlab API.
[03/29 16:11:13 fsdet.evaluation.pascal_voc_evaluation]: Evaluate per-class mAP50:
| aeroplane | bicycle | boat | bottle | car | cat | chair | diningtable | dog | horse | person | pottedplant | sheep | train | tvmonitor | bird | bus | cow | motorbike | sofa |
|:-----------:|:---------:|:------:|:--------:|:------:|:------:|:-------:|:-------------:|:------:|:-------:|:--------:|:-------------:|:-------:|:-------:|:-----------:|:------:|:------:|:------:|:-----------:|:------:|
| 85.873 | 85.576 | 66.554 | 67.776 | 87.936 | 88.366 | 63.925 | 64.852 | 85.325 | 85.267 | 78.948 | 49.353 | 76.907 | 85.293 | 77.127 | 41.074 | 75.418 | 68.892 | 68.620 | 54.369 |
[03/29 16:11:13 fsdet.evaluation.pascal_voc_evaluation]: Evaluate overall bbox:
| AP | AP50 | AP75 | bAP | bAP50 | bAP75 | nAP | nAP50 | nAP75 |
|:------:|:------:|:------:|:------:|:-------:|:-------:|:------:|:-------:|:-------:|
| 45.616 | 72.873 | 48.901 | 48.464 | 76.605 | 51.869 | 37.073 | 61.674 | 40.000 |
[03/29 16:11:13 fsdet.engine.defaults]: Evaluation results for voc_2007_test_all1 in csv format:
[03/29 16:11:13 fsdet.evaluation.testing]: copypaste: Task: bbox
[03/29 16:11:13 fsdet.evaluation.testing]: copypaste: AP,AP50,AP75,bAP,bAP50,bAP75,nAP,nAP50,nAP75
[03/29 16:11:13 fsdet.evaluation.testing]: copypaste: 45.6161,72.8726,48.9014,48.4638,76.6053,51.8685,37.0729,61.6745,40.0001
[03/29 16:11:13 fsdet.utils.events]: eta: 0:00:00 iter: 14999 total_loss: 0.4749 loss_cls: 0.04555 loss_box_reg: 0.04276 loss_contrast: 0.3756 loss_rpn_cls: 0.002355 loss_rpn_loc: 0.004225 time: 0.4634 data_time: 0.0396 lr: 0.00025 max_mem: 2058M
[03/29 16:11:14 fsdet.engine.hooks]: Overall training speed: 14996 iterations in 1:55:51 (0.4635 s / it)
[03/29 16:11:14 fsdet.engine.hooks]: Total training time: 3:16:01 (1:20:10 on hooks)
> What if I only use 1 GPU? Will this affect the result? In addition, the config file shows that the backbone seems to be trained. However, during training, it shows:

1. I don't think 1 GPU can reproduce the same results. All experiments are performed on 8 GPUs.
2. ResNet layers are frozen; the FPN lateral and top-down convs are fine-tuned.
@yhcao6 This is the final checkpoint; did you check the best checkpoint?
@bsun0802 I checked; the best nAP50 is 62.346.
@yhcao6 Seems odd. I would say above 62.6 should be easy to reach. We will find time to look into it by this weekend.
Thanks for taking the time to check it.
> @yhcao6 Seems odd. I would say above 62.6 should be easy to reach.

This is my rerun today. It does reach 62.5+ without any change.
Since the few-shot task is not stable, and the data reported in the paper is the best result over multiple runs, I think a slight difference is normal.
Is this the result on seed 0? Thanks for your reply.
@Chauncy-Cai Thanks for your reply. One possible reason may come from the randomness of the surgery. If convenient, would you be willing to upload your model_reset_surgery.pth?
Yes, just as in TFA, seed 0 is actually manually sampled. Thus, it always gives the best result.
What does seed0 mean? In http://dl.yf.io/fs-det/datasets/vocsplit/, there is no seed0 folder.
In Table 1, the performance on 10-shot is 61.4, while in Table 2 the result is 63.4 and the average over 10 random seeds is 59.7. These results are confusing.
I used your base model and trained it in 'Stage 2: Fine-tune for novel data' with 4 GPUs, but the results are much lower than reported. I used the txt files from the seed1 folder.
> What does seed0 mean? In http://dl.yf.io/fs-det/datasets/vocsplit/, there is no seed0 folder.

OK, I should describe it more accurately: "http://dl.yf.io/fs-det/datasets/vocsplit/*.txt" instead of "seed0".
> In Table 1, the performance on 10-shot is 61.4, while in Table 2 the result is 63.4 and the average over 10 random seeds is 59.7. These results are confusing.

All experiments we have done, except the average performance over 10 random seeds in Table 2, are based on "http://dl.yf.io/fs-det/datasets/vocsplit/*.txt".
> I used your base model and trained it in 'Stage 2: Fine-tune for novel data' with 4 GPUs, but the results are much lower than reported. I used the txt files from the seed1 folder.

First, we got our results with 8 GPUs for training/fine-tuning, so we don't know the performance with 4 GPUs. Moreover, nAP greatly depends on the fine-tuning data you choose.
@Chauncy-Cai Can you tell me how to get the 59.7 over 10 random seeds? Just use the source code to train 10 times? Or edit this code in meta_pascal_voc.py, line 69: split_dir = os.path.join(split_dir, "seed{}".format(1))
Just train the code with the data in the "seed [1-10]" files at "http://dl.yf.io/fs-det/datasets/vocsplit/". You can change the train & test datasets in the yaml directly, for instance (coco_trainval_all_30shot) -> (coco_trainval_all_30shot_seed1) to use the seed1 files.
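The renaming above can be scripted. Here is a minimal, unofficial sketch (the `BASE_CONFIG` fragment and helper name are hypothetical, not from the FSCE repo) that generates one config variant per seed following the `<name>` -> `<name>_seed<k>` convention:

```python
# Hypothetical helper: produce per-seed yaml texts by suffixing the dataset
# name, following the seed-naming convention described above.
BASE_CONFIG = """\
DATASETS:
  TRAIN: ("coco_trainval_all_30shot",)
  TEST: ("coco_test_all",)
"""

def config_for_seed(cfg_text: str, dataset: str, seed: int) -> str:
    """Return a copy of the yaml text with the dataset name suffixed by the seed."""
    return cfg_text.replace(dataset, f"{dataset}_seed{seed}")

# Build one config per seed 1..10; each could then be written to disk and
# passed to the usual training entry point.
configs = {s: config_for_seed(BASE_CONFIG, "coco_trainval_all_30shot", s)
           for s in range(1, 11)}
print(configs[1])
```

Each generated text can be saved as its own yaml and passed to the training script with `--config-file`; averaging the resulting nAP50 over the 10 runs gives the multi-seed number.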
@Chauncy-Cai Thanks for your reply!
Recently, I used the original split1 10-shot config and the base model downloaded from FsDet with 8 GPUs to train a model. But why can't I reproduce the result over 10 random seeds?
I only got bAP50 = 71.6 and nAP50 = 57.5. I used the final checkpoint, not the best checkpoint. Does the final model need to combine the base-model classifier and the fine-tuned classifier?
Could you provide a model you have trained that reaches the expected result?
How can I download the txt files from http://dl.yf.io/fs-det/datasets/vocsplit/ all at once? Do I need to copy them manually?
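One way is to script it. This is an untested sketch (it assumes the page at that URL is a plain directory index whose links end in ".txt"):

```python
# Untested sketch: scrape the directory listing, then download every linked
# .txt split file. Assumes an Apache-style index page with href="...txt" links.
import re
import urllib.request

BASE_URL = "http://dl.yf.io/fs-det/datasets/vocsplit/"

def extract_txt_links(index_html):
    """Pull the .txt filenames out of an HTML directory listing."""
    return sorted(set(re.findall(r'href="([^"]+\.txt)"', index_html)))

def download_all(dest_dir="."):
    """Fetch the index page, then download every linked .txt file."""
    html = urllib.request.urlopen(BASE_URL).read().decode("utf-8")
    for name in extract_txt_links(html):
        urllib.request.urlretrieve(BASE_URL + name, f"{dest_dir}/{name}")

# Offline demo of the link extraction:
sample = '<a href="box_2shot_cat_train.txt">f</a> <a href="box_1shot_cat_train.txt">f</a>'
print(extract_txt_links(sample))
# download_all()  # uncomment to actually fetch the files
```

Alternatively, if you have wget available, `wget -r -np -nd -A "*.txt" http://dl.yf.io/fs-det/datasets/vocsplit/` mirrors all .txt files into the current directory without writing any code.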
Why can't 1 GPU reproduce the same results? Can I get the same results with the same batch size and lr?
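For reference, many detectron2-style codebases tie the learning rate to the total batch size via the linear scaling rule, so if a 1-GPU run shrinks `IMS_PER_BATCH`, `BASE_LR` should shrink proportionally. The numbers below are only an illustration (the actual FSCE defaults may differ):

```python
# Illustrative only: the linear scaling rule commonly used when changing the
# total batch size (e.g. going from 8 GPUs to 1). Not FSCE's exact settings.
def scale_lr(base_lr, base_batch, new_batch):
    """Scale the learning rate proportionally to the total batch size."""
    return base_lr * new_batch / base_batch

# Hypothetical config tuned for 8 GPUs x 2 images = 16 images/batch at lr 0.02;
# on 1 GPU with 2 images/batch the scaled lr is roughly 0.0025.
print(scale_lr(0.02, 16, 2))
```

Even with matched effective batch size and lr, results may still differ slightly because per-GPU statistics (e.g. frozen-BN behavior, gradient accumulation order) and sampling randomness are not identical across GPU counts.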
May I ask whether you have successfully reproduced the results with one GPU?
Dear author, I am trying to reproduce your paper, but when I run the config for PASCAL VOC split1 3-shot, https://github.com/MegviiDetection/FSCE/blob/main/configs/PASCAL_VOC/split1/3shot_CL_IoU.yml, I get the following result:
However, the paper reports 51.4, which is 1.7% mAP higher than my result:
Could you check whether the settings in the config are exactly the same as in the paper? Or, in case I did something wrong, this is my training command:
Thanks in advance!