It employs the 3x schedule, as in this config file, but we sweep over different values of `drop_path_rate` and `weight_decay`. The `drop_path_rate` has a large impact on the results.
- name: drop_path_rate
  spec: discrete
  values: [0.0, 0.1, 0.2]
- name: weight_decay
  spec: discrete
  values: [0.05]
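For anyone reproducing this sweep by hand, here is a minimal sketch of how the two entries above could be expanded into individual runs. It assumes that `spec: discrete` means "try every listed value" and that the parameters are combined as a full grid; the sweep tool actually used is not named in this thread, and `expand_sweep` is a hypothetical helper.

```python
from itertools import product

# The sweep entries quoted above, written as plain Python data.
sweep = [
    {"name": "drop_path_rate", "spec": "discrete", "values": [0.0, 0.1, 0.2]},
    {"name": "weight_decay", "spec": "discrete", "values": [0.05]},
]


def expand_sweep(params):
    """Return one dict of hyperparameters per grid point.

    Assumption: discrete parameters are combined as a Cartesian product.
    """
    names = [p["name"] for p in params]
    grids = [p["values"] for p in params]
    return [dict(zip(names, combo)) for combo in product(*grids)]


runs = expand_sweep(sweep)
for run in runs:
    print(run)
```

Under that assumption the sweep comes to three training runs, one per `drop_path_rate` value, all with `weight_decay` fixed at 0.05.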
Thank you! I will try it again with your configuration. Btw, which part of the parameters did you use as the feature extractor in downstream tasks, `teacher` or `student`?
teacher
Hi, I'm wondering if you could provide a recipe to reproduce the COCO detection results? I tried using your pre-trained checkpoint to train the downstream task with Mask R-CNN, but could not reach the results reported in the paper. I'm not sure whether something went wrong during training. Could you please provide more details? Thank you!
Hi, could you share how to load the pre-trained model? When I load it, I get the following error: `unexpected key in source state_dict: student, teacher, optimizer, epoch, args, dino_loss, fp16_scaler`
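The keys in that error suggest the file is a full training checkpoint rather than a bare model `state_dict`, so the weights sit one level down under `student` or `teacher`. A minimal sketch of pulling out the `teacher` weights (the branch named above as the feature extractor) follows. The `module.` and `backbone.` prefixes are an assumption about how the wrapped model was saved, and `extract_backbone_state_dict` is a hypothetical helper, not part of this repository.

```python
from collections import OrderedDict


def extract_backbone_state_dict(checkpoint, key="teacher",
                                prefixes=("module.", "backbone.")):
    """Pick one branch of a training checkpoint and strip wrapper prefixes.

    `checkpoint` is the dict returned by torch.load(); `key` selects the
    "teacher" or "student" branch. The prefixes are assumed, not confirmed.
    """
    if key not in checkpoint:
        raise KeyError(f"{key!r} not found; available keys: {list(checkpoint)}")
    cleaned = OrderedDict()
    for name, value in checkpoint[key].items():
        # Strip each prefix in order, e.g. "module.backbone.x" -> "x".
        for prefix in prefixes:
            if name.startswith(prefix):
                name = name[len(prefix):]
        cleaned[name] = value
    return cleaned


# Usage sketch (the path and `model` are placeholders):
# import torch
# checkpoint = torch.load("checkpoint.pth", map_location="cpu")
# state_dict = extract_backbone_state_dict(checkpoint, key="teacher")
# result = model.load_state_dict(state_dict, strict=False)
# print(result)  # lists any missing or unexpected keys
```

Loading with `strict=False` lets projection-head weights that the downstream backbone does not have be reported instead of raising, so it is worth checking the printed missing and unexpected keys before training.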