haochenheheda / LVVIS

Large-Vocabulary Video Instance Segmentation dataset
GNU General Public License v3.0

Clarification on Training Details #10

Closed yxchng closed 1 year ago

yxchng commented 1 year ago
  1. Why does the batch size mentioned in the paper (8) differ from the one in the provided training code (2)?

[Image: Screenshot from 2023-11-06 22-04-37]

python train_net.py --num-gpus 1 \
    --config-file configs/lvvis/instance-segmentation/ov2seg_R50_bs16_50ep_lvis.yaml \
    SOLVER.IMS_PER_BATCH 2 \
    MODEL.MASK_FORMER.CLIP_CLASSIFIER True \
    OUTPUT_DIR output/ov2seg \
    MODEL.MASK_FORMER.NUM_OBJECT_QUERIES 100 \
    MODEL.MASK_FORMER.DEC_LAYERS 7
  2. Do Res50 and Swin use the same training settings, i.e. can the same training code be used for both?
haochenheheda commented 1 year ago
  1. We trained the model on four GPUs; the script above is for a single GPU. To match the paper, change --num-gpus and SOLVER.IMS_PER_BATCH to 4 and 8, respectively.
  2. Yes, Res50 and Swin use the same code and parameters. You only need to change the backbone in the config file.
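
For reference, here is a rough sketch of the four-GPU command implied by the first answer above. Only --num-gpus and SOLVER.IMS_PER_BATCH change; every other argument is copied from the single-GPU command in the question. The variable names (NUM_GPUS, IMS_PER_BATCH, PER_GPU, TRAIN_CMD) are illustrative and not part of the repository. The per-GPU figure assumes the repository follows the Detectron2 convention, where SOLVER.IMS_PER_BATCH is the total batch size across all GPUs. The snippet only prints the command (a dry run) so it can be checked before launching.

```shell
# Values from the maintainer's reply: four GPUs, total batch size 8.
NUM_GPUS=4
IMS_PER_BATCH=8

# Assumption: SOLVER.IMS_PER_BATCH is the total across all GPUs (Detectron2
# convention), so each GPU processes IMS_PER_BATCH / NUM_GPUS images per step.
PER_GPU=$((IMS_PER_BATCH / NUM_GPUS))
echo "images per GPU: ${PER_GPU}"

# Build the command piece by piece; all remaining arguments are unchanged
# from the single-GPU command posted in the question.
TRAIN_CMD="python train_net.py --num-gpus ${NUM_GPUS}"
TRAIN_CMD="${TRAIN_CMD} --config-file configs/lvvis/instance-segmentation/ov2seg_R50_bs16_50ep_lvis.yaml"
TRAIN_CMD="${TRAIN_CMD} SOLVER.IMS_PER_BATCH ${IMS_PER_BATCH}"
TRAIN_CMD="${TRAIN_CMD} MODEL.MASK_FORMER.CLIP_CLASSIFIER True"
TRAIN_CMD="${TRAIN_CMD} OUTPUT_DIR output/ov2seg"
TRAIN_CMD="${TRAIN_CMD} MODEL.MASK_FORMER.NUM_OBJECT_QUERIES 100"
TRAIN_CMD="${TRAIN_CMD} MODEL.MASK_FORMER.DEC_LAYERS 7"

# Dry run: print the command instead of executing it.
echo "${TRAIN_CMD}"
```

Under that assumption, each GPU sees 2 images per step in the four-GPU setup, which is the same per-GPU load as the single-GPU command with SOLVER.IMS_PER_BATCH 2. To launch training, paste the printed command into a shell at the repository root.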