PeizeSun / TransTrack

Multiple Object Tracking with Transformer
MIT License
621 stars, 109 forks

Training on MultiClass MultiObject Bdd100k Dataset #57

Open sfarkya04 opened 2 years ago

sfarkya04 commented 2 years ago

Hello dear authors, first of all, thank you for providing the code and for your wonderful work.

I am trying to use your codebase to perform tracking on the BDD100K dataset: https://github.com/bdd100k/bdd100k

The dataset has 8 classes: car (dominant), truck, bus, pedestrian, etc. The original BDD100K set has 1400 training videos, which is huge, so I created a smaller set of 100 videos to train the model. This dev set has about 20000 training images and 8 classes.

I followed this and modified the build function: https://github.com/PeizeSun/TransTrack/issues/11

Also, the format documentation can be used for the conversion: https://doc.bdd100k.com/format.html

I did a dry run on my side to make sure the code works with the BDD-to-MOT converted annotations, and it seems to work fine.
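For readers attempting the same conversion, here is a minimal sketch of turning BDD100K-style tracking labels into MOT-style ground-truth lines that keep the class. It rests on several assumptions: each frame dict has a 0-based `frameIndex` and a `labels` list whose entries carry `id`, `category` and `box2d` (`x1`, `y1`, `x2`, `y2`). The category list, its 1-based ids, and the function name `bdd_frames_to_mot_lines` are hypothetical choices for illustration, not something taken from this codebase.

```python
# Assumed category list; the 1-based ids below are an arbitrary illustrative choice.
BDD_CATEGORIES = ["pedestrian", "rider", "car", "truck",
                  "bus", "train", "motorcycle", "bicycle"]
CATEGORY_IDS = {name: i + 1 for i, name in enumerate(BDD_CATEGORIES)}


def bdd_frames_to_mot_lines(frames, category_ids=CATEGORY_IDS):
    """Convert BDD100K-style frame dicts of one video into MOT-style lines.

    Output columns: frame, track id, left, top, width, height,
    confidence flag, class id, visibility.
    """
    lines = []
    for frame in frames:
        frame_no = frame["frameIndex"] + 1  # MOT frame numbers start at 1
        for label in frame.get("labels") or []:
            box = label.get("box2d")
            class_id = category_ids.get(label["category"])
            if box is None or class_id is None:
                continue  # skip labels without a box or with an unknown category
            width = box["x2"] - box["x1"]
            height = box["y2"] - box["y1"]
            lines.append(
                f"{frame_no},{int(label['id'])},"
                f"{box['x1']:.2f},{box['y1']:.2f},{width:.2f},{height:.2f},"
                f"1,{class_id},1"
            )
    return lines


example = [{"frameIndex": 0,
            "labels": [{"id": "7", "category": "car",
                        "box2d": {"x1": 10.0, "y1": 20.0, "x2": 50.0, "y2": 80.0}}]}]
print(bdd_frames_to_mot_lines(example))
```

Keeping the class id in the eighth column is what later allows the tracker to restrict association to objects of the same category.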

I don't think I have to change num_classes, since you have already set it to 20?

Now I am confused by the comment made here: https://github.com/PeizeSun/TransTrack/issues/41#issuecomment-939762956 Essentially it says: "we should make sure the category is kept for the objects to be associated." I am not sure where I have to make the change to incorporate that.

Further, I have some other questions to make sure the setup is correct. Resources I have: 4, 48 NVIDIA GPUs to train the model.

  1. Should I change the lr based on num_gpus, or keep it fixed at 2e-4 (the default?)?
  2. Should I change the batch size passed to the config? Here's my script right now. Please let me know if I have to change anything here or if I missed anything.
    
    python3 -m torch.distributed.launch --nproc_per_node=4 \
                                         --use_env main_track.py \
                                         --output_dir ./results/bddmot/dummy_training \
                                         --dataset_file bddmot \
                                         --coco_path /root/dataset/bdd100k/mot20_unzip/bdd100k/ \
                                         --with_box_refine \
                                         --batch_size XX ?? \
                                         --lr ?? \
                                         --resume ??? \
                                         --lr_backbone ?? \
                                         --num_queries 500 \
                                         --epochs 150 \
                                         --lr_drop 100

  3. Where should I resume my training from? Since BDD100K has more diverse classes than the CrowdHuman dataset, I think CrowdHuman doesn't quite capture vehicles and is biased towards pedestrians?

Further, you also trained on mot17_half directly (i.e. without the CrowdHuman dataset) and got good performance. Did you use an ImageNet pre-trained model as the starting point for that training? Is it possible for you to provide that model? Maybe I can use it as the starting point to train on BDD100K.

Thank you in advance! I will be happy to report my results here :)

PeizeSun commented 2 years ago

Hi~ For multi-class object tracking, you should change models/tracker.py:

[image]
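Since the screenshot is not available here, the following is only a minimal sketch of what "keeping the category" during association could look like; it is not the actual change to models/tracker.py. It assumes tracks and detections are dicts with a `bbox` in `(x1, y1, x2, y2)` format and an integer `label`, and it uses a simple greedy IoU matcher instead of Hungarian matching to stay dependency-free. The names `iou` and `match_same_class` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_same_class(tracks, detections, iou_thresh=0.5):
    """Greedily pair tracks with detections, never across categories.

    Returns a list of (track_index, detection_index) pairs.
    """
    candidates = []
    for ti, track in enumerate(tracks):
        for di, det in enumerate(detections):
            if track["label"] != det["label"]:
                continue  # objects of different categories are never associated
            score = iou(track["bbox"], det["bbox"])
            if score >= iou_thresh:
                candidates.append((score, ti, di))

    # Take the highest-IoU pairs first; each track and detection is used once.
    candidates.sort(reverse=True)
    matches, used_tracks, used_dets = [], set(), set()
    for _, ti, di in candidates:
        if ti in used_tracks or di in used_dets:
            continue
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


tracks = [{"bbox": (0, 0, 10, 10), "label": 3}]          # a car track
detections = [{"bbox": (0, 0, 10, 10), "label": 1},       # pedestrian, perfect overlap
              {"bbox": (1, 0, 11, 10), "label": 3}]       # car, slightly shifted
print(match_same_class(tracks, detections))
```

In this example the car track is paired with the shifted car detection even though the pedestrian box overlaps it perfectly, which is the behaviour the quoted comment asks for.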

PeizeSun commented 2 years ago

The learning rate and batch size should be adjusted based on num_gpus. This codebase will automatically download the ImageNet pre-trained model.
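One common heuristic for adjusting the learning rate is the linear scaling rule: scale it in proportion to the total batch size (number of GPUs times per-GPU batch size). The sketch below illustrates that rule only; it is not confirmed as the authors' recipe. The reference setup of 8 GPUs with 2 images each at lr 2e-4 and lr_backbone 2e-5 is an assumption, and `scale_lr` is a hypothetical helper.

```python
def scale_lr(ref_lr, ref_total_batch, num_gpus, batch_per_gpu):
    """Linear scaling rule: lr grows in proportion to the total batch size."""
    total_batch = num_gpus * batch_per_gpu
    return ref_lr * total_batch / ref_total_batch


# Assumed reference configuration: 8 GPUs x 2 images per GPU.
REF_TOTAL_BATCH = 8 * 2

# Example: 4 GPUs with 2 images per GPU halves the total batch size.
lr = scale_lr(2e-4, REF_TOTAL_BATCH, num_gpus=4, batch_per_gpu=2)
lr_backbone = scale_lr(2e-5, REF_TOTAL_BATCH, num_gpus=4, batch_per_gpu=2)
print(f"--lr {lr:g} --lr_backbone {lr_backbone:g}")
```

Treat the result as a starting point; the best value for a new dataset may still need tuning.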

sfarkya04 commented 2 years ago

Thanks for replying @PeizeSun

I have made the changes you suggested. However, I am not clear about how it automatically downloads the ImageNet pre-trained weights.

I assumed that it must be happening here: [image: tracking_code]

But based on what you said, I should not set the resume param, and the code will take care of downloading the default ImageNet weights? My concern is that I didn't see any weights being downloaded, and even if they are downloaded, I am not sure how they are loaded. It would be really helpful if you could tell me that. Maybe it's a PyTorch thing I don't know about?

Thanks again for helping! :)
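A possible explanation, offered as an assumption rather than something confirmed in this thread: DETR-style codebases usually build the ResNet backbone through torchvision with pretrained weights requested. torchvision then downloads the file once into the torch hub cache and silently reuses it on later runs, so no download shows up after the first run. The sketch below uses only the standard library to locate that cache under the documented defaults (`TORCH_HOME`, else `XDG_CACHE_HOME/torch`, else `~/.cache/torch`) and list any cached ResNet files. The helper names are hypothetical.

```python
import os
from pathlib import Path


def torch_checkpoint_dir(env=None):
    """Return the directory where torch hub stores downloaded checkpoints."""
    env = os.environ if env is None else env
    torch_home = env.get("TORCH_HOME")
    if torch_home is None:
        cache_home = env.get("XDG_CACHE_HOME", os.path.join("~", ".cache"))
        torch_home = os.path.join(cache_home, "torch")
    return Path(torch_home).expanduser() / "hub" / "checkpoints"


def cached_resnet_weights(env=None):
    """List cached ResNet weight files, or an empty list if none exist."""
    ckpt_dir = torch_checkpoint_dir(env)
    if not ckpt_dir.is_dir():
        return []
    return sorted(p.name for p in ckpt_dir.glob("resnet*.pth"))


print(torch_checkpoint_dir())
print(cached_resnet_weights())
```

If a ResNet file is listed there, the backbone weights were already fetched on an earlier run. The `--resume` flag would then only be needed for loading a full tracking checkpoint, not for the ImageNet backbone.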

Nathan-Li123 commented 1 year ago

Hello! I'd like to confirm that if I want to train TransTrack for multi-object tracking, do I only need to modify tracker.py in the model code? Since the Tracker class is only used during inference, is it correct to say that I don't need to modify any training code except for the data loading part? Thank you in advance!