Closed psandovalsegura closed 7 months ago
I think you can use https://drive.google.com/drive/folders/1KsH_MIZIdgjZpUZBmR4P88yeYDqM8yNW?usp=sharing, which contains ARTrack-B{256} trained on COCO, GOT-10k, LaSOT, and TrackingNet. SOT is a general, class-agnostic task, so I think ARTrack-B{256} is compatible with a new test set.
Moreover, you can train our ARTrack on your new data, but do not remove COCO and the other datasets from training.
Thank you for the information. I will try the checkpoint you provided.
Hi Yifan,
I have set up the ARTrackSeq_ep60.pth.tar checkpoint and have created a new dataset class in lib/test/evaluation. However, the results on my driving dataset look wrong. I think this might be due to the frame resolution I'm using?
My frames are 960 (W) x 540 (H). Would that be a problem? How do you recommend I modify the code so that I can run inference on these frames?
I don't think the resolution is the reason: GOT-10k contains diverse resolution formats, and following pytracking, the search and template regions are cropped from the frames and resized to a fixed resolution. If you can show me the results you get on your own dataset, I can give you some suggestions about the cause.
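For reference, the crop-and-resize step described above can be sketched roughly as follows. This is a simplified illustration, not ARTrack's exact preprocessing; the `area_factor` and `out_sz` values are placeholders, and the nearest-neighbour resize stands in for the real interpolation. The point is that the crop size scales with the target box, not the frame, so 960x540 frames are handled like any other resolution.

```python
import numpy as np

def crop_and_resize(frame, box_xywh, area_factor=2.0, out_sz=256):
    """Crop a square region centered on the target box and resize it.

    The crop side depends only on the target size (times area_factor),
    so frame resolution does not matter. Values here are illustrative,
    not ARTrack's actual settings.
    """
    x, y, w, h = box_xywh
    cx, cy = x + w / 2.0, y + h / 2.0
    crop_sz = max(1, int(round(np.sqrt(w * h) * area_factor)))
    x1 = int(round(cx - crop_sz / 2.0))
    y1 = int(round(cy - crop_sz / 2.0))

    # Clamp indices to the frame, which replicates border pixels
    # when the crop extends outside the image.
    H, W = frame.shape[:2]
    ys = np.clip(np.arange(y1, y1 + crop_sz), 0, H - 1)
    xs = np.clip(np.arange(x1, x1 + crop_sz), 0, W - 1)
    crop = frame[np.ix_(ys, xs)]

    # Nearest-neighbour resize to out_sz x out_sz.
    ri = np.clip((np.arange(out_sz) * crop_sz) // out_sz, 0, crop_sz - 1)
    return crop[np.ix_(ri, ri)]
```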
I see. In that case, it must be my subclass of BaseDataset. Can you explain what should be in each Sequence object? Right now I am only passing in ground_truth_rect as an np.array of shape (N, 4), where every row is [x_min, y_min, x_max, y_max] (top-left corner of the bbox followed by the bottom-right corner). I visualized my template image (z_patch_arr) from the first frame and it is not correct.
This is what I am currently doing:
Sequence(name=sequence_name,
         frames=frames_files,                  # paths to N .jpg frames
         dataset='ds-name',
         ground_truth_rect=ground_truth_rect)  # (N, 4) np.array; every row is [x_min, y_min, x_max, y_max]
         # object_ids=track_ids,               # not using multiple object ids yet
         # multiobj_mode=False)                # is multiobj mode supported? None of the sample datasets use this
In other words, more documentation for every parameter of Sequence would be very helpful, in particular how to structure ground_truth_rect. Thank you!
ground_truth_rect may be an np.array of shape (N, 4), but every row is [x_min, y_min, w, h] or [center_x, center_y, w, h]. You can try these; I am not sure which of the two it is, but I am sure the box is not [x_min, y_min, x_max, y_max].
That fixed it! The results look reasonable now. I was using [x_min, y_min, x_max, y_max] since that format is mentioned in Section 3.1 of the paper, but [top-left x, top-left y, w, h] worked. Thank you.
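For anyone hitting the same issue, the conversion I needed is a one-liner per row; a small helper (my own, not part of the ARTrack codebase) looks like this:

```python
import numpy as np

def xyxy_to_xywh(boxes):
    """Convert (N, 4) rows of [x_min, y_min, x_max, y_max] to
    [top-left x, top-left y, width, height], the format that
    worked for ground_truth_rect."""
    boxes = np.asarray(boxes, dtype=float)
    out = boxes.copy()
    out[:, 2] = boxes[:, 2] - boxes[:, 0]  # width  = x_max - x_min
    out[:, 3] = boxes[:, 3] - boxes[:, 1]  # height = y_max - y_min
    return out
```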
One more question: how does multiobj_mode change the way ground_truth_rect should be structured? My video has some object ids that only appear many frames in and later disappear. In the code it seems the type should be (dict, OrderedDict), but the code doesn't explain what it represents.
I am sorry, I don't know about that, but I think it is useful to reference GMOT (https://arxiv.org/abs/2212.11920); that paper presents a dataset like the one you describe, and the code is available at https://github.com/visionml/pytracking.
I'll check out pytracking. Thanks!
Is there a checkpoint I can use off-the-shelf to evaluate on a new car-tracking test set I have?
In other words, do you have a checkpoint you expect to work well on new datasets? Or is it recommended to train my own model on car tracking training data?
Thanks for your help. ARTrack is an interesting approach!