detectRecog / PointTrack

PointTrack (ECCV2020 ORAL): Segment as Points for Efficient Online Multi-Object Tracking and Segmentation

Custom dataset #14

Open jong372 opened 3 years ago

jong372 commented 3 years ago

First of all, thanks for your major contribution to multi-object tracking and segmentation! I would like to apply PointTrack to a custom dataset, but I have some questions about the implementation:

  1. Is it possible to apply PointTrack to a custom dataset (in this case a dataset of apples, labelled according to the KITTI MOTS format)?
  2. Is it possible to use a different image resolution (e.g. 1296×972)?
  3. I would like to train and test the architecture on my own dataset (2000 frames, ~85000 masks), but the README confused me: it first describes testing with the existing model and only afterwards covers training PointTrack and training SpatialEmbedding. In what order, and by what procedure, should I train and test on a custom dataset?
  4. Is training SpatialEmbedding required, or do you only need to train PointTrack? What does each contribute to detection and tracking performance?
  5. What is meant by step 2, "To generate the instance DB from videos", under the "Training of PointTrack" heading?

Thanks in advance!

lmanan commented 3 years ago

I speak only as a reader of the publication; the authors would be able to give a more informed answer:

  1. It should be possible to track on custom datasets.
  2. A different image resolution should also be possible. For the resolution you mention (1296×972), I think it would be fine, but for larger resolutions you may have to adjust the xm and ym coordinate maps used while training for segmentation, here and here (see the first sketch after this list).
  3. To train on a custom dataset, you would first need to train a segmentation model. This could be the SpatialEmbedding model used by the authors, which is inspired by the work of Neven et al., but in principle any segmentation approach works as the first step. The next step is training a model to associate these segmentations, which is what the PointTrack model is for.
  4. Yes, you basically need some segmentation approach before training PointTrack. The segmentations could also be generated in a non-learnt way if your objects are quite regularly shaped.
  5. Prior to training PointTrack, the authors build a dictionary storing, for each object in an image, properties such as the global position of an expanded view of the object, the image crop, and the label crop. This lets them query these crops later during training and sample points from each object, which ties into the training scheme suggested in the publication (see the second sketch after this list).
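
On point 2, the coordinate maps in question follow the pattern of Neven et al.'s spatial embedding loss: fixed tensors that assign every pixel an (x, y) coordinate on a canvas sized for KITTI (1024×2048). Below is a minimal sketch of that pattern; the function name, sizes, and ranges are assumptions, not the repository's exact code:

```python
# Sketch of the pixel coordinate maps used by SpatialEmbedding-style
# segmentation losses (pattern after Neven et al.; the exact tensors in
# this repo may differ -- treat sizes and ranges below as assumptions).
import torch

def make_coord_maps(height, width, pixels_per_unit=1024):
    # One coordinate unit spans `pixels_per_unit` pixels on both axes, so
    # the default KITTI-sized canvas (1024x2048) maps to y in [0, 1] and
    # x in [0, 2].
    xm = torch.linspace(0, width / pixels_per_unit, width).view(1, 1, -1).expand(1, height, width)
    ym = torch.linspace(0, height / pixels_per_unit, height).view(1, -1, 1).expand(1, height, width)
    return torch.cat((xm, ym), 0)  # shape: (2, height, width)

# 1296x972 fits inside the default canvas, so cropping is enough:
xym = make_coord_maps(1024, 2048)[:, :972, :1296]
# For images larger than 1024x2048, enlarge the canvas instead, keeping
# the same pixels-per-unit spacing so the learned margins stay comparable:
xym_large = make_coord_maps(2048, 4096)
```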
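
On point 5, "generating the instance DB" amounts to precomputing per-object crops once so that training can sample points from each object quickly. A rough sketch, assuming KITTI MOTS-style instance pngs where each pixel stores class_id * 1000 + instance_id; the helper names, the 0.2 expansion ratio, and the pickle layout are illustrative, not the repository's code:

```python
# Illustrative sketch of building a per-object crop "DB" from instance
# masks; names, expansion ratio, and output layout are assumptions.
import glob
import os
import pickle

import numpy as np
from PIL import Image

def expanded_box(mask, ratio=0.2):
    """Bounding box of a binary mask, expanded by `ratio` on each side."""
    ys, xs = np.nonzero(mask)
    y0, y1, x0, x1 = ys.min(), ys.max(), xs.min(), xs.max()
    dy, dx = int((y1 - y0) * ratio), int((x1 - x0) * ratio)
    h, w = mask.shape
    return (max(0, y0 - dy), min(h, y1 + dy + 1),
            max(0, x0 - dx), min(w, x1 + dx + 1))

def build_instance_db(img_dir, ann_dir, out_pkl):
    db = []
    for ann_path in sorted(glob.glob(os.path.join(ann_dir, '*.png'))):
        ann = np.array(Image.open(ann_path))  # pixel = class_id*1000 + instance
        img = np.array(Image.open(os.path.join(img_dir, os.path.basename(ann_path))))
        for inst_id in np.unique(ann):
            if inst_id in (0, 10000):         # background / ignore region
                continue
            mask = ann == inst_id
            y0, y1, x0, x1 = expanded_box(mask)
            db.append({'frame': os.path.basename(ann_path),
                       'track_id': int(inst_id),
                       'box': (y0, y1, x0, x1),      # global position of the crop
                       'img_crop': img[y0:y1, x0:x1],
                       'mask_crop': mask[y0:y1, x0:x1]})
    with open(out_pkl, 'wb') as f:
        pickle.dump(db, f)
```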

Let me know if there are more questions :+1:

NanH5837 commented 1 year ago


Hi, do you have a way to visualize the tracking results? Thank you.
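
One generic way, assuming the results are in the standard KITTI MOTS txt format (one line per object: frame, track id, class id, image height, image width, COCO RLE string) and frames named 000000.png etc.: decode each mask with pycocotools and blend a per-track color onto the frame. All paths and the naming pattern below are assumptions:

```python
# Minimal sketch for visualizing KITTI MOTS-style tracking results.
# Assumed line format: frame track_id class_id height width rle
import os
from collections import defaultdict

import cv2
import numpy as np
from pycocotools import mask as rletools

def load_results(txt_path):
    """Group RLE masks by frame index."""
    per_frame = defaultdict(list)
    with open(txt_path) as f:
        for line in f:
            fields = line.strip().split(' ')
            frame, track_id = int(fields[0]), int(fields[1])
            h, w, rle = int(fields[3]), int(fields[4]), fields[5]
            per_frame[frame].append(
                (track_id, {'size': [h, w], 'counts': rle.encode('utf-8')}))
    return per_frame

def color_for(track_id):
    """Deterministic pseudo-random color per track id."""
    rng = np.random.RandomState(track_id)
    return rng.randint(0, 255, size=3)

def overlay(img_dir, txt_path, out_dir, alpha=0.5):
    os.makedirs(out_dir, exist_ok=True)
    for frame, objs in sorted(load_results(txt_path).items()):
        img = cv2.imread(os.path.join(img_dir, '%06d.png' % frame))
        if img is None:                       # frame missing on disk
            continue
        for track_id, rle in objs:
            m = rletools.decode(rle).astype(bool)
            img[m] = (1 - alpha) * img[m] + alpha * color_for(track_id)
        cv2.imwrite(os.path.join(out_dir, '%06d.png' % frame), img)

overlay('images/0001', 'results/0001.txt', 'vis/0001')
```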