MichiganCOG / ViP

Video Platform for Action Recognition and Object Detection in Pytorch
MIT License

Change DataLoader Collate Function #54

Open ehofesmann opened 4 years ago

ehofesmann commented 4 years ago

Currently, for datasets with bounding boxes, we have to specify the maximum possible number of bounding boxes up front so that all output batches have the same size: https://github.com/MichiganCOG/ViP/blob/74776f2575bd5339ba39c784bbda4f04cc859add/datasets/ImageNetVID.py#L27

Instead, we should pass a custom collate function to the DataLoader, like the one used in the PyTorch detection tutorial:

https://github.com/pytorch/vision/blob/6c2cda6a0eda4c835f96f18bb2b3be5043d96ad2/references/detection/utils.py#L237

https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
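For reference, here is a minimal sketch of that kind of collate function. As far as I can tell, the linked torchvision helper just transposes the batch instead of stacking it, so each sample keeps its own number of boxes. The sample layout `(frame, boxes)` and the string placeholders for frames are assumptions for illustration, not ViP's actual output format.

```python
def collate_fn(batch):
    """Transpose a list of (frame, boxes) samples into (frames, boxes).

    Nothing is stacked, so samples may carry different numbers of boxes.
    """
    return tuple(zip(*batch))


# Hypothetical samples: a frame placeholder plus a variable-length box list.
batch = [
    ("frame0", [[0, 0, 10, 10]]),
    ("frame1", [[1, 1, 5, 5], [2, 2, 8, 8]]),
]
frames, boxes = collate_fn(batch)
```

It would be wired in with something like `DataLoader(dataset, batch_size=4, collate_fn=collate_fn)`. The trade-off is that the training loop then receives tuples of per-sample targets rather than one stacked tensor, so any code that assumes a fixed `max_objects` dimension would need to change.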

natlouis commented 4 years ago

Yeah, collate is very useful for zero-padding your samples when they contain different numbers of objects (or have different lengths in other cases). But in my experience, it's specific to each problem. I'm not sure how it would generalize across all datasets if we add it to the DataLoader.
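To make the zero-padding case concrete, here is a rough sketch of a collate function that pads each batch only up to the largest box count in that batch, rather than to a dataset-wide maximum. It is plain Python so it runs on its own; `pad_boxes_collate`, the `(frame, boxes)` sample layout, and the 4-value box format are all hypothetical, and a real version would likely build tensors instead of lists.

```python
def pad_boxes_collate(batch, pad_value=0.0):
    """Pad every sample's box list to the largest box count in this batch.

    Returns (frames, padded_boxes, counts); counts[i] is the number of
    real (non-padding) boxes in sample i.
    """
    frames, boxes = zip(*batch)
    counts = [len(b) for b in boxes]
    max_boxes = max(counts, default=0)
    padded = [
        # Copy the real boxes, then append all-pad rows up to max_boxes.
        [list(box) for box in b]
        + [[pad_value] * 4 for _ in range(max_boxes - len(b))]
        for b in boxes
    ]
    return list(frames), padded, counts


# Hypothetical samples with one and two boxes respectively.
batch = [
    ("frame0", [[0, 0, 10, 10]]),
    ("frame1", [[1, 1, 5, 5], [2, 2, 8, 8]]),
]
frames, padded, counts = pad_boxes_collate(batch)
```

This also illustrates the generalization concern: the padding value, the box layout, and whether to return `counts` or a mask are dataset-specific choices, so one option might be to let each dataset supply its own collate function rather than hard-coding a single one.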