eto-ai / rikai

Parquet-based ML data format optimized for working with unstructured data
https://rikai.readthedocs.io/en/latest/
Apache License 2.0

Allow pytorch model to customize collate_fn to be used in torch DataLoader #530

Closed: eddyxu closed 2 years ago

eddyxu commented 2 years ago

All the official PyTorch object detection models expect the input as a list of tensors, List[torch.Tensor[3, H, W]], while the classification models expect a mini-batch of stacked tensors, torch.Tensor[N, 3, H, W].

Object detection models can then do the resize on the GPU and tolerate images of different sizes / channels.

We should probably expect that non-official models will have a wide range of expectations from collate_fn. Rikai should be flexible about that.
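
A minimal sketch of the difference, assuming plain torch and a toy dataset in place of COCO. The names RandomImages, list_collate and make_loader are hypothetical and only for illustration; they are not Rikai's API, and this is not necessarily how the fix was implemented.

```python
from typing import Callable, List, Optional, Sequence, Tuple

import torch
from torch.utils.data import DataLoader, Dataset


class RandomImages(Dataset):
    """Toy dataset (hypothetical) yielding 3-channel images of given sizes."""

    def __init__(self, sizes: Sequence[Tuple[int, int]]):
        self.sizes = list(sizes)

    def __len__(self) -> int:
        return len(self.sizes)

    def __getitem__(self, idx: int) -> torch.Tensor:
        h, w = self.sizes[idx]
        return torch.rand(3, h, w)


def list_collate(batch: List[torch.Tensor]) -> List[torch.Tensor]:
    """Keep the samples as a list instead of stacking them,
    which is the input shape detection models expect."""
    return list(batch)


def make_loader(
    dataset: Dataset,
    batch_size: int,
    collate_fn: Optional[Callable] = None,
) -> DataLoader:
    """Hypothetical hook: a model passes its own collate_fn, or None
    to fall back to DataLoader's default (stacking) behavior."""
    return DataLoader(dataset, batch_size=batch_size, collate_fn=collate_fn)


# Detection-style: images of different sizes stay in a list.
mixed = RandomImages([(300, 400), (200, 250)])
det_batch = next(iter(make_loader(mixed, 2, collate_fn=list_collate)))

# Classification-style: same-size images are stacked into one tensor.
same = RandomImages([(300, 300), (300, 300)])
cls_batch = next(iter(make_loader(same, 2)))
```

Here det_batch is a list of two tensors with their original shapes, and cls_batch is a single tensor of shape [2, 3, 300, 300]. Passing the mixed-size dataset through the default collate would instead fail when it tries to stack the tensors, which is why the collate_fn needs to be chosen per model.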

eddyxu commented 2 years ago

By default, the Torch DataLoader will throw an error with SSD + the full COCO dataset, even if we apply Resize(300, 300) to the images.

eddyxu commented 2 years ago

Closed via #643