Dataloader for COCO json and corresponding imagery

rbavery commented 1 year ago

This should load annotations as arrays converted to tensors, in a format that the OneFormer model in transformers accepts. See https://github.com/developmentseed/slickformer/blob/main/notebooks/evaluation_mrcnn.ipynb for an initial example where the annotations are loaded as numpy arrays. For speed, I think we can first try the image processor's encode_inputs method and encode these masks as semantic segmentation maps. Later we can treat these instance segments properly with a custom processor.

https://huggingface.co/docs/transformers/model_doc/oneformer#transformers.OneFormerImageProcessor.encode_inputs
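
To illustrate the semantic-map idea, here is a minimal sketch that collapses a list of per-instance binary masks into a single (H, W) map of class ids. The function name `masks_to_semantic_map`, the `background_id` default, and the rule that later masks overwrite earlier ones where they overlap are all assumptions made for illustration, not something taken from the repo.

```python
import numpy as np


def masks_to_semantic_map(masks, class_ids, shape, background_id=0):
    """Collapse per-instance binary masks into one (H, W) semantic map.

    masks: list of (H, W) arrays, nonzero where the instance is present.
    class_ids: one integer class id per mask.
    shape: (H, W) of the output, needed when the mask list is empty.
    """
    if len(masks) != len(class_ids):
        raise ValueError("need exactly one class id per mask")
    semantic = np.full(shape, background_id, dtype=np.int64)
    for mask, class_id in zip(masks, class_ids):
        # Assumption: where masks overlap, the later mask wins.
        semantic[np.asarray(mask, dtype=bool)] = class_id
    return semantic
```

An image with no annotations yields a map filled with `background_id`, so empty tiles can go through the same path.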

cc @srmsoumya

rbavery commented 1 year ago

This is almost done in https://github.com/developmentseed/slickformer/pull/7. The remaining to-dos are:

[ ] test this out by plotting, make sure crops look good
[ ] add step to explicitly normalize images
[ ] add step to format list of mask arrays into input expected by OneFormer
[ ] handle mask labels in another datapipe step before random split is applied, so that we can adjust the class hierarchy to the simple 3-class form
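
Two of the to-dos above, explicit normalization and collapsing the class hierarchy, could be sketched roughly as follows. The names `normalize_image` and `remap_class_ids` and the `FINE_TO_COARSE` table are hypothetical; the real category ids come from the COCO JSON, and the mean/std values would need to be computed for this imagery.

```python
import numpy as np

# Hypothetical fine-to-coarse table; real ids come from the COCO JSON categories.
FINE_TO_COARSE = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3}


def normalize_image(image, mean, std):
    """Scale a uint8 (H, W, C) image to [0, 1], then standardize per channel."""
    scaled = image.astype(np.float32) / 255.0
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    return (scaled - mean) / std


def remap_class_ids(class_ids, mapping):
    """Map each fine-grained class id to its coarse id, failing on unknown ids."""
    missing = sorted(set(class_ids) - set(mapping))
    if missing:
        raise KeyError(f"no coarse class for ids {missing}")
    return [mapping[c] for c in class_ids]
```

Doing the remap on the label ids rather than on rasterized masks keeps this step cheap, so it can sit early in the datapipe, before the random split.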

rbavery commented 1 year ago

@srmsoumya the remaining to-do here is:

[ ] add step to format list of mask arrays into input expected by OneFormer

Before working on the trainer, I think we first need to add a step to the labels datapipe used in this notebook, so that it returns mask inputs in the form expected by https://huggingface.co/docs/transformers/model_doc/oneformer#transformers.OneFormerImageProcessor.encode_inputs
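
One possible shape for that datapipe step, as a sketch only: build a single instance-id map per image plus a dict from instance id to class id, which is my reading of what the `segmentation_maps` and `instance_id_to_semantic_id` arguments of `encode_inputs` take. The function name `masks_to_instance_map` and the choice to register background as instance id 0 are assumptions, and the commented-out processor call has not been run against the model.

```python
import numpy as np


def masks_to_instance_map(masks, class_ids, shape, background_id=0):
    """Build an (H, W) instance-id map and an instance-id -> class-id dict.

    Instance ids start at 1; id 0 is reserved for background (an assumption).
    """
    if len(masks) != len(class_ids):
        raise ValueError("need exactly one class id per mask")
    instance_map = np.zeros(shape, dtype=np.int64)
    id_to_class = {0: background_id}
    for instance_id, (mask, class_id) in enumerate(zip(masks, class_ids), start=1):
        # Later instances overwrite earlier ones where masks overlap.
        instance_map[np.asarray(mask, dtype=bool)] = instance_id
        id_to_class[instance_id] = int(class_id)
    return instance_map, id_to_class


# Assumed usage with the image processor (not executed here):
# encoded = processor.encode_inputs(
#     [image], ["semantic"],
#     segmentation_maps=[instance_map],
#     instance_id_to_semantic_id=id_to_class,
#     return_tensors="pt",
# )
```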

Dataloader for COCO json and corresponding imagery #6