Open rbavery opened 1 year ago
this is almost done with https://github.com/developmentseed/slickformer/pull/7, remaining to-dos are:
- test this out by plotting, make sure crops look good
- add step to explicitly normalize images
- add step to format list of mask arrays into input expected by OneFormer
- handle mask labels in another datapipe step before random split is applied so that we can adjust the class hierarchy to the simple 3 class form
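A rough sketch of the last to-do, remapping labels to a 3-class form before the random split. The category names, the `FINE_TO_COARSE` table, and the annotation dict layout are all hypothetical placeholders, not the actual dataset classes or the datapipe's real record format:

```python
from typing import Dict, List

# Hypothetical mapping from fine-grained category names to a 3-class form.
# These names are placeholders; substitute the real class hierarchy.
FINE_TO_COARSE: Dict[str, str] = {
    "class_a_subtype_1": "class_a",
    "class_a_subtype_2": "class_a",
    "class_b_subtype_1": "class_b",
    "class_c_subtype_1": "class_c",
}

# Integer ids for the coarse classes; 0 is left free for background.
COARSE_TO_ID: Dict[str, int] = {"class_a": 1, "class_b": 2, "class_c": 3}


def remap_labels(annotations: List[dict]) -> List[dict]:
    """Return copies of the annotations with coarse category names and ids.

    Assumes each annotation is a dict with a "category" key (an assumption
    about the record layout). Unknown categories raise KeyError so that
    unmapped labels are noticed early instead of silently dropped.
    """
    remapped = []
    for ann in annotations:
        coarse = FINE_TO_COARSE[ann["category"]]
        remapped.append(
            {**ann, "category": coarse, "category_id": COARSE_TO_ID[coarse]}
        )
    return remapped
```

A function like this could be applied as a map step in the labels datapipe ahead of the split, so that train and validation samples both see the same 3 classes.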
@srmsoumya still to-do here is
before working on the trainer, I think we first would need to add a step to the labels datapipe that is used in this notebook so that mask inputs are returned for https://huggingface.co/docs/transformers/model_doc/oneformer#transformers.OneFormerImageProcessor.encode_inputs
This should load annotations as arrays converted to tensors in a format that can be accepted by the OneFormer model in transformers. See https://github.com/developmentseed/slickformer/blob/main/notebooks/evaluation_mrcnn.ipynb for an initial example of this where the annotations are loaded as numpy arrays. For speed, I think we can test out the encode method of the image processor and encode these masks as semantic segmentation maps. Later we can properly treat these instance segments with a custom processor.
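As a starting point for the semantic-map shortcut, here is a minimal sketch that collapses a list of per-instance binary masks into one 2D class-id map. The function name `masks_to_semantic_map`, the `background_id` default, and the rule that later masks overwrite earlier ones where they overlap are my assumptions, not something the notebook already does:

```python
from typing import List

import numpy as np


def masks_to_semantic_map(
    masks: List[np.ndarray], class_ids: List[int], background_id: int = 0
) -> np.ndarray:
    """Merge binary instance masks of shape (H, W) into one (H, W) class-id map.

    Pixels covered by no mask get `background_id`. Where masks overlap,
    the mask that comes later in the list wins (an arbitrary choice).
    """
    if len(masks) != len(class_ids):
        raise ValueError("masks and class_ids must have the same length")
    if not masks:
        raise ValueError("at least one mask is needed to infer the map shape")

    semantic = np.full(masks[0].shape, background_id, dtype=np.int64)
    for mask, class_id in zip(masks, class_ids):
        # Treat any nonzero pixel as part of the instance.
        semantic[mask.astype(bool)] = class_id
    return semantic
```

The resulting array is the kind of per-image map that I'd expect to pass as `segmentation_maps` to the image processor, but that still needs to be checked against the `encode_inputs` docs linked above.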
cc @srmsoumya