Kitware / dive

Media annotation and analysis tools for web and desktop. Get started at https://viame.kitware.com
https://kitware.github.io/dive
Apache License 2.0

[FEATURE] Support for semantic segmentation #1108

Open russelldj opened 2 years ago

russelldj commented 2 years ago

Is your feature request related to a problem? If so, Please describe.
I have a semantic segmentation problem pretty similar to NOAA ADAPT where I'd like to iteratively annotate data and train a model. My understanding from poking around is there isn't yet support for semantic segmentation models (i.e. assign a class label to every pixel in the scene), but I could be mistaken. I was curious if this had been considered?

Describe the solution you'd like
Supporting semantic segmentation on the VIAME side should be relatively straightforward; it would just require wrapping a segmentation model. I've been using mmsegmentation, which is closely related to the mmdetection package that's wrapped already. My memory of the specifics is a bit fuzzy, but I think the kwcoco MultiPolygon that's already passed around for detections would be sufficient to handle segmentation as well, so integration with the normal VIAME pipelines should be straightforward. It's possible that I could contribute this to VIAME myself.

I'm guessing this change is much more disruptive on the front end. I think you could represent a segmentation as a collection of detections, but this seems clunky from a UI perspective. If this functionality turned out to be widely used, I think a dedicated segmentation mode would probably be worthwhile.
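To make the "collection of detections" idea concrete, here is a minimal sketch in plain Python. It assumes the segmentation is a 2D grid of integer class ids with 0 as background, and it splits that grid into one detection-like record per connected region. The function name `mask_to_detections` and the record fields (`label`, `bbox`, `area`) are hypothetical and chosen for illustration; they are not the DIVE or VIAME data model.

```python
from collections import deque


def mask_to_detections(mask, background=0):
    """Split a per-pixel label mask into one record per connected region.

    `mask` is a list of rows of integer class ids (an assumed layout).
    Each record holds the class label, an inclusive bounding box
    (x_min, y_min, x_max, y_max), and the pixel count of the region.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    detections = []

    for row in range(height):
        for col in range(width):
            label = mask[row][col]
            if label == background or seen[row][col]:
                continue

            # Breadth-first flood fill over 4-connected pixels of the same class.
            queue = deque([(row, col)])
            seen[row][col] = True
            xs, ys = [], []
            while queue:
                y, x = queue.popleft()
                xs.append(x)
                ys.append(y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < height
                        and 0 <= nx < width
                        and not seen[ny][nx]
                        and mask[ny][nx] == label
                    ):
                        seen[ny][nx] = True
                        queue.append((ny, nx))

            detections.append(
                {
                    "label": label,
                    "bbox": (min(xs), min(ys), max(xs), max(ys)),
                    "area": len(xs),
                }
            )
    return detections


example_mask = [
    [1, 1, 0, 2],
    [1, 0, 0, 2],
    [0, 0, 1, 0],
]
example_detections = mask_to_detections(example_mask)
```

In this small example, class 1 ends up as two separate records because its pixels are not contiguous, which hints at why this representation gets clunky in a UI: a single semantic class can fragment into many independent detections.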

Describe alternatives you've considered
If I don't use VIAME, I'd probably use a standard labeling tool such as CVAT or Labelme and locally convert the annotations to a usable format for model training.

mattdawkins commented 2 years ago

We have plans to do more background scene segmentation work as a part of one effort, though it might not happen immediately.

waxlamp commented 2 years ago

@subdavis, @BryonLewis, @marySalvi: it would be good to discuss how we can support different frontend workflow modes in general--perhaps yielding a methodology for supporting a wider array of annotation styles.

russelldj commented 2 years ago

Based on my initial experience performing this task in CVAT, it seems that a generic polygon annotator is largely sufficient. The one modification that is very useful is an ordering property. That way, if two labels overlap, the label for the overlapping region is chosen intelligently. CVAT defaults to choosing the most recently created annotation, i.e. it is "in front" of other annotations. Additionally, it supports moving an annotation to the foreground or background.
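As a rough illustration of how an ordering property resolves overlaps, here is a minimal sketch in plain Python that paints polygons into a per-pixel label mask from back to front, so the annotation with the highest order wins wherever polygons overlap. The `PolygonAnnotation` class, its `z_order` field, and the `rasterize` function are hypothetical names for this example; this is not CVAT's or DIVE's implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class PolygonAnnotation:
    label: int           # class id; 0 is reserved for background in this sketch
    z_order: int         # higher values are treated as "in front"
    points: List[Point]  # polygon vertices as (x, y)


def point_in_polygon(x: float, y: float, points: List[Point]) -> bool:
    """Even-odd ray casting: count edge crossings to the right of (x, y)."""
    inside = False
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(annotations: List[PolygonAnnotation], width: int, height: int):
    """Paint polygons back to front into a mask of class ids.

    sorted() is stable, so annotations with equal z_order keep their list
    order and the later one is painted on top.
    """
    mask = [[0] * width for _ in range(height)]
    for ann in sorted(annotations, key=lambda a: a.z_order):
        for row in range(height):
            for col in range(width):
                # Sample each pixel at its center.
                if point_in_polygon(col + 0.5, row + 0.5, ann.points):
                    mask[row][col] = ann.label
    return mask


square_a = PolygonAnnotation(label=1, z_order=0, points=[(0, 0), (4, 0), (4, 4), (0, 4)])
square_b = PolygonAnnotation(label=2, z_order=1, points=[(2, 2), (6, 2), (6, 6), (2, 6)])
example_mask = rasterize([square_a, square_b], width=6, height=6)
```

Here the two squares overlap in the region from (2, 2) to (4, 4), and that region receives label 2 because `square_b` has the higher `z_order`. Moving an annotation to the foreground or background then amounts to changing its `z_order` and rasterizing again.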