Semantic segmentation vs Semantic thing / class

nikmo33 commented 2 years ago

I'm currently looking at the semantic segmentation support, and there doesn't seem to be a good use case for splitting the segmentation into thing / stuff classes. Given the assumption that a 3D point can only belong to a single semantic class, this should always hold? I was looking into integrating semantic NeRF into my use case and found that the current setup seems to require splitting the labels into things / stuff, which adds complexity, whereas I would like to just load the semantic labels as-is and have a single head perform semantic segmentation. An additional head could be added on top of thing / stuff, but this seems unnecessary and adds more complexity to the code. What do people think about changing the structure of the Semantics class to the following?

from dataclasses import dataclass, field
from pathlib import Path
from typing import List

import torch


@dataclass
class Semantics:
    """Dataclass for semantic labels."""

    filenames: List[Path]
    """filenames to load semantic label data"""
    classes: List[str]
    """class labels for semantic data"""
    colors: torch.Tensor
    """color mapping for semantic classes"""
    # Mutable defaults must use default_factory; a bare `= []` raises ValueError in a dataclass.
    is_thing: List[bool] = field(default_factory=list)
    """Optional flag to indicate if the class is a thing; can be used for use cases that require this"""
    class_loss_weights: List[float] = field(default_factory=list)
    """Loss weighting for each class; if empty, uniform weighting is assumed"""

This should be flexible enough to support the current use case, and it should allow simplifying the model code quite a bit. I'm happy to assign this to myself if there is some consensus on this design decision. Thanks!
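To make the two optional fields concrete, here is a minimal sketch of how a consumer might interpret them. The helper names `resolve_loss_weights` and `thing_class_names` are hypothetical and not part of nerfstudio; the only behavior taken from the proposal is that an empty `class_loss_weights` means uniform weighting and that `is_thing` is optional.

```python
from typing import List


def resolve_loss_weights(classes: List[str], class_loss_weights: List[float]) -> List[float]:
    """Return one loss weight per class (hypothetical helper, not nerfstudio API)."""
    if not class_loss_weights:
        # An empty list means uniform weighting, as described in the proposal.
        return [1.0] * len(classes)
    if len(class_loss_weights) != len(classes):
        raise ValueError("class_loss_weights must have one entry per class")
    return list(class_loss_weights)


def thing_class_names(classes: List[str], is_thing: List[bool]) -> List[str]:
    """Return the names of classes flagged as things (hypothetical helper)."""
    if not is_thing:
        # No flags supplied: the thing / stuff distinction is simply not used.
        return []
    if len(is_thing) != len(classes):
        raise ValueError("is_thing must have one entry per class")
    return [name for name, flag in zip(classes, is_thing) if flag]


classes = ["road", "car"]
print(resolve_loss_weights(classes, []))        # uniform weights
print(thing_class_names(classes, [False, True]))
```

With a single semantic head, the resolved weights could then be converted to a tensor and passed as the `weight` argument of a cross-entropy loss, so the model itself would not need to know about things versus stuff.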

nikmo33 commented 2 years ago

I have outlined a proposal in this PR - https://github.com/nerfstudio-project/nerfstudio/pull/886 . It omits the is_thing and class_loss_weights fields above, but they can be added fairly trivially if required.

ethanweber commented 2 years ago

Hey @nikmo33, I just replied to your PR! We can discuss there. 🙂 I don't think we need the is_thing and class_loss_weights, but it would be nice to have an arbitrary number of things returned from the InputDataset. Happy to work together on this.

nerfstudio-project / nerfstudio

Semantic segmentation vs Semantic thing / class #880