How does it explain in segmentation?

blackfeather-wang / ISDA-for-Deep-Networks

An efficient implicit semantic augmentation method, complementary to existing non-semantic techniques.

How does it explain in segmentation? #6

Closed by Harryoung 2 years ago

Harryoung commented 3 years ago

Wonderful work! But it still leaves me with a question. ISDA makes sense for image classification, since the main object in the image can undergo semantic transformations. But in segmentation, does ISDA operate on each pixel? How do you explain the semantic transformation of a single pixel?

blackfeather-wang commented 3 years ago

Thank you for your attention to our work! Indeed, ISDA operates on each pixel in semantic segmentation.

As you noted, ISDA was derived mainly for the basic classification task, where the semantic transformations can be easily visualized in pixel space, so the motivation is clear.

Although it is less straightforward there, we still believe that the same kind of semantic transformation exists for each pixel in dense prediction tasks. For example, a pixel from the sky may be affected by the weather, a pixel from a car by the type of car, and a pixel from a person by their clothes or the viewing angle. Perhaps ISDA could be improved for segmentation tasks by considering the correlation between adjacent pixels. We may focus on this in the future.
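To make the per-pixel view concrete, here is a minimal sketch (not the repository's code) of how a dense feature map can be flattened so that every spatial location becomes one classification sample, after which a per-class feature covariance can be estimated just as in the classification setting. The function names, the array shapes, and the ignore label 255 are assumptions made for illustration.

```python
import numpy as np

def flatten_dense_features(features, labels, ignore_index=255):
    """Turn a dense feature map into per-location classification samples.

    features: (N, C, H, W) array of deep features (assumed layout).
    labels:   (N, H, W) integer class map at the same resolution.
    ignore_index: label value to skip (255 is an assumption here).
    Returns (M, C) features and (M,) labels with ignored locations dropped.
    """
    n, c, h, w = features.shape
    # Move channels last so each row of the result is one location's feature.
    flat_feats = features.transpose(0, 2, 3, 1).reshape(-1, c)
    flat_labels = labels.reshape(-1)
    keep = flat_labels != ignore_index
    return flat_feats[keep], flat_labels[keep]

def class_covariances(flat_feats, flat_labels, num_classes):
    """Estimate one feature covariance matrix per class."""
    c = flat_feats.shape[1]
    covs = np.zeros((num_classes, c, c))
    for k in range(num_classes):
        x = flat_feats[flat_labels == k]
        if len(x) > 1:  # need at least two samples for a covariance
            covs[k] = np.cov(x, rowvar=False, bias=True)
    return covs
```

With features of shape (N, C, H, W), this yields up to N*H*W samples per batch, so each class covariance is estimated from locations rather than from whole images.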

Harryoung commented 3 years ago

Yes! You are right! I agree that ISDA could be improved for segmentation by considering adjacent pixels. It is impressive that ISDA can bring a gain of around 1% on Cityscapes just by operating on separate pixels! By the way, just minutes ago I noticed that ISDA actually operates on a small window corresponding to the receptive field rather than on a single pixel. So it makes some sense, because these small "patches" do have meaningful semantic transformations. Is this already a way of considering the adjacent pixels? Haha.

blackfeather-wang commented 3 years ago

Thank you for these helpful comments!

I think you are right. Because of the downsampling operations in CNNs, ISDA actually operates on small patches (e.g., 8x8, 16x16) of the original image. :)
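As a rough illustration of the patch arithmetic, the sketch below maps a feature-map location to the nominal image patch it covers at a given output stride, and subsamples a label map to the feature resolution. The helper names and the every-stride-th-pixel subsampling are assumptions, not the repository's code, and the true receptive field of a deep feature is typically larger than this nominal patch.

```python
import numpy as np

def patch_for_location(row, col, stride):
    """Nominal image patch for one feature-map location.

    Returns (top, left, bottom, right) in pixels; bottom and right
    are exclusive. stride is the network's total downsampling factor.
    """
    top, left = row * stride, col * stride
    return top, left, top + stride, left + stride

def subsample_labels(labels, stride):
    """Keep every stride-th label so the label map matches the
    feature-map resolution (a simple nearest-style subsampling)."""
    return labels[..., ::stride, ::stride]

# Feature location (1, 2) at stride 8 covers an 8x8 patch of the image.
top, left, bottom, right = patch_for_location(1, 2, stride=8)
```

At stride 8 or 16, each per-location sample therefore summarizes at least an 8x8 or 16x16 image region, matching the patch sizes mentioned above.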

We believe that semantic augmentation techniques like ISDA have the potential to be a powerful complement to existing explicit data augmentation (DA) methods. Maybe there will be better solutions for specific tasks like segmentation, detection, etc. 😄

Harryoung commented 3 years ago

I have recommended your excellent work to my friends and am preparing to apply ISDA to my own projects. Really impressed by the beautiful math and equations! Looking forward to your future work! 😄

lihuikenny commented 3 years ago

Hi, is this work effective for feature extraction?