hq-jiang / instance-segmentation-with-discriminative-loss-tensorflow

Tensorflow implementation of "Semantic Instance Segmentation with a Discriminative Loss Function"
MIT License
170 stars, 47 forks

pretrained model for semantic segmentation #1

Closed tmquan closed 6 years ago

tmquan commented 6 years ago

Hi @hq-jiang,

Nice work on the discriminative loss. I am trying to use your code on other data such as Cityscapes or CVPPP (leaf segmentation), as discussed in the original paper. Could you release the code and model for semantic segmentation as well? I would like to train from scratch on those datasets, so separate training for the semantic mask is required.

Bests,

hq-jiang commented 6 years ago

Hi Tran Minh,

Thank you for your interest. Unfortunately, you discovered my code before I was ready to show it publicly. To be up front, sorry for the bad documentation. Let me have a look at the semantic segmentation code. Since I trained it on AWS and deleted the instance, the actual refactored version is lost. I will work on it over the weekend, so expect some updates. If you see anything, I am happy to receive pull requests.

Best regards Han

tmquan commented 6 years ago

Hi @hq-jiang, I would like to revisit this issue. I have successfully loaded your pretrained semantic segmentation model in order to train further with the discriminative loss. However, when I visualize the results in TensorBoard they are quite messy and I am totally lost.

May I ask where the semantic prediction is placed in the ENet: before or after the high-dimensional feature (prediction) in your implementation? In other words, what do last_prelu and prediction stand for? Do they correspond to the semantic segmentation (12 classes) and the high-dimensional feature (3 in your case)?

For the clustering, is it necessary to mask the high-dimensional feature with the semantic prediction before passing it to the mean-shift clustering algorithm, which has only one parameter, the bandwidth, that can be tuned later?

In your opinion, is it a good idea to train semantic segmentation simultaneously with the high-dimensional prediction using the discriminative loss?

Thanks,

hq-jiang commented 6 years ago

Hi @tmquan,

I have uploaded my raw semantic segmentation code for reference. You might need to change things to make it work; I am a little bit busy right now and I would need to set up a new AWS instance to test it.

Regarding your questions:

  1. last_prelu is the name defined by enet.py. It refers to the output of bottleneck5.1 (see image). prediction, which has the name scope Instance/transfer_layer/conv2d_transpose, is the replacement for fullconv (see image). So you are right: last_prelu should have dimension 12 in your case, and prediction has 3 high-dimensional features. Attention: my implementation works for a binary class problem (lane marking or no lane marking). For a multi-class, multi-instance problem you might need to adapt my implementation:

"In contrast to the CVPPP dataset, Cityscapes is a multi-class instance segmentation challenge. Therefore, we run our loss function independently on every semantic class, so that instances belonging to the same class are far apart in feature space, whereas instances from different classes can occupy the same space. For example, the cluster centers of a pedestrian and a car that appear in the same image are not pushed away from each other."
(quoted from: Semantic Instance Segmentation with a Discriminative Loss Function)

[image: enet]

  2. The bandwidth parameter can be tuned once you have finished training.

  3. The authors of Fast Scene Understanding for Autonomous Driving trained 3 networks simultaneously and found that it improved overall accuracy.
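
For readers adapting the binary setup to a multi-class dataset, the per-class idea quoted in item 1 can be sketched roughly as follows. This is a NumPy illustration, not the code in this repository. The function names, the flattened `(N, D)` input layout, the default margins and weights, the choice to sum (rather than average) over classes, and the `ignore_classes` argument are all assumptions made for the example.

```python
import numpy as np


def discriminative_loss(emb, inst, delta_v=0.5, delta_d=1.5,
                        alpha=1.0, beta=1.0, gamma=0.001):
    """Discriminative loss for one group of pixels.

    emb:  (N, D) pixel embeddings.
    inst: (N,) instance id of each pixel.
    The margins and weights are illustrative defaults (assumption).
    """
    ids = np.unique(inst)
    means = np.stack([emb[inst == i].mean(axis=0) for i in ids])  # (C, D)

    # Variance term: pull each pixel to within delta_v of its instance mean.
    l_var = np.mean([
        np.mean(np.maximum(
            np.linalg.norm(emb[inst == i] - mu, axis=1) - delta_v, 0.0) ** 2)
        for i, mu in zip(ids, means)
    ])

    # Distance term: push different instance means at least 2 * delta_d apart.
    l_dist = 0.0
    n_inst = len(ids)
    if n_inst > 1:
        dist = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=2)
        hinge = np.maximum(2.0 * delta_d - dist, 0.0) ** 2
        l_dist = hinge[~np.eye(n_inst, dtype=bool)].mean()

    # Regularization term: keep instance means close to the origin.
    l_reg = np.linalg.norm(means, axis=1).mean()
    return alpha * l_var + beta * l_dist + gamma * l_reg


def per_class_loss(emb, inst, sem, ignore_classes=(0,)):
    """Run the loss independently on every semantic class and sum the results.

    sem: (N,) semantic class id of each pixel. Summing over classes and
    skipping class 0 as background are assumptions of this sketch.
    """
    total = 0.0
    for c in np.unique(sem):
        if c in ignore_classes:
            continue
        m = sem == c
        total += discriminative_loss(emb[m], inst[m])
    return total
```

Because each class is handled in its own call, instance means from different classes never enter the same distance term, so a pedestrian and a car are free to occupy the same region of feature space.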

tmquan commented 6 years ago

So sorry for bothering you a lot.

May I repeat one last question: for the clustering, is it necessary to mask the high-dimensional feature with the semantic prediction before passing it to the mean-shift clustering algorithm?

Thank you very much.

hq-jiang commented 6 years ago

Ah, sorry I missed that part. Yes, you are right, you need to seed your instances with semantic segmentation. In my case that was not necessary, because the biggest instance is always the background.
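
A minimal sketch of what masking before clustering could look like, assuming an `(H, W, D)` embedding map and an `(H, W)` semantic prediction. The `mean_shift` below is a small flat-kernel version written only for illustration (it builds an N x N distance matrix, so it suits small inputs only); it is not the clustering code used in this repository. The function names and the `foreground_class` argument are hypothetical.

```python
import numpy as np


def mean_shift(points, bandwidth, n_iter=50):
    """Minimal flat-kernel mean-shift; returns one cluster label per point."""
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(n_iter):
        # Move every shifted point to the mean of the original points
        # that lie within `bandwidth` of its current position.
        d = np.linalg.norm(shifted[:, None, :] - points[None, :, :], axis=2)
        within = d <= bandwidth
        counts = within.sum(axis=1, keepdims=True)
        new = within.astype(float) @ points / np.maximum(counts, 1)
        shifted = np.where(counts > 0, new, shifted)

    # Merge converged positions that lie within one bandwidth of a known mode.
    labels = np.full(len(points), -1, dtype=int)
    modes = []
    for i, p in enumerate(shifted):
        for k, m in enumerate(modes):
            if np.linalg.norm(p - m) <= bandwidth:
                labels[i] = k
                break
        else:
            modes.append(p)
            labels[i] = len(modes) - 1
    return labels


def cluster_instances(embedding, semantic, foreground_class, bandwidth):
    """Cluster only the pixels predicted as `foreground_class`.

    embedding: (H, W, D) high-dimensional features.
    semantic:  (H, W) predicted class ids.
    Returns an (H, W) instance map; 0 marks pixels outside the class.
    """
    mask = semantic == foreground_class
    instance_map = np.zeros(semantic.shape, dtype=int)
    if mask.any():
        instance_map[mask] = mean_shift(embedding[mask], bandwidth) + 1
    return instance_map
```

For a multi-class problem, `cluster_instances` would be called once per semantic class. `bandwidth` stays the only knob to tune after training.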