0bserver07 / One-Hundred-Layers-Tiramisu

Keras implementation of "The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation" by Simon Jégou, Michal Drozdzal, David Vazquez, Adriana Romero, and Yoshua Bengio.
https://arxiv.org/abs/1611.09326
MIT License

missing skip connections? #6

Open ahundt opened 7 years ago

ahundt commented 7 years ago

I'm not sure if I'm reading your code right, but is it perhaps missing the actual skip connections?

https://github.com/0bserver07/One-Hundred-Layers-Tiramisu/blob/master/model-tiramasu-103.py

I'd expect to see a line similar to the following: https://github.com/farizrahman4u/keras-contrib/blob/master/keras_contrib/applications/densenet.py#L614
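
For context, a minimal sketch of the kind of decoder skip that line implements, using the Keras functional API (illustrative layer sizes and names, not the repo's code):

```python
from keras.layers import Input, Conv2D, Conv2DTranspose, concatenate
from keras.models import Model

# Illustrative sketch: the decoder upsamples, then concatenates the result
# with the saved encoder tensor before the next dense block.
inputs = Input(shape=(224, 224, 3))
skip = Conv2D(64, 3, padding='same', activation='relu')(inputs)    # encoder level
down = Conv2D(128, 3, strides=2, padding='same', activation='relu')(skip)
up = Conv2DTranspose(64, 3, strides=2, padding='same')(down)       # transition up
merged = concatenate([up, skip], axis=-1)      # the skip connection in question
out = Conv2D(32, 3, padding='same', activation='relu')(merged)
model = Model(inputs, out)
```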

Dref360 commented 7 years ago

Yeah, I see the same thing. Using the functional API would make it simpler, and extracting a `Tiramisu` class would make the code cleaner as well, since only the create function changes.

0bserver07 commented 7 years ago

Oh wow, thanks for bringing this to my attention; that looks like a horrific mistake on my part. While switching to the Keras 2 API, I removed the Merge/Concat layer and never added it back.

Ironically, I used this style to avoid leaving anything out of the model! (A few days of training and tweaking the wrong, non-working model 😄)

I will get to it

0bserver07 commented 7 years ago

@Dref360 thanks, good tip, I will definitely try using the functional API. The reason I didn't extract a separate class at the start was my trouble understanding the growth rate m in the original paper, which still seems incorrect to me: something weird happens where the feature maps don't grow by 16 in the middle bottleneck when m = 16.
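
For reference, my current reading of the paper's Table 2, assuming growth rate k = 16 and a 48-filter initial convolution: each layer in a dense block adds k feature maps, but the transition up forwards only the block's newly produced maps (not the full concatenation), which is likely why the count seems to stop growing by 16 at the bottleneck. A quick sanity check:

```python
# Sanity check of FC-DenseNet103 feature-map counts (m), assuming
# growth rate k = 16 and a 48-filter first convolution (per my reading
# of Table 2 in the paper).
k = 16
down_layers = [4, 5, 7, 10, 12]    # dense blocks on the downsampling path
up_layers = [12, 10, 7, 5, 4]      # dense blocks on the upsampling path

m = 48                             # after the initial 3x3 convolution
skips = []
for n in down_layers:
    m += n * k                     # each layer in the block adds k maps
    skips.append(m)                # saved for the skip connection
    print("down DB(%d): m = %d" % (n, m))

new_maps = 15 * k                  # bottleneck forwards only its new maps
m += new_maps
print("bottleneck DB(15): m = %d" % m)   # 656 + 240 = 896

for n in up_layers:
    # Transition up carries only the previous block's new maps,
    # then concatenates them with the matching skip connection.
    m = new_maps + skips.pop()
    new_maps = n * k
    m += new_maps
    print("up DB(%d): m = %d" % (n, m))  # 1088, 816, 576, 384, 256
```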


ahundt commented 7 years ago

Perhaps it is worth considering settling on the models in https://github.com/farizrahman4u/keras-contrib/blob/master/keras_contrib/applications/densenet.py and https://github.com/aurora95/Keras-FCN/blob/master/models.py#L288 (the second one calls the keras-contrib version and just tweaks a couple of parameters and the top). Several people have contributed to that code, and according to notes in the keras-contrib issues it has been tested on non-public datasets by its primary author.

0bserver07 commented 7 years ago

Thanks for the references, I will definitely look at them when I get a little more time.

In the meantime, last night I created this version of the model (https://github.com/0bserver07/One-Hundred-Layers-Tiramisu/blob/fc-dense-with-func-api/model-tiramasu-67-func-api.py).

This time the skip connections are there: (https://github.com/0bserver07/One-Hundred-Layers-Tiramisu/blob/fc-dense-with-func-api/model-tiramasu-67-func-api.py#L110)

But I'm still not quite sure whether something is missing; I will re-run this with a couple of tests that I had in mind.

0bserver07 commented 7 years ago

New Stuff: (https://github.com/0bserver07/One-Hundred-Layers-Tiramisu/blob/master/model-tiramasu-67-func-api.py)

Close enough?

ahundt commented 7 years ago

Is this the one in the README that gets accuracy between 0.6 and 0.7? If so, it still seems pretty different from the paper; I wonder what's up.

0bserver07 commented 7 years ago

Yes, it is! Oh, interesting, wait, which part is different? I feel like my code doing the skips is not getting there :/

Dref360 commented 7 years ago

Could it be the batch_size? They use 3 in the paper. Also, they are using a fully convolutional network.

You could get more info here

scholltan commented 7 years ago

Did anyone achieve the results shown in the paper?

0bserver07 commented 7 years ago

The updated branch (not master) has the somewhat correct version of the implementation (still missing the L layers of skips after each block).

Even though I didn't get the numbers shown in the paper, I tried it on a private dataset and it works great.

My personal take: it's not worth the training time; finding a model that performs at SOTA for your specific task seems better.

scholltan commented 7 years ago

@0bserver07 maybe use a ResNet instead; would it work better?

ahundt commented 7 years ago

The problem with the training time is that concatenation allocates too much memory, so batch sizes must remain tiny; see https://github.com/gpleiss/efficient_densenet_pytorch for details.
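
To make that concrete, a rough back-of-the-envelope count (my own illustration, not from the linked repo): with naive concatenation, every layer in a dense block keeps its concatenated input around for backprop, so stored activations grow roughly quadratically with block depth.

```python
# Rough illustration of why naive dense-block concatenation eats memory:
# layer i sees input_maps + i * k concatenated feature maps, and each of
# those concatenated inputs is retained for the backward pass.
def stored_feature_maps(num_layers, k=16, input_maps=48):
    return sum(input_maps + i * k for i in range(num_layers))

for n in (4, 12, 15):
    print("DB(%d): ~%d stored feature maps" % (n, stored_feature_maps(n)))
```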

I made a feature request regarding the problem at: https://github.com/tensorflow/tensorflow/issues/12948

csjfwang commented 7 years ago

Hi @0bserver07,

It's good to see that FC-DenseNet performs well on your own dataset. I'm also eager to try FC-DenseNet on CamVid and my own dataset, but I have run into many problems, so I hope you can give more details about your training and testing process.

Thanks a lot!!

ohernpaul commented 6 years ago

I have a working functional model with skips if anyone wants me to push it.

Edit: with IoU loss implementation.
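
In the meantime, here is a minimal sketch of what a generic soft (differentiable) IoU loss can look like in Keras; a hypothetical version for reference, and the actual implementation may differ:

```python
from keras import backend as K

def soft_iou_loss(y_true, y_pred, smooth=1.0):
    # Soft IoU: treat predicted probabilities as soft set memberships,
    # IoU = |A ∩ B| / |A ∪ B|, computed per sample over channels-last
    # 4D tensors (batch, height, width, classes).
    intersection = K.sum(y_true * y_pred, axis=[1, 2, 3])
    union = K.sum(y_true + y_pred, axis=[1, 2, 3]) - intersection
    iou = (intersection + smooth) / (union + smooth)
    return 1.0 - K.mean(iou)

# Usage (hypothetical): model.compile(optimizer='adam', loss=soft_iou_loss)
```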

PhilippKopp commented 6 years ago

Hi @ohernpaul, I would be very interested in the functional model. Could you please make it available? With the IoU implementation it would be awesome!