0bserver07 / One-Hundred-Layers-Tiramisu

Keras Implementation of The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation by (Simon Jégou, Michal Drozdzal, David Vazquez, Adriana Romero, Yoshua Bengio)
https://arxiv.org/abs/1611.09326
MIT License

Collaborate + Submit to keras-contrib + see Keras-FCN #5

Open ahundt opened 7 years ago

ahundt commented 7 years ago

Might you be interested in submitting this code as a pull request to the official keras-contrib repository, which is the upstream source for Keras and already has a DenseNetFCN implementation?

These keras-contrib issues are also relevant to this repository:

Keras-FCN, which I was planning to adapt for a merge into keras-contrib, also has a SegDataGenerator implementation with some of the features you are looking for in your comments, plus additional models and experimental support for COCO.

I figured it might be worth collaborating because it appears we are working on the same thing (training DenseNetFCN), and both running into the same accuracy limitations even with independent implementations.

0bserver07 commented 7 years ago
  1. Hey! Yes, I would be happy to collaborate. Thanks for bringing these to my attention. I'm not quite familiar with this repo; by upstream, do you mean that it will be merged into Keras sometime soon?
  2. And yes, it seems we are both going down the same path of DenseNets plus or minus something, while facing the same issues.

I will look into this and your comments. Let me know if there is one specific item that's hot-burning that I can look at for you as well.

See, I realized that in terms of the model, my previous implementation of SegNet with further complexity can do wonders.

However, in the meantime, I'm implementing Mask R-CNN, and I want to replicate the result this time around 😄 (hopefully)

I'm running into 2 problems: first, I can't fit the model into memory 😄 (I think switching the backend to TF all of a sudden is causing some trouble), and second, I don't know how to create this custom ROI-Align thingy from the paper, which aligns 2 tasks under a pixel-wise loss.

All in all, if that works out :), I can guarantee better results for same tasks.

PS: I just looked at (https://github.com/farizrahman4u/keras-contrib/issues/63)

  1. I tried some crazy Hyper-Params and some of them did interesting stuff.
    • SGD with Cyclical weight decay, like it goes down down down and then back up.
    • Adding more augmentation to the Training set and reducing regularization
    • I tried Adam; anything other than RMSProp was better.
  2. I realized that the authors ran one of their models for 750 epochs! 0.o I mean, I ran one for 350 epochs, but the gradient just turns into an empty tuna can, nothing but smells!
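
To make the "goes down down down and then back up" idea concrete, here is a minimal sketch of a triangular cyclical schedule. The thread does not say which schedule shape or which values were used, so the function name `triangular_cycle`, its parameters, and the linear shape are all assumptions for illustration only.

```python
def triangular_cycle(step, low, high, half_period):
    """Hypothetical triangular schedule (not taken from the repository).

    Starts at `high`, decreases linearly to `low` over `half_period` steps,
    then increases linearly back to `high`, and repeats.
    """
    if half_period <= 0:
        raise ValueError("half_period must be positive")
    position = step % (2 * half_period)
    # Distance travelled away from the start of the cycle, folded back
    # on itself during the second half so the value climbs again.
    if position > half_period:
        position = 2 * half_period - position
    fraction = position / half_period  # 0 at cycle start, 1 at the lowest point
    return high - (high - low) * fraction


# One full cycle of 8 steps between 1.0 and 0.0.
values = [triangular_cycle(s, low=0.0, high=1.0, half_period=4) for s in range(9)]
print(values)
```

The returned value could be fed to whatever hyperparameter is being cycled (the comment mentions weight decay) once per epoch or per batch; which granularity works better is left open here.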

Oh, and also: the paper has a tiny mistake in the diagram, so keep that in mind when calculating the parameter m, which depends on the growth rate.
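
For reference, a rough sketch of how the feature-map count can be recomputed by hand along the downsampling path. It assumes that m denotes the number of feature maps at the end of a block, and the concrete numbers (48 initial maps, growth rate 16, block sizes 4, 5, 7, 10, 12, 15) are recalled from the paper's FC-DenseNet103 configuration rather than taken from this thread. The helper name `dense_block_feature_maps` is hypothetical, and the sketch does not identify which entry of the diagram is wrong.

```python
def dense_block_feature_maps(initial_maps, growth_rate, layers_per_block):
    """Feature maps at the end of each dense block on the downsampling path.

    Assumes every layer in a dense block adds `growth_rate` feature maps,
    which are concatenated onto the block's input.
    """
    counts = []
    m = initial_maps
    for n_layers in layers_per_block:
        m += n_layers * growth_rate
        counts.append(m)
    return counts


# Assumed FC-DenseNet103 settings (see the caveats above).
print(dense_block_feature_maps(48, 16, [4, 5, 7, 10, 12, 15]))
# -> [112, 192, 304, 464, 656, 896]
```

Comparing these hand-computed values against the numbers printed in the paper is one way to spot where the two disagree.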

ahundt commented 7 years ago

keras-contrib is where new functionality now goes for Keras until it is ready for prime time: https://github.com/fchollet/keras/blob/master/CONTRIBUTING.md#pull-requests

I kept the replies numbered below so we can refer back to them. The best version of the DenseNetFCN model code is in ahundt/keras-contrib on the densenet-atrous branch, and in Keras-FCN.

The most hot-burning item is (4), since I've got evidence it works in Keras-FCN with ResNets, but it is not DenseNetFCN specific. I'd say the second most burning, and specific to the tiramisu DenseNetFCN network, might be (6a) + (1), which are both easy steps.

  1. NADAM seems to be a concrete improvement over Adam. I might try it out if there are better hyperparameters, as they mentioned in https://github.com/tensorflow/tensorflow/pull/9175. You're definitely right that optimizers don't solve all the world's problems. :-)
  2. That's a lot of epochs! How could they make progress for that long?
  3. That mistake has been accounted for in the linked DenseNetFCN.
  4. Pretrained DenseNet ImageNet weights + atrous convolutions look like they might be a solid non-tiramisu approach, since this worked for ResNet-50. I mention this in https://github.com/farizrahman4u/keras-contrib/issues/63 and have an implementation in https://github.com/aurora95/Keras-FCN/blob/master/models.py#L235, but that would require transferring the original DenseNet ImageNet weights or training from scratch.
  5. A larger/better dataset always helps. I've been working on that with COCO in Keras-FCN, and I think tweaking that to work could make a huge difference. (5a) One peculiarity that still needs to be resolved is that a single pixel can be in multiple classes. I was thinking of changing the output to be single class, but adding an option for one-hot encoding so categorical-crossentropy will give more credit for one match in any category. This involves reasonably small changes in Keras-FCN. (5b) I was thinking of loading the segmentation masks directly from pycocotools rather than from the files, like this loop but without numpy.save.
  6. Other datasets that might also be good options and are easily integrated via what I've already implemented for other datasets as per the Keras-FCN dataset instructions:
  7. What page/column of the paper is the ROI thing you mention?
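
As a small illustration of the atrous (dilated) convolution mentioned in item (4), here is a dependency-free 1D sketch. It is not the Keras-FCN or keras-contrib code; the function name `dilated_conv1d` is made up for this example. In Keras itself, the convolution layers expose a `dilation_rate` argument for the same idea.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation with gaps of `dilation` between kernel taps.

    With dilation=1 this is an ordinary convolution layer's sliding dot
    product; larger dilations widen the receptive field without adding weights.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    # Number of input samples covered by one placement of the kernel.
    span = (len(kernel) - 1) * dilation + 1
    output = []
    for start in range(len(signal) - span + 1):
        total = sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        output.append(total)
    return output


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
print(dilated_conv1d(signal, kernel, dilation=1))  # kernel covers 3 samples
print(dilated_conv1d(signal, kernel, dilation=2))  # same 3 weights cover 5 samples
```

The same three weights see a wider window as the dilation grows, which is why atrous convolutions are attractive for reusing pretrained classification weights in a segmentation network without downsampling as aggressively.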
ahundt commented 7 years ago
  1. also worth noting is https://github.com/nicolov/segmentation_keras
0bserver07 commented 7 years ago

A lot of good stuff here, I will get to it tonight : )

Fahim-F commented 7 years ago

Hello, excuse me, I want to know about the file "fc-densenet-model.py". Does it work or not?

ahundt commented 7 years ago

I know this one does: https://github.com/farizrahman4u/keras-contrib/blob/master/keras_contrib/applications/densenet.py

Fahim-F commented 7 years ago

Thanks :)