Hi Brendan,

Thanks for providing this implementation! I've found it very useful when getting into segmentation.
Like you, I was attempting to reproduce the baseline from the 100-Layer Tiramisu paper.
My best results on FCDenseNet67 (which should be reproducible from the train.py script I've provided) were

SGD Test - Loss: 0.3137 | Acc: 0.9060 | IOU: 0.6158

using the following modifications to your codebase:
- Changing the DenseNet dropout layers from dropout2d to dropout
- Changing the optimizer to SGD rather than RMSProp (this might be the most important change)
- Using torchvision.transforms.functional for the joint transforms
- Not normalizing the input images by the dataset mean, and not using class weights; neither is used anywhere in the 100-Layer Tiramisu repo
- (Edit) Using an explicit dataloader for fine-tuning
These are the train and validation losses coming from the same run:
Train - Loss: 0.1287 | Acc: 0.9557
Val - Loss: 0.1740 | Acc: 0.9465 | IOU: 0.7219
Finally, the command I used for this result was:
train.py --data_path [data_path] --model FCDenseNet67 --epochs 850 --optimizer SGD --lr_init 1e-2 --lr_decay 1 --batch_size 4 --ft_start 750 --ft_batch_size 1 --dir [dir]
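To spell out what those flags encode, here is a rough sketch; these helpers are hypothetical stand-ins, not the actual train.py internals. SGD starts at lr 1e-2 (--optimizer SGD --lr_init 1e-2), and from epoch 750 training switches to the explicit fine-tuning dataloader (--ft_start 750):

```python
import torch


def make_optimizer(model, lr_init=1e-2):
    # Corresponds to --optimizer SGD --lr_init 1e-2.
    return torch.optim.SGD(model.parameters(), lr=lr_init)


def select_loader(epoch, train_loader, ft_loader, ft_start=750):
    # Corresponds to --ft_start 750: train on crops up to that epoch,
    # then fine-tune on the explicit dataloader from it onward.
    return ft_loader if epoch >= ft_start else train_loader
```

In this sketch, ft_loader would be built with batch size 1 over full-size images, matching --ft_batch_size 1.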
Let me know if you have any concerns with my changes, as I thought I should send a PR for them so that it might be easier for other people to reproduce the paper in the future.
Wesley