Williamongh closed this issue 5 years ago
Thanks for your question!
First, I'd like to apologize for my poor naming and sparse documentation. The model and all the code were written for a competition that gave us a set of MR images and the corresponding masks, so the model may not work well on other datasets. All the tests mentioned here were done on that dataset.
If you look back at the original ResNet paper, there are 5 downsampling layers in ResNet-50, and the first one is a convolution with a stride of 2. In my opinion, this is also an encoder block, but I didn't point that out in my code. So there are actually 4 downsampling layers (encoder blocks) in my model.
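To make the count of downsampling layers concrete, here is a minimal sketch that walks a square input through the ResNet-50 stages using the standard convolution output-size formula. It is not code from this repository. The stage names (`conv1`, `maxpool`, `layer1` to `layer4`) follow torchvision's convention, the 224-pixel input is only an example, and the 3x3 stride-2 entries for `layer2` to `layer4` are an assumption (some implementations put the stride on a 1x1 convolution instead, which gives the same sizes for even inputs).

```python
# Sketch: spatial size after each ResNet-50 stage, in pure Python.

def conv_out_size(size, kernel, stride, padding):
    """Output length along one spatial dimension for a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# (stage name, kernel, stride, padding) of the layer that sets each stage's resolution.
# conv1 and maxpool follow the ResNet paper; the layer2-layer4 entries are an
# assumption about where the stride-2 convolution sits inside the stage.
RESNET50_STAGES = [
    ("conv1", 7, 2, 3),    # first downsampling: stride-2 convolution
    ("maxpool", 3, 2, 1),  # second downsampling
    ("layer1", 3, 1, 1),   # keeps resolution
    ("layer2", 3, 2, 1),   # third downsampling
    ("layer3", 3, 2, 1),   # fourth downsampling
    ("layer4", 3, 2, 1),   # fifth downsampling
]

def stage_sizes(input_size):
    """Return [(stage name, spatial size)] for a square input of the given size."""
    sizes = []
    size = input_size
    for name, kernel, stride, padding in RESNET50_STAGES:
        size = conv_out_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes

def count_downsampling_steps():
    """Number of stages that halve the resolution."""
    return sum(1 for _, _, stride, _ in RESNET50_STAGES if stride == 2)

if __name__ == "__main__":
    for name, size in stage_sizes(224):
        print(f"{name}: {size}x{size}")
    print("downsampling steps:", count_downsampling_steps())
```

Counted this way, the stride-2 `conv1` is the first of five halving steps, which is why it can be read as one more encoder block in front of the ones named in the code.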
I also realized that this kind of encoder may waste feature map resolution, so I tried using only the last 4 downsampling layers in ResNet-50. It used much more memory and the result was even worse. I also tried my model with all 5 downsampling layers in ResNet-50; the result was close to the uploaded version, but it used more memory.
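A rough way to see why skipping the first downsampling step costs memory is the sketch below. It reads "using only the last 4 downsampling layers" as setting the stem convolution's stride to 1, which is an assumption about the experiment, not something taken from the repository. The 256-pixel input and the function name `feature_map_areas` are hypothetical as well.

```python
# Sketch: feature map area per stage with and without the first stride-2 step.

def feature_map_areas(input_size, first_stride):
    """Spatial area (H * W) at each of six ResNet-50-style stages for a square input.

    first_stride is the stride of the stem convolution: 2 in the standard
    network, 1 if that downsampling step is removed (assumed variant).
    """
    strides = [first_stride, 2, 1, 2, 2, 2]
    areas = []
    size = input_size
    for stride in strides:
        size //= stride
        areas.append(size * size)
    return areas

standard = feature_map_areas(256, first_stride=2)
no_first_step = feature_map_areas(256, first_stride=1)

# Ratio of areas stage by stage; channel counts are assumed unchanged.
ratios = [b / a for a, b in zip(standard, no_first_step)]

if __name__ == "__main__":
    print("standard:     ", standard)
    print("no first step:", no_first_step)
    print("area ratios:  ", ratios)
```

Under these assumptions every feature map has four times as many spatial positions, so with unchanged channel counts the activations take roughly four times the memory. This only illustrates the trend and says nothing about the accuracy difference reported above.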
As for how to decide the number of encoder/decoder pairs, I think it depends on the task you are working on (you can try different numbers of layers and compare the results). I also think https://github.com/MrGiovanni/UNetPlusPlus will help. His model can use the results of U-Nets with different depths, and he gives a clear explanation in his slides.
I may translate all my slides and docs into English and upload them to GitHub when I am free from work.
Best Regards.
Thanks for your specific and detailed reply!
And I apologize for raising this stupid issue because of my lack of experience. I've got your point. I'll try to modify the model to fit my task.
Good luck with the competition! Best Regards.
Thanks for sharing! I wonder why you use 3 encoders followed by 3 decoders. AFAIK, people usually use 4 encoder blocks and 4 decoder blocks to construct a U-Net. May I ask what your reasoning was? Best regards.