Hi @MrtnMndt,
Thanks for this work! I wanted to know the window size and stride used to obtain the semantic segmentation results shown in the poster https://github.com/MrtnMndt/meta-learning-CODEBRIM/blob/master/imgs/CVPR19_CODEBRIM_poster.jpg.
Hey @mgpadalkar, thanks for your interest in our work :).
The images on the poster were from a project demo at that time.
If I recall correctly, it wasn't anything fancy for this specific image, just a proof of concept (we did more advanced things in later follow-ups). A window size on the scale of what is used in training will do, so something like 128 or 256 (or 224 if one wants to match training exactly, although I believe the exact integer didn't matter much). The stride would ideally be 1, but to save some compute you can set it to something like 10% of the window size. As long as there is enough overlap it will do more or less fine; it also depends on how granular you'd like the result to be.
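A minimal sketch of how such window positions could be enumerated (my own illustration, not code from this repo), assuming the image is at least as large as the window. The values follow the suggestion above: a 224 window with a stride of roughly 10% of the window size.

```python
def window_positions(height, width, window=224, stride=22):
    """Return (top, left) coordinates so the windows cover the whole image."""
    tops = list(range(0, height - window + 1, stride))
    lefts = list(range(0, width - window + 1, stride))
    # make sure the last row/column of windows touches the image border
    if tops[-1] != height - window:
        tops.append(height - window)
    if lefts[-1] != width - window:
        lefts.append(width - window)
    return [(top, left) for top in tops for left in lefts]

positions = window_positions(1024, 2048, window=224, stride=22)
print(len(positions), "windows")  # each window is classified independently
```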
This sounds very computationally heavy at first, but remember that each window is independent, so you can "mini-batch" process a big image: chop it up according to the window positions, run the crops through the model in parallel, and then stitch/average the results back together.
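A minimal sketch of that idea, assuming a PyTorch classifier `model` that maps a `(N, 3, 224, 224)` batch of crops to `(N, num_classes)` logits; the function name `sliding_window_scores`, the `num_classes=6` default, and the sigmoid multi-label head are assumptions for illustration, not necessarily what was used for the poster.

```python
import torch

@torch.no_grad()
def sliding_window_scores(model, image, positions, window=224,
                          batch_size=64, num_classes=6):
    """image: (3, H, W) float tensor; positions: list of (top, left) coords."""
    _, H, W = image.shape
    score_map = torch.zeros(num_classes, H, W)  # accumulated per-pixel scores
    counts = torch.zeros(1, H, W)               # how many windows covered each pixel
    for start in range(0, len(positions), batch_size):
        chunk = positions[start:start + batch_size]
        crops = torch.stack([image[:, t:t + window, l:l + window] for t, l in chunk])
        probs = torch.sigmoid(model(crops))      # per-crop, per-class probabilities
        for (t, l), p in zip(chunk, probs):
            # paint each window's prediction over the pixels it covers
            score_map[:, t:t + window, l:l + window] += p.view(num_classes, 1, 1)
            counts[:, t:t + window, l:l + window] += 1
    return score_map / counts.clamp(min=1)       # average where windows overlap
```

The averaged per-pixel scores can then be thresholded or color-coded per class to produce an overlay in the style of the poster images.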
Hi @MrtnMndt,
Thanks for the quick response :smiley: I had assumed it would be similar to what you describe, but your comment makes it much clearer. Thanks for the help! :+1: