Closed tmabraham closed 4 years ago
Hi Tanishq, yes, you're right: a file is usually skipped because no tissue content was detected in it, so no patches are extracted after segmentation (which is consistent with the terminal output not showing an error). This is most frequently caused by either:
You might want to look at the segmentation outputs, which should be saved in the same folder as the patches and stitches, to see what the segmentation looks like and whether the problem affects only rare cases in your dataset. If so, you can follow the guide to tune the segmentation/patching parameters for individual slides without having to reprocess the entire batch. If instead you notice the segmentation is unsatisfactory for most of your data, you might consider setting a different set of parameters globally.
P.S. This is why I recommend in the guide first going through the dataset once with only --seg enabled. That pass just saves the segmentation masks, which takes minimal time, and it gives you a chance to see whether anything major needs to be tweaked and to edit parameters if needed, before running through the whole dataset with patching and stitching.
@fedshyvana Thanks for the suggestions. I fine-tuned the parameters a little bit on a small subset of the dataset and then started creating the segmentation masks, but then it reached a slide that was empty and raised an error.
Is it possible to indicate slides to skip?
Even better, could the program automatically skip through slides if there is an error and return the list of skipped slides?
Never mind, it seems like the process_list csv will allow me to skip certain files by setting it to 0?
@tmabraham Yes, that is correct: you can skip files by setting them to 0 in your edited csv file. Just make sure you save the csv file as a copy (or rename it) and pass it as an argument when running the script. As for skipping files automatically when certain criteria are met and generating logs of their status, I think those are reasonable features that should not be difficult to implement. When I have time I will try to add them in future updates.
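For anyone who would rather script the csv edit than do it by hand, here is a minimal sketch using only the Python standard library. The column names `slide_id` and `process`, the file names, and the helper `mark_slides_to_skip` are assumptions for illustration; check the header of your own process_list csv before relying on them. It writes a new file instead of overwriting the original, in line with the advice above to save a copy.

```python
import csv


def mark_slides_to_skip(src_csv, dst_csv, skip_ids,
                        id_col="slide_id", flag_col="process"):
    """Copy src_csv to dst_csv, setting flag_col to 0 for slides in skip_ids.

    id_col and flag_col are assumed column names; adjust them to match
    the header of your own csv. Returns the number of rows changed.
    """
    skip_ids = set(skip_ids)

    # Read every row, keeping the original column order.
    with open(src_csv, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    # Flip the flag to 0 only for the requested slides.
    changed = 0
    for row in rows:
        if row[id_col] in skip_ids:
            row[flag_col] = "0"
            changed += 1

    # Write the result to a separate file so the original stays untouched.
    with open(dst_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    return changed
```

A call would look like `mark_slides_to_skip("process_list.csv", "process_list_edited.csv", ["slide_b.svs"])` (hypothetical file names), after which the edited copy is the one to pass to the script.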
@fedshyvana If you haven't been able to add the automatic file-skipping, I have a basic implementation for this and I might be able to open a pull request sometime next week.
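To make the idea concrete, automatic skipping could be as simple as wrapping the per-slide work in a try/except and collecting the failures. This is only a sketch of the general pattern, not the repository's code or the implementation mentioned above: `run_batch` and the `process_slide` callable are hypothetical stand-ins for whatever does the segmentation and patching for one slide.

```python
def run_batch(slide_paths, process_slide):
    """Process each slide, skipping (and recording) any that raise an error.

    process_slide is a hypothetical callable that takes one slide path and
    raises an exception on failure. Returns (processed, skipped), where
    skipped is a list of (path, error description) pairs.
    """
    processed, skipped = [], []
    for path in slide_paths:
        try:
            process_slide(path)
        except Exception as exc:
            # Record the failure and move on instead of aborting the batch.
            skipped.append((path, repr(exc)))
        else:
            processed.append(path)
    return processed, skipped
```

The `skipped` list can then be printed or written to a log at the end of the run, so the slides that need a second look are all in one place.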
I tried to run the following command:
It processes the first image fine, but then I get the following error when processing the second image:
This seems to indicate that it is skipping a file and not creating patches for a file. Indeed, when I remove the stitching option, I can see the program is skipping through many images. I get terminal output like this:
Whereas for a properly processed image (without stitching option) I get:
Here is an example of an image file in question (displayed with plt.imshow):
Why is the program skipping images? My understanding is that right now, you use simple binary thresholding, correct? Is it possible that the threshold is not correct for my dataset?
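To illustrate what I mean, here is a toy sketch of how a fixed binary threshold can leave a slide with no detected tissue at all. It assumes the threshold is applied to a single-channel intensity image (for example saturation) and that pixels above the threshold count as tissue; I don't know that this is exactly what the program does, and `tissue_fraction` and the numbers are made up for illustration.

```python
def tissue_fraction(image, threshold):
    """Fraction of pixels whose value is strictly above threshold.

    image is a 2D list of intensity values (an assumed single-channel
    input); pixels above the threshold are treated as tissue.
    """
    total = 0
    kept = 0
    for row in image:
        for value in row:
            total += 1
            if value > threshold:
                kept += 1
    return kept / total if total else 0.0


# A faintly stained toy "slide": every value is below 20.
faint_slide = [[5, 12],
               [14, 3]]

high = tissue_fraction(faint_slide, 20)  # nothing survives the threshold
low = tissue_fraction(faint_slide, 8)    # half the pixels survive
```

With the higher threshold the mask is empty, so there would be nothing to patch and the slide would be skipped silently; with the lower one, half of the pixels are kept.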