jlevy44 / PathFlowAI

A High-Throughput Workflow for Preprocessing, Deep Learning Analytics and Interpretation in Digital Pathology
https://jlevy44.github.io/PathFlowAI/
MIT License

Sparse vs dense annotation #20

Closed (asmagen closed 4 years ago)

asmagen commented 4 years ago

My pathologist performed an initial multi-class manual annotation of 5 slides in QuPath, selecting a few good-quality areas of interest for each category I want to characterize (basically prioritizing quality over quantity). I had assumed that training needs a more comprehensive annotation, prioritizing quantity over quality. Please see a low-res example of an annotation he made. I know that the optimal annotation strategy and the required number of slides aren't known, but is the current strategy too sparse? Does the algorithm ignore the non-annotated regions, or does it treat them as background, so that a missed annotation of a tumor area, for example, would decrease accuracy?

Thanks

jlevy44 commented 4 years ago

Yeah, what you have may be borderline too sparse, so you’ll likely need far more annotations of the minority class. In any case, the oversampling and class weighting functions offered through PFAI help improve the recall of those regions at the expense of precision (overcalling the minority class where it should not be called).
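As a rough illustration of the two ideas mentioned above, here is a minimal sketch in plain Python. It is not PathFlowAI's API: the function names `inverse_frequency_weights` and `oversample`, the patch identifiers, and the `stroma` class name are all hypothetical and serve only to show how class weighting and oversampling rebalance a minority class.

```python
from collections import Counter
import random


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1, common classes below 1.
    (Hypothetical helper, not part of PathFlowAI.)
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def oversample(patches, labels, seed=0):
    """Resample minority classes with replacement up to the majority count.

    (Hypothetical helper, not part of PathFlowAI.)
    """
    rng = random.Random(seed)
    by_class = {}
    for patch, label in zip(patches, labels):
        by_class.setdefault(label, []).append(patch)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        # Draw extra copies at random until this class reaches the target size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((p, label) for p in items + extra)
    rng.shuffle(balanced)
    return balanced


# Toy data: 8 majority patches and 2 minority patches (made-up labels).
labels = ["stroma"] * 8 + ["tumor"] * 2
patches = [f"patch_{i}" for i in range(len(labels))]

weights = inverse_frequency_weights(labels)
balanced = oversample(patches, labels)
```

Both approaches push the model toward the minority class, which is why recall tends to rise while precision falls. Oversampling repeats the same few patches, so it cannot replace additional annotations.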

jlevy44 commented 4 years ago

Also, there are a lot of technical artifacts in this tissue that may decrease performance. One way to circumvent this is to eliminate patches that have been annotated as artifacts; another is to detect these artifacts, which can be done with the tools offered in this package but is not its primary focus.
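The first option, dropping patches annotated as artifacts, could look roughly like the sketch below. It assumes each patch carries a dictionary of per-class area fractions. That record layout, the `drop_artifact_patches` name, and the `max_fraction` threshold are assumptions made for illustration, not how PathFlowAI stores or filters patches.

```python
def drop_artifact_patches(patches, artifact_key="artifact", max_fraction=0.0):
    """Keep patches whose artifact area fraction is at most max_fraction.

    Each patch is assumed to look like
    {"id": ..., "fractions": {"tumor": 0.7, "artifact": 0.1}}
    (a hypothetical layout, not PathFlowAI's patch format).
    """
    return [
        patch
        for patch in patches
        if patch["fractions"].get(artifact_key, 0.0) <= max_fraction
    ]


# Toy patch records with made-up area fractions.
patches = [
    {"id": "a", "fractions": {"tumor": 1.0}},
    {"id": "b", "fractions": {"tumor": 0.8, "artifact": 0.2}},
    {"id": "c", "fractions": {"artifact": 1.0}},
]

strict = drop_artifact_patches(patches)                     # no artifact allowed
lenient = drop_artifact_patches(patches, max_fraction=0.25)  # tolerate up to 25%
```

A nonzero `max_fraction` keeps patches that are mostly tissue with a small artifact region, at the cost of some label noise.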

asmagen commented 4 years ago

Thanks. But does the algorithm ignore the non-annotated regions, or does it treat them as background, so that a missed annotation of a tumor area, for example, would decrease accuracy? And regarding the artifacts, we're adding an artifact category to which the various artifact types will all map. Would that work well, rather than eliminating them from training? Such elimination seems suboptimal in my opinion because it would remove valuable training regions that contain a mix of artifact and real patterns.
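For concreteness, mapping several artifact types to a single category could be sketched as below. The artifact type names (`fold`, `blur`, `pen_mark`) and the helper `merge_artifact_labels` are hypothetical placeholders, not names taken from QuPath or PathFlowAI.

```python
# Hypothetical artifact annotation names; replace with the ones actually used.
ARTIFACT_TYPES = {"fold", "blur", "pen_mark"}


def merge_artifact_labels(label, merged_name="artifact"):
    """Map any artifact type to one shared category; leave other labels as-is."""
    return merged_name if label in ARTIFACT_TYPES else label


# Toy annotation labels before and after merging.
raw_labels = ["tumor", "fold", "blur", "stroma", "pen_mark"]
merged_labels = [merge_artifact_labels(label) for label in raw_labels]
```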

jlevy44 commented 4 years ago

Yes, if you are going to annotate it, you'd want to make sure that the annotations correspond to what you are aiming to predict. Quality does play a role, but predictions can still be made in less than ideal circumstances. We have not explored the effect of annotation quality through direct experimentation, but it has been an observation of ours and others.

Re artifacts: that seems reasonable. However, we have not tested this framework on capturing tissue artifacts and would encourage you to explore other frameworks for artifact removal. I can point out a few.

asmagen commented 4 years ago

Regarding the ASAP annotation tool, did you install it on a Mac? The authors of the package don't offer Mac installation support, so I was wondering how you installed and use it.

sumanthratna commented 4 years ago

It looks like you should be able to build it with cmake and GCC: https://github.com/computationalpathologygroup/ASAP#compilation