KevinMenden / scaden

Deep Learning based cell composition analysis with Scaden.
https://scaden.readthedocs.io
MIT License
71 stars 25 forks

Scale model? #92

Closed: khkk378 closed this 3 years ago

khkk378 commented 3 years ago

I need a way to assess the biological reliability of the estimates. One way I was thinking of was to include a bunch of cell types that I know aren't part of my tissue of interest, and then use those estimates as a metric for reliability. That would include training on 100-200 samples, with maybe 100 cell types in total. Do you think I would need to scale the networks for that?

KevinMenden commented 3 years ago

Hi @khkk378,

Interesting idea - I'm not sure, though, whether that will give you the reliability estimates you want. 100-200 samples would be too small a training dataset (but maybe you mean something different). I have never tried to estimate more than, say, 10-15 different cell types; beyond that it gets very tricky. 100 different cell types is extremely challenging, and I have the feeling that you would get rather nonsensical results from that :-)

Regarding scaling, the networks should be expressive enough to deal with that. Nevertheless, I don't think that would work.

I know that the lack of uncertainty estimates is currently a major drawback of Scaden, and I plan to include something like this soon. If you're interested, the easiest way of getting them now would be to run the different Scaden models with dropout enabled at prediction time, say 100 times, and then average the results. The standard deviation of those results would give you some uncertainty estimate.

Let me know if you're interested in trying this out; I wanted to test it too at some point. Turning this into a discussion.
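
For readers who want to see the dropout-at-prediction-time idea from the comment above in concrete form, here is a minimal sketch. It does not use Scaden's code or API: the toy one-hidden-layer network, its random weights, and the names `forward_with_dropout` and `mc_dropout_predict` are all hypothetical, written in plain Python only to illustrate the procedure (repeat stochastic forward passes, average them, and take the standard deviation per cell type).

```python
import math
import random
import statistics


def softmax(logits):
    """Turn raw scores into fractions that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_with_dropout(x, w_hidden, w_out, rate, rng):
    """One stochastic forward pass of a toy network (NOT a Scaden model)."""
    hidden = []
    for row in w_hidden:
        act = max(0.0, sum(w * xi for w, xi in zip(row, x)))  # ReLU unit
        # Dropout stays active at prediction time: each hidden unit is
        # dropped with probability `rate`; survivors are rescaled.
        keep = rng.random() >= rate
        hidden.append(act / (1.0 - rate) if keep else 0.0)
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return softmax(logits)


def mc_dropout_predict(x, w_hidden, w_out, rate=0.5, n_passes=100, seed=0):
    """Repeat the stochastic pass and summarise it per cell type."""
    rng = random.Random(seed)
    passes = [
        forward_with_dropout(x, w_hidden, w_out, rate, rng)
        for _ in range(n_passes)
    ]
    per_type = list(zip(*passes))  # one tuple of predictions per cell type
    means = [statistics.fmean(v) for v in per_type]
    stds = [statistics.stdev(v) for v in per_type]
    return means, stds


# Hypothetical toy setup: 5 genes, 8 hidden units, 3 cell types.
init = random.Random(42)
w_hidden = [[init.uniform(-1, 1) for _ in range(5)] for _ in range(8)]
w_out = [[init.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
x = [0.2, 1.5, 0.0, 0.7, 1.1]  # made-up expression values for one sample

means, stds = mc_dropout_predict(x, w_hidden, w_out)
for i, (m, s) in enumerate(zip(means, stds)):
    print(f"cell type {i}: {m:.3f} +/- {s:.3f}")
```

With a Keras model, the analogous step would be calling the model with `training=True` so that its dropout layers stay active; how that maps onto Scaden's own prediction code is not covered in this thread and would need to be checked against the source.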