Effect of cell type abundance on AUC values

neurorestore / Augur

Cell type prioritization in single-cell data

MIT License


Hi @hayfre - without knowing more about your particular dataset I can only speak in generalities, but my intuition is that if you can rule out a biological effect, there may be a substantial technical effect on this population (for instance, cells of this type from one of your libraries are stressed or dying). If there are only a few cells of this type, the same cells would be present in every subsample, which would make the two conditions easier for the RF (random forest) classifier to separate.
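To make the "present in every subsample" point concrete, here is a minimal Python sketch (not Augur code) that estimates how much two random subsamples overlap for a given population size. It assumes cells are drawn uniformly without replacement; the function name `mean_subsample_overlap` and its arguments are hypothetical and only for illustration.

```python
import random
from itertools import combinations


def mean_subsample_overlap(n_cells, subsample_size, n_subsamples=50, seed=0):
    """Average fraction of cells shared between pairs of random subsamples.

    Assumption: each subsample is drawn uniformly without replacement from
    the n_cells available for one cell type in one condition.
    """
    rng = random.Random(seed)
    # A population smaller than the subsample size is used in full.
    size = min(subsample_size, n_cells)
    draws = [set(rng.sample(range(n_cells), size)) for _ in range(n_subsamples)]
    # Compare every pair of subsamples and average the shared fraction.
    overlaps = [len(a & b) / size for a, b in combinations(draws, 2)]
    return sum(overlaps) / len(overlaps)


if __name__ == "__main__":
    for n_cells in (20, 22, 200, 2000):
        frac = mean_subsample_overlap(n_cells, subsample_size=20)
        print(f"{n_cells} cells available: mean overlap {frac:.2f}")
```

On average the shared fraction is roughly `subsample_size / n_cells`, so a population barely larger than the subsample size contributes nearly the same cells, stressed or not, to every subsample, whereas an abundant population is resampled almost from scratch each time.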

This is just one potential explanation, but you could experiment with changing the subsample size to see whether your results are stable. We found they generally were (Supp. Figs. 6 and 10 in the Augur paper), but the AUC for your small cell population may be more sensitive. The only thing I would suggest is that if you lower the subsample size, you may want to increase the number of subsamples to give the prioritization a better chance to 'converge'.
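One way to organize such a stability check is sketched below in Python. It only builds a plan of settings to try; it does not call Augur. The rule of scaling the number of subsamples so that the total number of cells drawn never falls below the baseline is an assumption made for illustration, not a recommendation from the Augur paper. The baseline values of 20 and 50 are what I believe the package defaults to, and the keys `subsample_size` and `n_subsamples` are meant to mirror the corresponding `calculate_auc` arguments; check both against the version you have installed. The helper names are hypothetical.

```python
import math


def subsample_plan(sizes, base_size=20, base_n=50):
    """Pair each subsample size with a number of subsamples to run.

    Assumed heuristic: keep (subsample size x number of subsamples) at or
    above the baseline product, so smaller subsamples get more repeats.
    """
    plan = []
    for size in sizes:
        n = max(base_n, math.ceil(base_n * base_size / size))
        plan.append({"subsample_size": size, "n_subsamples": n})
    return plan


def auc_spread(aucs):
    """Range of one cell type's AUC across runs; a small range suggests stability."""
    return max(aucs) - min(aucs)


if __name__ == "__main__":
    for settings in subsample_plan([20, 15, 10, 5]):
        print(settings)
```

Each entry would correspond to one Augur run. Collecting the AUC of the small population from every run and passing the values to `auc_spread` gives a quick read on whether its ranking depends on the subsample size.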


Effect of cell type abundance on AUC values #19