satijalab / azimuth

A Shiny web app for mapping datasets using Seurat v4
https://satijalab.org/azimuth
GNU General Public License v3.0

Inconsistency between annotation levels #176

Open viktorfeketa opened 1 year ago

viktorfeketa commented 1 year ago

I annotated my dataset using the Azimuth mouse motor cortex reference. I was very happy to see that each cell in my dataset now has 3 levels of cell type/cluster annotation! (Thank you for a great tool!)

However, the annotation levels are inconsistent with each other. For example, 39413 cells in my dataset received the "predicted.subclass" label "Astro". If I summarize the "predicted.class" labels of these same "astrocytes", I get "GABAergic: 233"; "Glutamatergic: 1485"; "Non-neuronal: 37695". In other words, 1718 of these cells have a neuronal top-level "predicted.class" label (GABAergic or Glutamatergic) but an astrocyte second-level "predicted.subclass" label. The same thing happens for many cells at other levels. The "class"/"subclass"/"cluster" levels do not form a nested hierarchy; they seem to be assigned independently of each other.
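For anyone who wants to run the same check, here is a minimal sketch of the cross-tabulation I am describing, in plain Python. The cell labels below are made-up toy values (not my real data), and the `parent_class` map is my assumption about which class each subclass belongs to in the reference; `cross_tabulate` and `count_inconsistent` are hypothetical helper names, not part of Azimuth or Seurat.

```python
from collections import Counter, defaultdict

# Hypothetical toy metadata: one (predicted.subclass, predicted.class) pair per cell.
# These values are invented for illustration; they are not from the real dataset.
cells = [
    ("Astro", "Non-neuronal"),
    ("Astro", "Non-neuronal"),
    ("Astro", "Glutamatergic"),
    ("Astro", "GABAergic"),
    ("Pvalb", "GABAergic"),
    ("L5 IT", "Glutamatergic"),
    ("L5 IT", "Non-neuronal"),
]

# Assumed parent class of each subclass in the reference hierarchy (illustrative subset).
parent_class = {
    "Astro": "Non-neuronal",
    "Pvalb": "GABAergic",
    "L5 IT": "Glutamatergic",
}


def cross_tabulate(pairs):
    """Count how often each class label occurs within each subclass label."""
    table = defaultdict(Counter)
    for subclass, cls in pairs:
        table[subclass][cls] += 1
    return table


def count_inconsistent(pairs, parents):
    """Count cells whose class label differs from the assumed parent of their subclass."""
    return sum(1 for subclass, cls in pairs if parents.get(subclass) != cls)


table = cross_tabulate(cells)
for subclass in sorted(table):
    print(subclass, dict(table[subclass]))
print("inconsistent cells:", count_inconsistent(cells, parent_class))
```

With a Seurat object in R, I believe the same table can be produced with `table(obj$predicted.subclass, obj$predicted.class)`, where `obj` stands for the annotated object.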

I can see how this could happen computationally, but it doesn't make biological sense. Is this the expected behavior of the algorithm? If so, do you think it should be changed so that the predicted labels respect the hierarchy and better reflect the biology?

Reference: https://zenodo.org/record/4546935