brain-bican / taxonomy-development-tools

Tools to build and edit Cell Annotation Schema taxonomies.
Apache License 2.0

Annotation transfer user stories #156

Open dosumis opened 3 months ago

dosumis commented 3 months ago

As a biologist/bioinformatician, I want to view annotation transfer from another dataset/taxonomy at the cell or cell-set level in order to understand how it supports current annotation or how it might support changes to the current annotation of the taxonomy I am viewing/editing. To support this, users should be able to view:

CAS representation:

TDT functionality required:

dosumis commented 3 months ago

TBA: User stories around importing annotation transfer - from MapMyCells & from custom CSV.

@AvolaAmg @hkir-dev - please review

dosumis commented 3 months ago

CC @jeremymiller

AvolaAmg commented 3 months ago

Please see a potential user story below, with MapMyCells and a custom .csv as examples. Let me know how to improve the story. In the future, I can include screenshots of the output files and potentially a screenshot of how the annotation transfer would look in the TDT.

User story - biologist/bioinformatician who wants to visualise and exploit the annotation transfer on the taxonomy viewed in TDT

As a biologist/bioinformatician, I want to look at the annotation transfer to improve the taxonomy I am curating in the Taxonomy Development Tool (TDT). This might help me understand how the annotation transfer complements the annotation present in the taxonomy and how it supports my work of editing cell types/cell sets and adding new cell sets in the taxonomy I am viewing and editing. To assess how I can use the annotation transfer to edit the taxonomy, I should be able to:

1) Quantify the level of overlap between two cell sets, one provided by the annotation transfer and the other present in the taxonomy hierarchy. This means that I should be able to quantify how many cell types are part of both annotations. To assess this, I want to be able to visualise the extent of overlap using Jaccard scores or a confusion matrix, which should be provided by the TDT. The Jaccard score is already used to identify the overlap between two cell sets in the Annotation Comparison Shiny App.

2) View the extended annotations for the two cell sets (the ones found to overlap by the Jaccard score), that is, all the columns for those two cell sets in the annotation table. This would allow me to understand what the two cell sets refer to and to assess their overlap from a biological point of view.
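To make point 1 concrete, here is a minimal sketch of a pairwise Jaccard calculation. It assumes each cell set is represented as the set of cell IDs carrying a given label; the function names, labels, and toy data are hypothetical and are not part of TDT or the Annotation Comparison Shiny App.

```python
from collections import defaultdict


def jaccard(a: set, b: set) -> float:
    """Jaccard score |A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_cells(labels: dict) -> dict:
    """Invert a {cell_id: label} mapping into {label: set of cell_ids}."""
    groups = defaultdict(set)
    for cell_id, label in labels.items():
        groups[label].add(cell_id)
    return dict(groups)


def pairwise_jaccard(taxonomy: dict, transferred: dict) -> dict:
    """Score every (taxonomy cell set, transferred cell set) pair."""
    tax_sets = group_cells(taxonomy)
    tr_sets = group_cells(transferred)
    return {
        (t, r): jaccard(t_cells, r_cells)
        for t, t_cells in tax_sets.items()
        for r, r_cells in tr_sets.items()
    }


# Toy data (hypothetical): the same four cells labelled by two annotations.
taxonomy_labels = {"c1": "Astro", "c2": "Astro", "c3": "Oligo", "c4": "Oligo"}
transferred_labels = {"c1": "Astrocyte", "c2": "Astrocyte", "c3": "Astrocyte", "c4": "OPC"}

scores = pairwise_jaccard(taxonomy_labels, transferred_labels)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 2))
```

High-scoring pairs would be the natural candidates for the side-by-side view of extended annotations described in point 2.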

One example of an annotation transfer that could help me enrich the information in my taxonomy is the one obtained by mapping my dataset of interest with the MapMyCells platform. From TDT, I would export an AnnData file (.h5ad) and upload it to MapMyCells, using the hierarchical mapping algorithm. From the analysis, I would obtain a .csv output file; the results would be used to understand how closely the cells in the TDT taxonomy correspond to those from MapMyCells. Another example that could be used to enrich the taxonomy and understand its cell hierarchy is a custom annotation transfer from a pre-analysed .csv file (e.g. in the case of the NHP basal ganglia taxonomy, the AIT115_human_BGplus_mapping_results). I could use the TDT to load the annotation transfer files and quantify/analyse how the annotations correspond to those of my taxonomy in order to build an appropriate hierarchy.
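As a rough sketch of the import step for either source, the snippet below reads a transfer .csv and counts how taxonomy labels co-occur with transferred labels (the raw counts behind a confusion matrix). The column names `cell_id` and `transferred_label` are assumptions for illustration only; real MapMyCells output or a custom file will likely use different column names, so they are passed as parameters. Skipping lines that start with `#` is also an assumption about how metadata lines might be marked.

```python
import csv
import io
from collections import Counter


def read_transfer_csv(handle, id_col="cell_id", label_col="transferred_label"):
    """Read {cell_id: transferred label} from a CSV handle.

    Column names are hypothetical defaults; lines starting with '#'
    are treated as metadata and skipped (an assumption).
    """
    rows = (line for line in handle if not line.startswith("#"))
    return {row[id_col]: row[label_col] for row in csv.DictReader(rows)}


def confusion_counts(taxonomy: dict, transferred: dict) -> Counter:
    """Count (taxonomy label, transferred label) pairs over the shared cells."""
    shared = taxonomy.keys() & transferred.keys()
    return Counter((taxonomy[c], transferred[c]) for c in shared)


# Toy input standing in for an uploaded file (hypothetical content).
example_csv = io.StringIO(
    "# metadata line, skipped\n"
    "cell_id,transferred_label\n"
    "c1,Astrocyte\n"
    "c2,Astrocyte\n"
    "c3,OPC\n"
)
taxonomy_labels = {"c1": "Astro", "c2": "Astro", "c3": "Oligo", "c4": "Oligo"}

transferred = read_transfer_csv(example_csv)
counts = confusion_counts(taxonomy_labels, transferred)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Cells present in only one of the two annotations (here `c4`) are left out of the counts; whether TDT should instead report them as unmapped is an open design question.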

jeremymiller commented 3 months ago

I think this thread captures the various use cases surrounding annotation transfer quite nicely:

Let me know if you want anything else from me on this thread.

dosumis commented 3 months ago

Parking a related idea here before I forget it:

Annotation transfer labelsets should be optional imports - declared with an IRI and stored in the repo of the taxonomy to which annotations have been transferred. We can use a simple formula to roll IRIs and resolve without declaration. @hkir-dev - does this make sense to you?