Open dosumis opened 5 months ago
CC @jeremymiller @UCDNJJ - here's the draft plan I came up with for work we need to co-ordinate on for BG and other CAS instances.
This plan sounds great. A few thoughts:
- Re: "Could add in CellGuide LLM descriptions if available" --> I wonder if we could do this using Lydia's knowledge base which (if I remember correctly?) is already trained on basal ganglia papers. In either case, I think it's worth testing this.
CellGuide has these so we can pull them into the comparison spreadsheet anyway. I'm hoping that we can feed the final text back to CellGuide. Also makes sense to experiment with Lydia's LLM thing (I think Nelson already has this in mind).
- Re: "It would help to have some small-ish test h5ad file + taxonomy to test on" --> We plan to have several public taxonomies in AIT format available in the next couple of weeks. This will give you quite a few files to choose from. Alternatively, we could share a subset of a full BG file (e.g., the whole taxonomy with the data removed, or a taxonomy subsetted to 5 cells per cluster).
Sounds good
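For the "5 cells per cluster" option mentioned above, here is a minimal sketch of how the cell selection could work. It assumes the taxonomy can be exported as a plain cell ID to cluster mapping; the function name `subsample_per_cluster` and its parameters are hypothetical and not part of any existing AIT, CAS or TDT tooling.

```python
import random
from collections import defaultdict


def subsample_per_cluster(cell_to_cluster, n_per_cluster=5, seed=0):
    """Pick at most n_per_cluster cell IDs from each cluster.

    cell_to_cluster: dict mapping cell ID -> cluster label (assumed input shape).
    Returns a sorted list of the selected cell IDs.
    """
    # Group cell IDs by their cluster label.
    by_cluster = defaultdict(list)
    for cell_id, cluster in cell_to_cluster.items():
        by_cluster[cluster].append(cell_id)

    # A seeded generator keeps the test file reproducible between runs.
    rng = random.Random(seed)
    keep = []
    for cluster in sorted(by_cluster):
        cells = sorted(by_cluster[cluster])
        # Clusters with fewer than n_per_cluster cells are kept whole.
        k = min(n_per_cluster, len(cells))
        keep.extend(rng.sample(cells, k))
    return sorted(keep)
```

The returned IDs could then be used to subset the h5ad file (for example by indexing an AnnData object on its obs names), which should keep every cluster represented while shrinking the data considerably.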
- Re: "Taxonomy evolution and TDT" --> I would recommend not working on this yet. I feel we are not at a place yet in this discussion where we are ready to build. +1 to a dedicated meeting about this.
Agreed. Just thought important to have a meeting on this in the Roadmap. I don't want this to hit us unawares and new data is coming.
- Re: "Generating CAS taxonomies" --> I'd recommend focusing effort on future/current taxonomies and not spending time on existing ones, although we could prioritize a few key ones if desired.
This is all about making sure we have taxonomies for everything currently being used in annotation transfer. There was lots of mention of this in talks at the meeting. The aim is to ensure we can formally record annotation transfer on the cell level or as evidence on the cell set level - using cell set IDs from CAS taxonomies.
The SEA-AD taxonomy would be worth including. I'm happy to discuss this one (it has ONE taxonomy of identical clusters, but applied to TWO brain regions)
As we care about its use in annotation transfer - it seems reasonable to have both unless some combined dataset is also being used for annotation transfer, in which case we should have that one.
Thanks for putting together this summary David and I agree with Jeremy's points. Although I will be evolving the NHP BG taxonomy, as we discussed, and can share a new cell x cluster map for y'all to look at in the next few weeks once we've validated the approach.
I'll get the new NHP BG taxonomy sheet shared tomorrow and if you can include the CellGuide descriptions that would be fun to look at. I need to talk with Lydia more about how we could use her chatbot but that is a route I would like to take in general.
Co-ordination with Nelson/Jeremy
Cell type ontology descriptions
Data formats, transformation and workflows
Anndata updates from TDT
Taxonomy evolution and TDT:
From discussion with Nelson - he and others will build on established BG taxonomies, adding new cells. As they add new cells, these will be merged into existing clusters where justified, but made into new clusters where not. (Presumably these will live under existing superclasses)
Nelson would prefer to treat these updated taxonomies as taxonomy versions rather than new taxonomies. This will happen iteratively (~ every 2 months?). We will therefore need to manage in TDT:
- Changes to cluster membership
- Addition of new clusters (will these initially sit outside the taxonomy?)
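To make the versioning question more concrete, here is a rough sketch of the kind of diff we might want between two taxonomy versions. It assumes each version can be reduced to a cell ID to cluster mapping; `diff_taxonomy_versions` and the keys of its result are hypothetical names for illustration, not an existing TDT feature.

```python
def diff_taxonomy_versions(old, new):
    """Summarise changes between two cell ID -> cluster mappings.

    old, new: dicts mapping cell ID -> cluster label (assumed input shape).
    """
    old_clusters = set(old.values())
    new_clusters = set(new.values())
    return {
        # Clusters that only exist in the newer version.
        "new_clusters": sorted(new_clusters - old_clusters),
        # Clusters that no longer have any cells in the newer version.
        "removed_clusters": sorted(old_clusters - new_clusters),
        # Cells added since the previous version.
        "added_cells": sorted(c for c in new if c not in old),
        # Cells dropped since the previous version.
        "removed_cells": sorted(c for c in old if c not in new),
        # Cells present in both versions whose cluster changed: (old, new).
        "moved_cells": {
            c: (old[c], new[c]) for c in old if c in new and old[c] != new[c]
        },
    }


if __name__ == "__main__":
    v1 = {"cell1": "CS1", "cell2": "CS1", "cell3": "CS2"}
    v2 = {"cell1": "CS1", "cell2": "CS2", "cell4": "CS3"}
    print(diff_taxonomy_versions(v1, v2))
```

A summary like this could help decide, per release, whether a change is a membership update to an existing cluster or a genuinely new cluster that needs placing in the hierarchy.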
Generating CAS taxonomies
It's on the ABC atlas
ABC atlas data catalog
CxG has:
Looks like this = 2 anndata files for different regions vs one integrated one on ABC atlas. Make 2 CAS for these anyway?