Hi @mshadbolt and @willrockout ,
Not all UBERON terms are automatically imported into HCAO. The current makefile is designed to import UBERON terms that a) have a cross-reference to an FMA term, or b) are manually added to a list of terms to be imported, or c) are tagged with the new subset property "added_for_HCA" (this last is used when I create a term in Uberon, CL or EFO that was requested by HCA curators). UBERON:0007795 'ascitic fluid' doesn't fall in any of the 3 categories unfortunately. (The 3 relevant files containing current lists for a), b) and c) can be found at the bottom of this page: https://github.com/ebi-ait/ontology/tree/master/src/ontology).
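As a rough illustration of the three import criteria described above, here is a minimal Python sketch. It is not the actual makefile logic: the `Term` class, the `should_import` function and the way cross-references and subsets are stored are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical, simplified view of an ontology term."""
    curie: str
    label: str
    xrefs: list[str] = field(default_factory=list)    # e.g. "FMA:12345"
    subsets: list[str] = field(default_factory=list)  # e.g. "added_for_HCA"


def should_import(term: Term, manual_import_ids: set[str]) -> bool:
    """Return True if the term meets any of the three criteria."""
    has_fma_xref = any(x.startswith("FMA:") for x in term.xrefs)  # (a)
    on_manual_list = term.curie in manual_import_ids              # (b)
    tagged_for_hca = "added_for_HCA" in term.subsets              # (c)
    return has_fma_xref or on_manual_list or tagged_for_hca


# 'ascitic fluid' meets none of the criteria, so it is modelled here
# with no FMA xref and no added_for_HCA tag.
ascitic_fluid = Term("UBERON:0007795", "ascitic fluid")
print(should_import(ascitic_fluid, set()))               # False
print(should_import(ascitic_fluid, {"UBERON:0007795"}))  # True
```

The second call mimics adding the ID to a manual import list, which is the fix proposed in this comment.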
We'll be happy to add UBERON:0007795 'ascitic fluid' to uberon_manual_import.txt, but we just released HCAO earlier today and the next release is scheduled for April 19th. If you need the term sooner, @zoependlington kindly offered to run a patch release. Please let us know what you prefer.
Thanks, Paola (and Zoe)
Hi @paolaroncaglia,
Thank you for getting back to us so quickly. If we can get @zoependlington to do a patch release, that would be very much appreciated. Do you know the rough timeframe for that change to go into effect?
Thanks, Will
Hi @willrockout ,
Thanks for letting me know. I'll check with @zoependlington tomorrow and will let you know.
Best wishes, Paola
Hi again, I thought I would hijack this thread for my own purposes: what are the rules around CL imports? I'm trying to work out why T follicular helper cell (https://www.ebi.ac.uk/ols/ontologies/cl/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCL_0002038) isn't in our ontology either.
It seems pretty weird that the T cell hierarchy is different between the two: EBI OLS vs. HCAO OLS.
Hi @willrockout and @mshadbolt ,
The CL import into HCAO works similarly to Uberon given the current makefile. Not all CL terms are imported into HCAO; only CL terms that have an FMA cross-reference get imported automatically, and those that don’t have FMA x-refs don’t get picked up and need to be added manually or with the added_for_HCA tag. T follicular helper cell doesn't have an FMA xref. I suspect that there will be many more CL terms that you'll need and that don't have an FMA xref. Same for Uberon terms. If that's the case, Zoe suggests holding off on a patch release until next week, so that we can collect all of the terms needed and make sure they get imported into HCAO. Otherwise, she might be able to get you a patch release by Monday (could be Tuesday your time, Marion).
We'll wait to hear from you, Thanks, Paola and Zoe
Ok thanks @paolaroncaglia
It seems like we need a better way to automatically determine what terms get imported into HCAO as it isn't really meeting our needs at the moment and the lead time to request imports and then wait for release will slow down our submission processes.
Would there be any harm in importing the entire CL ontology? Or will it bloat the HCAO too much?
@willrockout are you aware of any other examples where the UBERON term you wanted to use wasn't in HCAO?
@mshadbolt Zoe and I reviewed and discussed the options that we think are available to you for CL (and possibly Uberon) terms. Unfortunately I have another meeting now, so I'll have to get back to you tomorrow with more details; but rest assured, we're giving this thoughtful consideration, and are confident that at least one of the options we'll suggest will meet your needs. Best, Paola
@mshadbolt (cc @dosumis for his information)
Here's a summary of my discussion with @zoependlington .
Zoe is drafting a document that illustrates the current HCA pipeline and explains which terms currently do or don't get imported into HCAO. It covers what we explained in the comments above, but for all imports, and it could be made available to all HCA wranglers once completed.
Importing the whole of the CL ontology into an application ontology like HCAO kind of defeats the purpose of an application ontology. It'd be better to refer to CL separately, the same way as I understand you refer to Mondo and EDAM separately, in their entirety. Also consider that, if we imported all ~2,500 CL terms into HCAO, they would probably come with all the externally imported terms that they link to (Uberon, GO...), which brings the total to ~10,000 terms.
Irrespective of where you'd store CL or Uberon terms, when you need a term that exists elsewhere but is not in HCAO yet, there's always the option of pre-filling curation fields with IDs. The validation script may be modified so it throws a warning rather than an error - it'll flag that you need to enter a "live" term, but it won't fail. I remember that some Gene Ontology curation tools allowed this.
HCAO is released regularly every month now, and I prioritize HCA requests. So e.g. EFO terms that you request for assays, mouse strains etc. would be available within max 4 weeks of your request, as long as tickets are opened before the 15th of each month. For CL and Uberon terms, if you referred to the source ontologies directly, all terms would be available to you immediately, unless they don't exist yet but those seem to be a minority.
Please let us know if you'd prefer to discuss this further, so we can find the best solution(s) for HCA. Thanks, Paola (and Zoe)
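The "warning rather than error" idea from the summary above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `check_term`, `CURIE_SHAPE` and the example set of HCAO IDs are hypothetical and are not part of the HCA validator.

```python
import re

# Assumption: CL and Uberon IDs look like PREFIX:7digits.
CURIE_SHAPE = re.compile(r"^(CL|UBERON):\d{7}$")


def check_term(curie: str, hcao_ids: set[str]) -> tuple[str, str]:
    """Classify a curated ID as 'ok', 'warning' or 'error'."""
    if curie in hcao_ids:
        return ("ok", "")
    if CURIE_SHAPE.match(curie):
        # Well-formed ID that HCAO does not contain yet: flag it, don't fail.
        return ("warning", f"{curie} is not in HCAO yet; please request an import")
    return ("error", f"{curie} is not a recognised CL or Uberon ID")


# Hypothetical snapshot of the IDs currently in HCAO.
hcao_ids = {"CL:0000000"}
print(check_term("CL:0000000", hcao_ids)[0])      # ok
print(check_term("UBERON:0007795", hcao_ids)[0])  # warning
print(check_term("ascitic fluid", hcao_ids)[0])   # error
```

Note that a well-formed ID only shows the right shape; this sketch does not confirm that the term really exists in CL or Uberon, which would need a lookup against the source ontology.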
@paolaroncaglia @zoependlington
Could you also push CL:0002410 (pancreatic stellate cell) to HCAO?
@pnejad Sure, I made a note in our (Zoe's and mine) agenda for the next HCAO release, currently scheduled for April 19th (i.e. 2 working days after the EFO release, to bring in any requested new EFO term).
Thank you @paolaroncaglia!
I have another one we need for HCAO (CL:0005006 - ionocyte). Should I keep adding them to this ticket?
Hi @pnejad ,
Yes that's fine, you can add terms that need importing to this ticket. I already made a note for CL:0005006 - ionocyte, same as above. Best,
Paola
Irrespective of where you'd store CL or Uberon terms, when you need a term that exists elsewhere but is not in HCAO yet, there's always the option of pre-filling curation fields with IDs. The validation script may be modified so it throws a warning rather than an error - it'll flag that you need to enter a "live" term, but it won't fail. I remember that some Gene Ontology curation tools allowed this.
I like that idea.
Thanks again to @zoependlington who brought it up first :-)
I don't really see how that would work.
We validate against the HCAO (https://ontology.archive.data.humancellatlas.org/index), so as far as our validation script is concerned, no ontology terms exist outside of it. The JSON validator is also a generic tool that is used by others, so we don't really have control over when it throws warnings rather than validation errors.
The other option that was suggested by Will was to validate against the original ontologies directly, i.e. MONDO, EFO, UBERON etc. But then what is the point of having HCAO at all?
Having warnings seems like a slippery slope to having a bunch of terms that aren't in HCAO at all, and maybe not even in the ontology we reference (i.e. CL,UBERON etc.), how would we ensure the terms get added eventually? Also how do we ensure others using our standard know that they have to request terms when there are warnings and not just ignore them since their submission will be 'valid'?
I think it's worth having a meeting to look into your OLS build (likely needs updating) and the validations that work from it. These things are easily configured. Maybe some time in early May?
Hi @mshadbolt ,
@dosumis , @zoependlington and I discussed a few options. Here’s a summary with action items:
[x] In the short term: import all of CL into HCAO for the April 19th HCAO release. Zoe will try to assess before the release if the size of CL will cause issues. I will try to compile a list of CL terms that don’t apply to mammals/humans, so that if the size of the entire CL causes issues for HCAO import, we can exclude some terms and see if that works better.
[x] In the medium/long term: use taxon slims of Uberon and CL in HCAO, i.e. subsets of those ontologies where the aspect “does this term apply to human or not” is encoded via taxonomic annotation properties. There was a plan to create taxon slims of Uberon and CL already, but it didn’t progress lately, though things are moving again now. To firm up a concerted, broader strategy,
[x] I will set up a separate meeting for taxon restrictions with Uberon and CL developers and representatives from GO (this meeting would not involve HCA wranglers). Quick note for self: this ticket and this ticket could be references, and this more recent one, and there's a whole GitHub project board on improving taxon constraints here. [Update 15/4/21, I started with https://github.com/obophenotype/uberon/issues/1824#issuecomment-820527965] [Update 19/4/21, this is now superseded by https://github.com/HumanCellAtlas/ontology/issues/80; and I'll meet Nicole Vasilevsky next week to discuss and assign tasks.]
[x] Independent of taxon work: set up a meeting in early May (from May 10th onwards) to look into HCA OLS build, update it if necessary, address/configure the validations that work from it if applicable. I’ll contact Marion via Slack to work out dates and people involved. Invite Parisa, Ray, Rachel, other EBI wranglers - ask to extend the invite - open to developers too. [Update 19/4/21: we've now arranged for Henriette to attend the next monthly HCA/SCA/CCF curators call, to start discussion and plan more focused meetings as necessary.]
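To make the taxon slim idea in the action items above more concrete, here is a minimal Python sketch of filtering terms by taxon constraints. It assumes each term carries optional `only_in_taxon` and `never_in_taxon` lists, and it uses a heavily abbreviated human lineage. The term IDs are placeholders, and none of this reflects how the Uberon or CL slims are actually built.

```python
# Abbreviated lineage for illustration only:
# Homo sapiens, Mammalia, Vertebrata.
HUMAN_LINEAGE = {"NCBITaxon:9606", "NCBITaxon:40674", "NCBITaxon:7742"}


def applies_to_human(only_in_taxon=(), never_in_taxon=()) -> bool:
    """Return True unless a taxon constraint rules the term out for human."""
    # A 'never in' constraint on any taxon in the human lineage excludes it.
    if any(taxon in HUMAN_LINEAGE for taxon in never_in_taxon):
        return False
    # Every 'only in' constraint must name a taxon that contains human.
    return all(taxon in HUMAN_LINEAGE for taxon in only_in_taxon)


def human_slim(terms: dict) -> list:
    """Keep the IDs of terms whose constraints are compatible with human."""
    return [
        term_id
        for term_id, constraints in terms.items()
        if applies_to_human(
            constraints.get("only_in_taxon", ()),
            constraints.get("never_in_taxon", ()),
        )
    ]


# Placeholder terms (EX: IDs are hypothetical).
terms = {
    "EX:0000001": {},                                       # unconstrained
    "EX:0000002": {"only_in_taxon": ["NCBITaxon:40674"]},   # mammals only
    "EX:0000003": {"only_in_taxon": ["NCBITaxon:8782"]},    # birds only
    "EX:0000004": {"never_in_taxon": ["NCBITaxon:40674"]},  # never in mammals
}
print(human_slim(terms))  # ['EX:0000001', 'EX:0000002']
```

Terms without any taxon annotation are kept, which is a design choice of this sketch: missing annotation is treated as "no known restriction" rather than "not human".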
Also, to clarify, in reply to “Having warnings seems like a slippery slope to having a bunch of terms that aren't in HCAO at all, and maybe not even in the ontology we reference (i.e. CL,UBERON etc.), how would we ensure the terms get added eventually?”. We never meant to suggest that you reference non-existing classes, but rather terms that already exist in CL and Uberon and therefore have a valid and stable ID. Other concerns remain valid, of course, and may be addressed at the May meeting as necessary.
Thanks,
Paola
@paolaroncaglia Could you please add lupus erythematosus (MONDO:0004670) to HCAO?
Hi @pnejad ,
My understanding is that HCA looks up the entire Mondo ontology. In fact, using the link I have for the HCAO OLS production instance, I can find lupus erythematosus (MONDO:0004670) as expected. Please let me know if I'm missing something, but I thought that all Mondo terms were available to you?
Thanks, Paola
(Update April 14th: at today's curators meeting we resolved this issue. Parisa had been looking at the public OLS rather than the HCA production instance. The former only shows the application ontology, so no Mondo or EDAM or HANCESTRO.)
Ticket summary/notes for self: after next HCAO release,
Hi @mshadbolt , @willrockout and @pnejad ,
FYI, @zoependlington made a new release of the HCA ontology today. It will be available in the HCA OLS production instance when this ticket is addressed. The terms that you requested in this ticket will be visible there then:
UBERON:0007795 ascitic fluid
CL:0002038 T follicular helper cell
CL:0002410 pancreatic stellate cell
CL:0005006 ionocyte
In fact, all CL terms have been imported in HCAO temporarily while we wait for a human-focused slim version of CL to be available (will take some time). So from now on all CL terms should be immediately available to you. Uberon terms, instead, should still be requested for ad-hoc import, as Uberon is too large to be imported in HCAO. We are, however, actively working on making a human-focused slim version of Uberon available.
If you have any issues or questions, please let us know.
Thanks, Paola (and Zoe)
Thanks for all your hard work on this @paolaroncaglia and @zoependlington !
One of our wranglers, @willrockout, is trying to use the term UBERON:0007795 in our metadata, but even though it exists in the EBI OLS version of UBERON, it can't be found in the HCA OLS. Is this because it is a new term and it will be in the next release, or is there some other reason why this term hasn't been imported into HCAO?
thanks!