DiseaseOntology / HumanDiseaseOntology

Repository for the Human Disease Ontology.
Creative Commons Zero v1.0 Universal

Import Makefile #801

Closed melodyswen closed 4 years ago

melodyswen commented 4 years ago

We updated ncbitaxon_terms.txt to add the new coronavirus, but after running 'make ncbitaxon', the term doesn't seem to be populating in the .owl file. We noticed in the Makefile that the source for ncbitaxon, among other ontologies, is pulled from this link (OBO = http://purl.obolibrary.org/obo/), which doesn't seem to be valid.

beckyjackson commented 4 years ago

I'm running the ncbitaxon build right now, so I'll see what happens when that's complete.

That is the correct prefix for the OBO Foundry ontologies, e.g. http://purl.obolibrary.org/obo/doid.owl, so if they aren't being loaded from there, there may be something wrong with the server. As far as I can tell right now, though, it's properly redirecting to the ontology files. If something isn't loading for you, could you give me the specific ontology IRI?

melodyswen commented 4 years ago

Hi Becky,

Yes, let me know! Lynn pushed an updated ncbitaxon_terms.txt file with the new term.

beckyjackson commented 4 years ago

Two things:

  1. NCBITaxon is too big to load into memory with the current memory allocations. I upped the memory allocations for running ROBOT in the imports Makefile. *
  2. The ID for coronavirus (NCBITaxon:227859) does not show up in the NCBITaxon file at http://purl.obolibrary.org/obo/ncbitaxon.owl

Here's the entry from Ontobee, which shows that while it uses the NCBITaxon namespace, it is not present in the NCBITaxon file. [image: Ontobee entry]

And if you go to the IRI, http://purl.obolibrary.org/obo/NCBITaxon_227859, it cannot be found. It looks like there are a variety of coronaviruses in NCBITaxon, though, if you search the browser at https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi
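The relationship between the OBO prefix, an ontology file, and a term IRI like the one above can be sketched in Python. This is only an illustration: the helper names `curie_to_iri` and `ontology_iri` are hypothetical and are not part of the DO Makefile or ROBOT. It assumes the convention visible in the IRIs in this thread, where the colon in an ID becomes an underscore after the base prefix.

```python
OBO = "http://purl.obolibrary.org/obo/"  # same prefix as in the imports Makefile


def curie_to_iri(curie: str) -> str:
    """Turn an ID such as 'NCBITaxon:227859' into its OBO PURL (hypothetical helper)."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    return f"{OBO}{prefix}_{local_id}"


def ontology_iri(name: str) -> str:
    """Build the ontology file IRI, e.g. 'doid' gives .../obo/doid.owl (hypothetical helper)."""
    return f"{OBO}{name.lower()}.owl"


if __name__ == "__main__":
    print(curie_to_iri("NCBITaxon:227859"))
    print(ontology_iri("ncbitaxon"))
```

A term IRI that builds correctly can still fail to resolve, as here, when the term is not in the released file.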

@lschriml - how would you like to proceed? I could manually add this term into the import, but I'd really like to know more about where it came from... Did another group make it, or was it originally in NCBITaxon but got merged into something else? I'll look into it a bit.

* We are also working on a way to better handle big imports, which I'll implement for DO as soon as it's merged in.

beckyjackson commented 4 years ago

It looks like it's been merged into NCBITaxon:694009 - Severe acute respiratory syndrome-related coronavirus

@lschriml I will update this to use this ID instead.
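One way to look up a merge like this, sketched under assumptions: the NCBI Taxonomy dump ships a `merged.dmp` file whose rows pair an old taxon ID with the ID it was merged into, with fields separated by `|`. The thread does not say how the merge above was actually found, and the function names and the sample row below are illustrative (the row is built from the two IDs mentioned here, not copied from the dump).

```python
from typing import Dict, Iterable


def parse_merged(lines: Iterable[str]) -> Dict[str, str]:
    """Map each retired taxon ID to the ID it was merged into."""
    merged: Dict[str, str] = {}
    for line in lines:
        # Rows look like "old_id<TAB>|<TAB>new_id<TAB>|"
        fields = [field.strip() for field in line.split("|")]
        if len(fields) >= 2 and fields[0] and fields[1]:
            merged[fields[0]] = fields[1]
    return merged


def resolve(tax_id: str, merged: Dict[str, str]) -> str:
    """Follow merges until an ID with no further merge is reached."""
    seen = set()
    while tax_id in merged and tax_id not in seen:
        seen.add(tax_id)  # guard against accidental cycles
        tax_id = merged[tax_id]
    return tax_id


# Illustrative row using the IDs from this thread.
SAMPLE = ["227859\t|\t694009\t|\n"]

if __name__ == "__main__":
    print("NCBITaxon:" + resolve("227859", parse_merged(SAMPLE)))
```

With a real dump, the same functions could be fed `open("merged.dmp")` instead of `SAMPLE`.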

melodyswen commented 4 years ago

Hi Becky,

Sorry if I wasn't clear: we specifically wanted this term, to reflect the 2019-nCoV coronavirus. https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=2697049

beckyjackson commented 4 years ago

That's in there as well :)

EDIT: apologies, spoke too soon. Let me look into this.

beckyjackson commented 4 years ago

It looks like while that ID is active on the NCBI Taxonomy browser, it is not yet included in the NCBITaxon release. Here's the Ontobee entry: [image: Ontobee entry]

I will manually add this in for now.
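To catch gaps like this before running the import, one could compare the requested terms against the IDs declared in the downloaded release. The sketch below makes two assumptions that are not confirmed in this thread: that ncbitaxon_terms.txt holds one ID per line in the form `NCBITaxon:NNN`, optionally followed by a `#` comment, and that the release is RDF/XML with `rdf:about` attributes. The function names are hypothetical.

```python
import re
from typing import Iterable, List, Set

# Matches term declarations such as rdf:about=".../obo/NCBITaxon_694009"
IRI_RE = re.compile(
    r'rdf:about="http://purl\.obolibrary\.org/obo/(NCBITaxon)_(\d+)"'
)


def declared_ids(owl_lines: Iterable[str]) -> Set[str]:
    """Collect every NCBITaxon ID declared in the file, one line at a time."""
    ids: Set[str] = set()
    for line in owl_lines:
        for prefix, number in IRI_RE.findall(line):
            ids.add(f"{prefix}:{number}")
    return ids


def missing_terms(term_lines: Iterable[str], owl_lines: Iterable[str]) -> List[str]:
    """Return requested IDs that are absent from the ontology file."""
    present = declared_ids(owl_lines)
    missing: List[str] = []
    for line in term_lines:
        term = line.split("#", 1)[0].strip()  # drop trailing comment
        if term and term not in present:
            missing.append(term)
    return missing
```

Because both arguments are plain iterables of lines, open file handles can be passed directly, so the large NCBITaxon file is scanned without loading it all into memory.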

lschriml commented 4 years ago

Hello @beckyjackson, @melodyswen. How should we proceed so that the DO taxonomy import reflects the most up-to-date name for this organism: NCBITaxon:2697049, Severe acute respiratory syndrome coronavirus 2?

We have had a request for this update (ticket #839).

To solve this, can we edit our local import for an expedient fix?

If that is possible, I would like to run another release with the fix, as this is a needed ID/name.

Cheers, Lynn

lschriml commented 4 years ago

Closing the ticket. We devised a workaround, as the Taxonomy file does not include SARS2. Lynn