monarch-initiative / dipper

Data Ingestion Pipeline for Monarch
https://dipper.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

OrphaNumber now OrphaCode? #966

Closed by drseb 4 years ago

drseb commented 4 years ago

https://github.com/monarch-initiative/dipper/blob/f7a11c8649a5f37dc8d68fb7c95ef811061b144c/resources/orphanet/en_product6_201811.gv#L25

drseb commented 4 years ago

I mean this one:

https://github.com/monarch-initiative/dipper/blob/f7a11c8649a5f37dc8d68fb7c95ef811061b144c/resources/orphanet/en_product6_202005.gv#L49

kshefchek commented 4 years ago

It looks like we're already using OrphaCode in the ingest, so we likely just need to update the gv file. See https://github.com/monarch-initiative/dipper/blob/f7a11c8649a5f37dc8d68fb7c95ef811061b144c/dipper/sources/Orphanet.py#L94
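For anyone with a parser that still expects the old tag, here is a minimal sketch of accepting either name. It assumes the identifier is a direct child of each Disorder element. The sample XML and the helper `get_orpha_id` are hypothetical and are not dipper's code:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real Orphanet product files are larger and may differ.
SAMPLE = """<JDBOR>
  <DisorderList>
    <Disorder id="1"><OrphaNumber>558</OrphaNumber></Disorder>
    <Disorder id="2"><OrphaCode>166024</OrphaCode></Disorder>
  </DisorderList>
</JDBOR>"""


def get_orpha_id(disorder):
    """Return the Orphanet identifier of a Disorder element, or None."""
    # Try the newer tag name first, then fall back to the older one.
    for tag in ("OrphaCode", "OrphaNumber"):
        text = disorder.findtext(tag)
        if text is not None:
            return text.strip()
    return None


root = ET.fromstring(SAMPLE)
orpha_ids = [get_orpha_id(d) for d in root.iter("Disorder")]
```

Trying both names keeps the same script usable on files from before and after the rename.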

drseb commented 4 years ago

ok. all good then

TomConlin commented 4 years ago

https://github.com/monarch-initiative/dipper/pull/947

they made that change a month or so after their main rewrite

drseb commented 4 years ago

For en_product4: Not sure how your ingest works, but in my SAX-parser-based scripts, the new "name" elements in DisorderType and DisorderGroup were causing trouble. Also, the HPOFrequency element is now different.
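One way a SAX handler can avoid picking up these nested names is to track the stack of open elements and only accept a name whose parent is the Disorder itself. This is a rough sketch, not my actual script. The sample XML, the `Name` tag spelling, and the class `DisorderNameHandler` are assumptions for illustration:

```python
import xml.sax

# Hypothetical sample; element layout is assumed, not copied from Orphanet.
SAMPLE = b"""<Disorder id="1">
  <OrphaCode>558</OrphaCode>
  <Name lang="en">Example disorder</Name>
  <DisorderType id="2"><Name lang="en">Disease</Name></DisorderType>
  <DisorderGroup id="3"><Name lang="en">Disorder</Name></DisorderGroup>
</Disorder>"""


class DisorderNameHandler(xml.sax.ContentHandler):
    """Collect only the Name elements that are direct children of Disorder."""

    def __init__(self):
        super().__init__()
        self.stack = []    # names of the currently open elements
        self.buffer = []   # text chunks of the innermost open element
        self.names = []    # collected disorder names

    def startElement(self, name, attrs):
        self.stack.append(name)
        self.buffer = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate them.
        self.buffer.append(content)

    def endElement(self, name):
        # Accept a Name only when its parent element is Disorder.
        if self.stack[-2:] == ["Disorder", "Name"]:
            self.names.append("".join(self.buffer).strip())
        self.stack.pop()
        self.buffer = []


handler = DisorderNameHandler()
xml.sax.parseString(SAMPLE, handler)
disorder_names = handler.names
```

A handler that matches on the tag name alone would also collect "Disease" and "Disorder" here, which is the kind of trouble described above.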

TomConlin commented 4 years ago

Should dipper be looking at product4 in addition to product6?

drseb commented 4 years ago

I think dipper uses data produced by phenol or something - in my tests the ingestion of orphanet data by these tools did not work. So check everything when the new release of HPO data is made (this week). I'll let you know if I get some information on what is happening.

pnrobinson commented 4 years ago

Indeed the new version of phenol has changes and HpoAnnotQc (which produces phenotype.hpoa) seems to work with the new Orphanet files.

drseb commented 4 years ago

@pnrobinson please see the other email threads: it is not working and needs fixing

pnrobinson commented 4 years ago

What I am trying to say is that dipper should use the HPO phenotype.hpoa to get this information if possible -- we should not have multiple sources of truth. The latter is working!

drseb commented 4 years ago

Not sure I understand your point. My message is: phenotype.hpoa is not working at the moment.

pnrobinson commented 4 years ago

Sorry, then I do not understand -- what is not working? I have not yet updated the phenotype.hpoa online, but the old version was fine (AFAIK) and the new versions are created correctly. If there is an issue, please report it on the HPO or phenol tracker!