Closed justaddcoffee closed 4 years ago
looks very good.
aside: If they internally use the lowercase curie-prefix then that is what we would expect to find in the wild then at some point we should consider doing what they do too.
notes: It looks as this ingest has not seen many of the standard refactors most other have So thanks for fixing the extra stuff you spotted on the way
around line 233;
the global logging instance for this ingest is 'LOG.'
for log messages, pep depreciates .format()
strings in flavor of printf % syntax.
Does running make
as we do to test, regenerate Julie's alternative curie-prefix yaml file? If so please include it in this PR.
On the import typing... there is no point. shiny always wins.
notes: It looks as this ingest has not seen many of the standard refactors most other have So thanks for fixing the extra stuff you spotted on the way
around line 233; the global logging instance for this ingest is 'LOG.' for log messages, pep depreciates .format() strings in flavor of printf % syntax.
Does running make as we do to test, regenerate Julie's alternative curie-prefix yaml file? If so please include it in this PR.
On the import typing... there is no point. shiny always wins.
Does running make as we do to test, regenerate Julie's alternative curie-prefix yaml file? If so please include it in this PR.
All done, except the make/curie-prefix yaml thing - not sure what this, or how this relates to the BioGrid fix?
bottom of the Makefile https://github.com/monarch-initiative/dipper/blob/master/Makefile#L133
Part of Julies identifier work includes collecting various alternative curie-prefixes
so when a non canonical one is used (say in a search)
it can be translated to a canonical one.
She began with just a manual list but we augment it with any we find in the local translation tables.
You added a local translation for an alternative curie-prefix you found in the wild
"UNIPROT_PRO": "UniprotKB"
and it should be automatically added to the list by running make
then committing any regenerated files that show up under translationtable/generated/
That alternative might already be there from some previous contact in which case
nothing would happen.
just checked : https://github.com/monarch-initiative/dipper/blob/master/translationtable/generated/prefix_equivalents.yaml
The alternative you found should trigger an update
That Makefile
no longer runs on my machine, but when I manually run this command:
https://github.com/monarch-initiative/dipper/blob/master/Makefile#L133
no files change.
This command does actually change a file (prefix_equivalents.yaml
):
https://github.com/monarch-initiative/dipper/blob/master/Makefile#L140
so I've committed this change.
I would suspect it never really did, but it was never intended to and no one has the resources to address it, so the best I can do is assure you that if you ran on any other dipper developer machine, any monarch server, our Travis or Jenkins instances, it would/should.
for example when on tartini if I type
make -n
the botton of the result is
awk -F '"' '/^"[^"]+": "[^":]+".*/\
{if($2 != $4 && ! match($2, /[0-9]+/))\
print "\"" $4 "\"\t\"" $2 "\""}' \
translationtable/wormbase.yaml translationtable/mpd.yaml translationtable/monochrom.yaml translationtable/animalqtldb.yaml translationtable/flybase.yaml translationtable/string.yaml translationtable/orphanet.yaml translationtable/mmrrc.yaml translationtable/rgd.yaml translationtable/ensembl.yaml translationtable/bgee.yaml translationtable/zfinslim.yaml translationtable/mgislim.yaml translationtable/hgnc.yaml translationtable/impc.yaml translationtable/go.yaml translationtable/ebi_g2p.yaml translationtable/kegg.yaml translationtable/coriell.yaml translationtable/panther.yaml translationtable/clinvar.yaml translationtable/omia.yaml translationtable/gwascatalog.yaml translationtable/eom.yaml translationtable/mychem.yaml translationtable/hpoa.yaml translationtable/omim.yaml translationtable/mmrc.yaml translationtable/ctd.yaml translationtable/reactome.yaml translationtable/mgi.yaml translationtable/zfin.yaml translationtable/udp.yaml translationtable/genereviews.yaml translationtable/monarch.yaml translationtable/biogrid.yaml translationtable/alternate_curie_prefix.yaml translationtable/ucscbands.yaml translationtable/ncbigene.yaml translationtable/sgd.yaml translationtable/decipher.yaml | LC_ALL=C sort -u > /tmp/local_inverse.tab
echo "---\n# prefix_equivalents.yaml" > translationtable/generated/prefix_equivalents.yaml;
LC_ALL=C join translationtable/generated/curiemap_prefix.txt /tmp/local_inverse.tab|\
awk '{v=$1;$1="";print substr($0,2) ": " v}' | sort -u >> translationtable/generated/prefix_equivalents.yaml
which at least appears would generate
translationtable/generated/prefix_equivalents.yaml
Thanks for making the effort to generate and commit the file, it is greatly appreciated.
... especially since it brought in more than you changed ... which means something is amiss ... which means I better check it
So thanks again.
ahh, not a make/script problem. not missing new mappings, just existing mappings rearranged.
They will get corrected next time anyone else runs it.
However the alternative mapping is still not there.
The reason is the discrepancy between what is in curie_map.yaml
https://github.com/monarch-initiative/dipper/blob/master/dipper/curie_map.yaml#L231
and what has been added.
Okay, not sure I can help any more with this Biogrid fix. I'm going to close this PR, open it on Monarch's dipper repo, and you can push whatever fixes are necessary to the YAML files. Does that suit you?
addressing uniprotkb curie-prefix is minimal. leaving regenerating the prefix file is probably not worth your effort.
addressing uniprotkb curie-prefix is minimal. leaving regenerating the prefix file is probably not worth your effort.
Okay, push whatever changes you think are necessary to this or the other PR (#927)
Handing this off to you @TomConlin
Some interactors (col1 and col2 of
BIOGRID-ALL-3.5.184.mitab.txt
) are now biogrid CURIEs (e.g.biogrid:1221377
) instead of gene IDs (e.g.entrez gene/locuslink:3638344
):This PR changes the code to accept these biogrid CURIEs as gene IDs (and also breaks out the logic for this decision into a method, and tests the method in
tests/test_biogrid.py