bridgedb / datasources

Repository with the BridgeDb data source.
Creative Commons Zero v1.0 Universal

Curate remaining entries to the Bioregistry #29

Open cthoyt opened 2 years ago

cthoyt commented 2 years ago

After lots of careful curation, there are only four resources listed in this repository that I can't quite figure out:

| datasource_name | system_code | website_url | linkout_pattern | example_identifier | entity_identified | single_species | identifier_type | uri | regex | official_name | wikidata_property | bioregistry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gramene Arabidopsis | EnAt | http://www.gramene.org/ | http://www.gramene.org/Arabidopsis_thaliana/Gene/Summary?g=$id | ATMG01360-TAIR-G | gene | Arabidopsis thaliana | 1 | EnAt | `AT[\dCM]G\d{5}-TAIR-G` | Gramene Arabidopsis | nan | nan |
| Gramene Maize | EnZm | http://www.ensembl.org | http://www.maizesequence.org/Zea_mays/Gene/Summary?g=$id | GRMZM2G174107 | gene | nan | 1 | EnZm | nan | Gramene Maize | nan | nan |
| Gramene Rice | EnOj | http://www.gramene.org/ | http://www.gramene.org/Oryza_sativa/Gene/Summary?db=core;g=$id | osa-MIR171a | gene | nan | 1 | EnOj | nan | Gramene Rice | nan | nan |
| Rice Ensembl Gene | Os | http://www.gramene.org/Oryza_sativa | http://www.gramene.org/Oryza_sativa/geneview?gene=$id | LOC_Os04g54800 | gene | Oryza sativa | 1 | Os | nan | Rice Ensembl Gene | nan | nan |

Example URLs:
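For reference, the example URLs can be rebuilt by substituting each `example_identifier` for `$id` in its `linkout_pattern`. The following is only a minimal sketch under that assumption: the helper names `build_url` and `matches_regex` are hypothetical (they are not part of BridgeDb or the Bioregistry), the rows are copied by hand from the table above, and nothing here checks whether the resulting URLs still resolve.

```python
import re

# Rows copied by hand from the table above:
# (system_code, linkout_pattern, example_identifier, regex or None when the table says "nan")
ROWS = [
    ("EnAt", "http://www.gramene.org/Arabidopsis_thaliana/Gene/Summary?g=$id",
     "ATMG01360-TAIR-G", r"AT[\dCM]G\d{5}-TAIR-G"),
    ("EnZm", "http://www.maizesequence.org/Zea_mays/Gene/Summary?g=$id",
     "GRMZM2G174107", None),
    ("EnOj", "http://www.gramene.org/Oryza_sativa/Gene/Summary?db=core;g=$id",
     "osa-MIR171a", None),
    ("Os", "http://www.gramene.org/Oryza_sativa/geneview?gene=$id",
     "LOC_Os04g54800", None),
]


def build_url(linkout_pattern, identifier):
    """Hypothetical helper: substitute the identifier for the $id placeholder."""
    return linkout_pattern.replace("$id", identifier)


def matches_regex(regex, identifier):
    """Hypothetical helper: True/False for a full match, None if no regex is recorded."""
    if regex is None:
        return None
    return re.fullmatch(regex, identifier) is not None


for system_code, pattern, example, regex in ROWS:
    # Print the candidate example URL and whether the example fits the recorded regex.
    print(system_code, build_url(pattern, example), matches_regex(regex, example))
```

Only the EnAt row has a regex recorded, so it is the only one where the example identifier can be checked against a pattern; the other three would need a regex before a replacement example could be validated this way.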

So the question for the first two is: what should we call these in the Bioregistry? Should they really get their own prefixes, or is there a more general Gramene resolver for all of these IDs?

For the last two, can these be fixed? Maybe they just need a new example identifier that follows the same pattern.

cthoyt commented 2 years ago

@egonw these are the ones that aren’t complete

DeniseSl22 commented 2 years ago

@mkutmon @Finterly: could you please check whether these databases are used in any GPML?

DeniseSl22 commented 2 years ago

And @tabbassidaloii: could you check whether the databases above are part of our new BridgeDb mapping files?

tabbassidaloii commented 2 years ago

We have mapping files for Arabidopsis thaliana (At), Zea mays (Zm), Oryza sativa japonica (Oj), and Oryza sativa indica (Oi).

egonw commented 1 year ago

> We have mapping files for Arabidopsis thaliana (At), Zea mays (Zm), Oryza sativa japonica (Oj), and Oryza sativa indica (Oi).

@tabbassidaloii, but do we have mappings on those to Gramene?

Chris-Evelo commented 1 year ago

FWIW I think these pathways were created by the Gramene team at the time.

tabbassidaloii commented 1 year ago

> Gramene

@egonw No, we don't. Not sure if BioMart provides it.