sorgerlab / indra

INDRA (Integrated Network and Dynamical Reasoning Assembler) is an automated model assembly system interfacing with NLP systems and databases to collect knowledge, and through a process of assembly, produce causal graphs and dynamical models.
http://indra.bio
BSD 2-Clause "Simplified" License

indra literature id_lookup bug #1403

Closed Crispae closed 4 months ago

Crispae commented 1 year ago

I was checking INDRA's literature id_lookup functionality. When I look up a paper by DOI, it doesn't return a PMID or PMCID. But when I use the PMID of the same paper, it returns the same DOI. I guess the reverse mapping is not available.

Crispae commented 1 year ago

This NCBI endpoint returns the PMID for a DOI and could be used to implement the reverse lookup: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term={doi}[doi]

Replace {doi} with the actual DOI.
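A minimal sketch of how that esearch call could be wrapped, using only the Python standard library. The helper names (`build_esearch_url`, `parse_pmids`, `doi_to_pmid`) are hypothetical and not part of INDRA; the parsing assumes the default XML response, in which PMIDs appear as `Id` elements under `IdList`.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ESEARCH_URL = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi'


def build_esearch_url(doi):
    """Return the esearch URL that searches PubMed for the given DOI."""
    # urlencode percent-escapes the slashes and brackets in the term
    params = {'db': 'pubmed', 'term': '%s[doi]' % doi}
    return ESEARCH_URL + '?' + urllib.parse.urlencode(params)


def parse_pmids(xml_text):
    """Extract the list of PMIDs from an esearch XML response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall('./IdList/Id')]


def doi_to_pmid(doi):
    """Hypothetical helper: return the PMID for a DOI, or None."""
    with urllib.request.urlopen(build_esearch_url(doi), timeout=10) as res:
        pmids = parse_pmids(res.read().decode('utf-8'))
    # Only trust an unambiguous match; zero or several hits return None
    return pmids[0] if len(pmids) == 1 else None
```

Calling `doi_to_pmid` with a DOI performs a live request against NCBI, so rate limits apply. Returning None for anything other than exactly one hit is a design choice of this sketch, meant to avoid attaching a wrong PMID to a paper.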

Crispae commented 1 year ago

I guess the endpoint used in INDRA, https://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/, does not handle this, or maybe I am using this endpoint incorrectly.

See https://github.com/sorgerlab/indra/blob/master/indra/literature/pmc_client.py. An additional URL (the one mentioned above) could be added there to handle DOI-to-PMID conversion.

bgyori commented 1 year ago

Hi @Crispae, the issue is indeed that there are multiple endpoints (e.g., idconv vs esearch) through multiple web services (PubMed, PMC, CrossRef) that can be used to convert between paper identifiers, and they all provide somewhat different results. The indra.literature.id_lookup function uses https://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/, which fails to retrieve other IDs for the DOI in your example even when called outside the context of INDRA. This is not generally the case, though. Here is an example DOI for which the lookup works:

In [1]: from indra.literature import id_lookup

In [2]: id_lookup('10.1093/nar/gks1195', 'doi')
Out[2]: {'doi': '10.1093/nar/gks1195', 'pmid': '23193287', 'pmcid': 'PMC3531190'}

If you have a specific set of DOIs you want to convert, you might want to test web services outside INDRA, such as the different endpoints that PubMed, PMC and CrossRef provide, to see which of them works well on your DOIs.
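One way to run such a comparison is sketched below. The functions `compare_lookups` and `coverage` are hypothetical helpers, not INDRA APIs; each lookup is assumed to be a callable that takes a DOI and returns a PMID or None.

```python
def compare_lookups(dois, lookups):
    """Return {doi: {service_name: result or None}} for every lookup."""
    results = {}
    for doi in dois:
        results[doi] = {}
        for name, func in lookups.items():
            try:
                results[doi][name] = func(doi)
            except Exception:
                # A failing service counts as "no result" for this DOI
                results[doi][name] = None
    return results


def coverage(results):
    """Return the fraction of DOIs each service resolved."""
    if not results:
        return {}
    names = next(iter(results.values())).keys()
    total = len(results)
    return {name: sum(1 for res in results.values() if res[name]) / total
            for name in names}


# Example wiring (assumes INDRA is installed; not executed here):
# from indra.literature import id_lookup
# lookups = {'idconv': lambda doi: id_lookup(doi, 'doi').get('pmid')}
# print(coverage(compare_lookups(my_dois, lookups)))
```

Further callables wrapping other PubMed, PMC or CrossRef endpoints can be added to the `lookups` dictionary in the same way, so that the coverage figures show which service resolves the most DOIs in a given set.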

bgyori commented 4 months ago

I'll close this. @Crispae, if you have any further questions, please reach out.