SuLab / WikidataIntegrator

A Wikidata Python module integrating the MediaWiki API and the Wikidata SPARQL endpoint
MIT License

possible problem when a journal has more than one ISSN #173

Open egonw opened 3 years ago

egonw commented 3 years ago

Many journals have two ISSN numbers, one for the print edition, one for the electronic edition.

egonw commented 3 years ago

This seems to be why the WikiPathways bot fails on certain articles. It reports that the PMID is not in Wikidata even when it is, and in those cases the journal the article is published in often has two ISSNs.
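
To illustrate the suspected failure mode: a lookup that tries only one of a journal's ISSNs can miss a Wikidata item that records the other one. Below is a minimal sketch of matching on any of the ISSNs instead. The function `find_journal_qid`, the `issn_to_qid` mapping, and the ISSN and QID values are hypothetical placeholders, not WikidataIntegrator code or real identifiers.

```python
def find_journal_qid(issns, issn_to_qid):
    """Return the QID matched by any of the given ISSNs, or None.

    issns:       ISSNs reported for the journal (print and/or electronic).
    issn_to_qid: hypothetical mapping from ISSN to Wikidata item ID.
    """
    # Collect every item that any of the ISSNs points to.
    qids = {issn_to_qid[issn] for issn in issns if issn in issn_to_qid}
    if len(qids) > 1:
        # Two ISSNs resolving to different items needs a human decision.
        raise ValueError(f"ISSNs {issns} match several items: {sorted(qids)}")
    # Empty set -> None, single match -> that QID.
    return next(iter(qids), None)


# Placeholder data: the item is recorded under the electronic ISSN only.
known = {"0000-0002": "Q_JOURNAL"}
print(find_journal_qid(["0000-0001", "0000-0002"], known))
```

With this approach the journal is found as long as at least one of its ISSNs is recorded, and conflicting matches are reported rather than silently resolved.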

LeMyst commented 3 years ago

Hi @egonw, do you have a code example?

egonw commented 3 years ago

Sorry for the delay. I don't have a code example, but PubMed identifier 11908751 is one for which we see this. It looks like https://github.com/SuLab/WikidataIntegrator/blob/main/wikidataintegrator/wdi_helpers/publication.py#L375-L488 is returning None. cc @andrawaag

egonw commented 3 years ago

Oh, @andrawaag just found that the page range for that PubMed record could be problematic: it uses one of the many possible hyphen characters, whereas the code supports only one: https://github.com/SuLab/WikidataIntegrator/blob/main/wikidataintegrator/wdi_helpers/publication.py#L444
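
As a sketch of what tolerating several dash characters could look like, the snippet below maps common Unicode dashes to an ASCII hyphen before splitting a page range. The helper `split_pages` and the chosen set of dash characters are assumptions made for illustration; this is not the code at the linked line.

```python
import re

# Dash-like characters that can appear between page numbers:
# hyphen, non-breaking hyphen, figure dash, en dash, em dash, minus sign.
_DASHES = "\u2010\u2011\u2012\u2013\u2014\u2212"
_DASH_RE = re.compile(f"[{_DASHES}]")


def split_pages(pages):
    """Split a page string into (start, end); end is None for a single page."""
    # Normalize every dash variant to the plain ASCII hyphen first.
    normalized = _DASH_RE.sub("-", pages.strip())
    start, sep, end = normalized.partition("-")
    return (start.strip(), end.strip() if sep else None)


print(split_pages("123\u2013130"))  # en dash between the page numbers
print(split_pages("45"))            # single page, no range
```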

But I am not sure what the `pass` does on line https://github.com/SuLab/WikidataIntegrator/blob/main/wikidataintegrator/wdi_helpers/publication.py#L454. Would that explain why we get None?
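
For reference, this is how `pass` behaves in Python in general: it is a no-op, so a function that hits `pass` and then reaches its end without a `return` yields None implicitly. The functions below are hypothetical and are not taken from `publication.py`; whether the linked line follows this pattern is not confirmed here.

```python
def parse_year(value):
    """Hypothetical parser that swallows bad input with `pass`."""
    try:
        return int(value)
    except ValueError:
        pass  # does nothing; execution continues after the try block
    # No return statement is reached, so Python returns None implicitly.


def parse_year_strict(value):
    """Same parser, but it reports the failure instead of hiding it."""
    try:
        return int(value)
    except ValueError as err:
        raise ValueError(f"cannot parse year from {value!r}") from err


print(parse_year("2002"))  # parsed normally
print(parse_year("n/a"))   # falls through the `pass` branch
```

If the code at that line does swallow an error this way, a silent None would be the expected symptom, and raising or logging in that branch would make the cause visible.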

egonw commented 3 years ago

Sorry, that was silly of me. Wrong source (EuropePMC does use that hyphen).

andrawaag commented 1 year ago

@egonw PubMed support in WDI has since been deprecated. I am currently working through the backlog of issues. Can you confirm whether this issue still persists when running your bots with DOIs?

egonw commented 1 year ago

@andrawaag, can you please assign the issue to me? I am not sure when I will have time, but that way it is on my to-do list, which I revisit once a week.