richard-jones closed this issue 9 years ago
@richard-jones wondering if you know of a way to validate these? E.g. do they have to have 7 digits?
Currently the logic I've implemented is:
_rx = r'^(PMC){0,1}[\d]+$' # Valid: PMC1234567, 1234567, PMC2, 2, 34, 594876985749654
It may be fine as it is, just checking if you know something additional. The only thing I can see is that maybe PMCIDs always have 7 digits. I have submitted a query to the EuropePMC Helpdesk to check this.
I hope we do get an answer from them, otherwise it's a bit hard to validate well without at least a length constraint :).
EPMC helpdesk response
Thank you. If it contains ‘PMC’ and between 5 and 7 digits it is a PMC ID. Please find below one of the earliest examples and the latest:
http://europepmc.org/articles/PMC61055
http://europepmc.org/articles/PMC4217746
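The helpdesk rule could be folded into the validator roughly as follows. This is only a sketch: `PMCID_RX` and `is_valid_pmcid` are names made up here, the 5 to 7 digit bound is taken directly from the helpdesk reply (and may stop holding as the archive grows), and keeping the `PMC` prefix optional is an assumption carried over from the earlier regex.

```python
import re

# Assumption: 5 to 7 digits, as stated in the helpdesk reply.
# Assumption: the "PMC" prefix stays optional, as in the earlier regex.
PMCID_RX = re.compile(r'^(PMC)?[0-9]{5,7}$')


def is_valid_pmcid(candidate):
    """Return True if candidate looks like a PMCID under the 5-7 digit rule."""
    return PMCID_RX.match(candidate.strip()) is not None


if __name__ == "__main__":
    # Try the helpdesk examples plus values the looser regex accepted.
    for candidate in ("PMC61055", "PMC4217746", "4217746", "PMC2", "594876985749654"):
        print(candidate, is_valid_pmcid(candidate))
```

Under this rule, short values such as PMC2 and very long digit strings, which the looser pattern accepted, would be rejected.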
In the Europe PMC Advanced Search you can search by License Type:
In the Basic Search box you could input a search like this:
(TITLE:"adrenal gland surgery") AND (LICENSE:"CC-BY")
Yeah, it works now. At http://howopenisit.org/lookup/PMC2654146 the PMCID goes in the identifier column and everything works without a hitch. Nice job @richard-jones, who originally decoupled the identifier type from the provider URL and scraping in OpenArticleGauge.
OAG currently supports PMID and DOI identifier types. It should also be made to support EPMC identifiers, and be able to resolve them to the Europe PMC web page, which is accessible at
http://europepmc.org/articles/[PMCID]
For example, these are two valid PMC URLs:
http://europepmc.org/articles/PMC4160115
http://europepmc.org/articles/PMC4132119
This should be an identifier plugin, similar to doi.py or pmid.py, able to resolve the PMCID to the URL given above. It is probably sufficient just to craft the URL from a normalised identifier; checking that it exists should probably not be necessary, as this will be done when the licence is detected.
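A rough sketch of the normalise-then-craft step might look like this. The names `normalise_pmcid` and `pmcid_to_url` are hypothetical and do not reflect OAG's actual plugin interface (doi.py and pmid.py would be the reference for that); only the URL pattern comes from this issue. Accepting a lower-case prefix and surrounding whitespace is an assumption.

```python
import re

# URL pattern taken from the issue text.
EPMC_URL_TEMPLATE = "http://europepmc.org/articles/{pmcid}"

# Optional "PMC" prefix (any case, an assumption), then digits.
# No length constraint is applied in this sketch.
_PMCID_RX = re.compile(r'^(?:PMC)?([0-9]+)$', re.IGNORECASE)


def normalise_pmcid(raw):
    """Return the canonical form 'PMC<digits>', or raise ValueError."""
    match = _PMCID_RX.match(raw.strip())
    if match is None:
        raise ValueError("not a PMCID: %r" % (raw,))
    return "PMC" + match.group(1)


def pmcid_to_url(raw):
    """Craft the Europe PMC article URL without checking that it exists."""
    return EPMC_URL_TEMPLATE.format(pmcid=normalise_pmcid(raw))


if __name__ == "__main__":
    print(pmcid_to_url("pmc4160115"))
    print(pmcid_to_url("4132119"))
```

No network request is made here, in line with the suggestion that existence is checked later, when the licence is detected.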