Closed by lwaldron 1 year ago
Sorry, but it doesn't seem like a good idea to support that on bugsigdb.org itself as it leads to study duplication:
https://bugsigdb.org/Study_580 (PMID entered without leading 0) https://bugsigdb.org/Study_731 (PMID entered with leading 0)
I would actually say this should be prevented directly on bugsigdb.org. I'll open an issue if you agree.
As for representing PMIDs as numeric or character: for type safety, I would rather make the case that they should be characters, for much the same reason that we represent Entrez Gene IDs or NCBI Taxon IDs as characters. They are identifiers, not numbers.
Now that you've pointed out the duplication problem on the wiki, I agree both about using type character and about preventing leading zeros in PMIDs on the wiki. Note that the PMID type has changed from numeric to character since the last working GHA.
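For illustration, here is a minimal Python sketch of how PMIDs could be kept as character data while still treating values that differ only in leading zeros as the same identifier. The function names `normalize_pmid` and `find_duplicate_studies`, the study labels, and the sample PMID values are all hypothetical; this is not the actual bugsigdb.org or export code.

```python
from collections import defaultdict


def normalize_pmid(raw: str) -> str:
    """Return the PMID as a string with whitespace and leading zeros removed.

    The value stays a string throughout: it is an identifier, not a number.
    """
    pmid = raw.strip()
    if not (pmid.isascii() and pmid.isdigit()):
        raise ValueError(f"PMID must contain only digits: {raw!r}")
    stripped = pmid.lstrip("0")
    if not stripped:
        raise ValueError(f"PMID must not be all zeros: {raw!r}")
    return stripped


def find_duplicate_studies(pmids_by_study: dict[str, str]) -> dict[str, list[str]]:
    """Group study labels by normalized PMID and keep groups with more than one study."""
    groups: dict[str, list[str]] = defaultdict(list)
    for study, raw_pmid in pmids_by_study.items():
        groups[normalize_pmid(raw_pmid)].append(study)
    return {pmid: studies for pmid, studies in groups.items() if len(studies) > 1}


# Hypothetical entries: two studies whose PMIDs differ only by a leading zero.
example = {
    "Study_A": "31682463",
    "Study_B": "031682463",
    "Study_C": "12345678",
}
duplicates = find_duplicate_studies(example)
```

A check along these lines could run either at entry time on the wiki, to reject or normalize a PMID with leading zeros, or at import time, to flag studies that already share a PMID.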
We have one corner case (https://bugsigdb.org/Study_731) where a curator entered a leading zero for the PMID. The PubMed website ignores leading zeros (for example, try https://pubmed.ncbi.nlm.nih.gov/00000031682463/), so it works normally on bugsigdb.org. We should ignore it too by importing the PMID column as numeric. I noticed this because the PMID column was numeric before the exports broke 3 weeks ago, and now it is character.