Closed by emanuil-tolev 8 years ago
Forget this for now. Overwriting the title is actually not a good idea, for precisely the same reason that the other 3 identifier columns are overwritten. We use the user-uploaded title (rather than the title from the Crossref or Core lookup) to look up the article in EPMC. So it's more accurate to leave the user's title in, since the rest of the information on the sheet relates to the uploaded title rather than to the "real" title.
While doing #80 (update docs with overwriting behaviour for different fields), I noticed that if I put in a dummy PMCID, PMID or DOI (alongside one correct identifier), the dummy values are overwritten whenever the system finds other values during its lookups.
However, this is not the case for Article title, which is the fourth identifier column. If I upload a correct PMCID and a dummy title, I get the dummy title back.
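To make the behaviour described above concrete, here is a minimal sketch of it. This is not the project's actual code: the function name `merge_lookup_results`, the dictionary keys and the sample values are all hypothetical, chosen only to illustrate that the three identifiers are replaced while the title is left alone.

```python
# Hypothetical illustration of the behaviour observed in this issue.
# None of these names come from the real codebase.

# Columns whose uploaded value is replaced when a lookup finds another value.
OVERWRITTEN_FIELDS = ("pmcid", "pmid", "doi")


def merge_lookup_results(uploaded, looked_up):
    """Return the row as it would appear in the output sheet."""
    result = dict(uploaded)  # copy, so the uploaded row is not mutated
    for field in OVERWRITTEN_FIELDS:
        found = looked_up.get(field)
        if found:
            result[field] = found
    # "title" is deliberately not touched: the uploaded title is returned as-is.
    return result


uploaded_row = {
    "pmcid": "PMC0000001",   # the one correct identifier (made-up value)
    "pmid": "dummy",
    "doi": "dummy",
    "title": "dummy title",
}
lookup_row = {
    "pmcid": "PMC0000001",
    "pmid": "12345678",
    "doi": "10.1234/example",
    "title": "The real article title",
}
merged_row = merge_lookup_results(uploaded_row, lookup_row)
```

With these sample rows, `merged_row` carries the looked-up PMID and DOI but still has `"dummy title"` in the title column, which is the inconsistency this issue is about.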
I think we should probably change this to be consistent across the 4 columns. Otherwise we risk having the wrong title alongside 3 correct identifiers. Humans also identify an article most easily by its title.
I'll discuss with Wellcome if there's a particular reason why we should be inconsistent across the 4 ID columns, but in the meantime if anybody has such a reason it would be great to know. Otherwise, if Wellcome are happy with the change I'll submit a PR.
As for the source of the title - I'd probably only enable overwriting in the case where the title comes from Crossref, since Core has been unreliable.
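The proposed rule could be sketched roughly as follows. This is only an illustration under assumptions: the function name `merge_with_title_rule`, the `title_source` parameter and its `"crossref"` / `"core"` values are hypothetical and not taken from the codebase. Note that the closing comment on this issue decided against making the change.

```python
# Hypothetical sketch of the proposal: overwrite the title only when the
# looked-up title came from Crossref. Names are illustrative only.

def merge_with_title_rule(uploaded, looked_up, title_source):
    """Merge lookup results into the uploaded row.

    title_source is an assumed label for where the looked-up title
    came from, e.g. "crossref" or "core".
    """
    result = dict(uploaded)  # copy, so the uploaded row is not mutated
    for field in ("pmcid", "pmid", "doi"):
        if looked_up.get(field):
            result[field] = looked_up[field]
    # Only a Crossref title is trusted enough to replace the uploaded one.
    if title_source == "crossref" and looked_up.get("title"):
        result["title"] = looked_up["title"]
    return result


row = {"pmcid": "PMC0000001", "pmid": "dummy", "doi": "dummy", "title": "dummy title"}
found = {"pmid": "12345678", "doi": "10.1234/example", "title": "The real article title"}

from_crossref = merge_with_title_rule(row, found, title_source="crossref")
from_core = merge_with_title_rule(row, found, title_source="core")
```

Here `from_crossref` ends up with the looked-up title, while `from_core` keeps the uploaded one; the identifier columns are overwritten in both cases.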