Open DSuveges opened 10 months ago
@remo87 , would you mind checking with @DSuveges on this one? Thanks!
The problem still exists: the query above flags the following pmids:
+--------+-------------------------+
|pmid    |pmcids                   |
+--------+-------------------------+
|32790207|[PMC10081512, PMC7285927]|
|33376052|[PMC7709584, PMC7983453] |
|31745814|[PMC7574644, PMC6940410] |
+--------+-------------------------+
That's all. Out of the millions of pmids. If there's no way to find out which is the "real" pmcid, we can just drop them.
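If we do end up dropping them, the filtering could look roughly like the sketch below. This is plain Python for illustration, not the actual ETL code: the function name `split_ambiguous` is made up, and the last input row is a hypothetical unambiguous record added only to show that it is kept.

```python
from collections import defaultdict


def split_ambiguous(pairs):
    """Group (pmid, pmcid) pairs and separate unambiguous pmids from ambiguous ones.

    Hypothetical helper, not part of the existing pipeline.
    """
    grouped = defaultdict(set)
    for pmid, pmcid in pairs:
        grouped[pmid].add(pmcid)

    # A pmid is unambiguous when exactly one distinct pmcid maps to it.
    unambiguous = {
        pmid: next(iter(ids)) for pmid, ids in grouped.items() if len(ids) == 1
    }
    # Everything else is flagged so it can be dropped or inspected.
    ambiguous = {
        pmid: sorted(ids) for pmid, ids in grouped.items() if len(ids) > 1
    }
    return unambiguous, ambiguous


pairs = [
    ("32790207", "PMC10081512"),
    ("32790207", "PMC7285927"),
    ("33376052", "PMC7709584"),
    ("33376052", "PMC7983453"),
    ("00000001", "PMC0000001"),  # hypothetical unambiguous record
]

unambiguous, ambiguous = split_ambiguous(pairs)
print(sorted(ambiguous))  # pmids that would be dropped
```

Keeping the flagged pmids in a separate mapping, rather than discarding them silently, leaves room to report them upstream.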
As a user I want unambiguous pmid -> pmcid mappings in literature, because currently multiple pmcids can be linked to a single pmid, which leads to confusing behaviour. (For more context, see #3053 and #2970.)
Although the bug originates upstream of OT, we should resolve this ambiguity before publicly releasing the post-ETL literature data. By nature, this issue only affects full-text articles.
Tasks
Acceptance tests