NCATSTranslator / Feedback

A repo for tracking gaps in Translator data and finding ways to fill them.

Decide how to handle missing/deprecated PMIDs #882

Open andrewsu opened 1 month ago

andrewsu commented 1 month ago

As raised by @bill-baumgartner in https://github.com/NCATSTranslator/Feedback/issues/625#issuecomment-2237699515 and https://github.com/NCATSTranslator/Feedback/issues/803#issuecomment-2225869209, there are cases where KPs are returning PMIDs that are no longer in PubMed. Creating this issue to track how this should be handled.

andrewsu commented 1 month ago

@Genomewide asked:

How do we get the KPs that provide PMIDs to filter these out? Is there nothing we can do about this? @andrewsu Do you filter these out? Do you know of other teams that should look at filtering them?

I think there are two questions -- how to solve this problem, and what's the priority of this issue.

Genomewide commented 1 month ago

That sounds reasonable to me. I had not considered how small a fraction this is. Agreed that this is likely a low priority. This is the first time I have seen a ticket for it. A microservice would be great. Do you want to call this closed or snoozed?

sierra-moxon commented 1 month ago

I'm going to "next phase" it.

codewarrior2000 commented 2 weeks ago

@andrewsu In February of this year, I wrote a little code to check whether PMIDs were retracted or had errata. Extending that code to check for missing PMIDs and then wrapping it all up into a microservice sounds intriguing.
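
For anyone picking this up, here is a minimal sketch of what a missing/retracted PMID check could look like. It is not the code mentioned above, just an illustration. It assumes the NCBI E-utilities ESummary endpoint with `retmode=json`, and it assumes that unknown PMIDs come back with an `error` field and that retractions show up as `Retracted Publication` in `pubtype`. Those field names should be verified against a live response before relying on them. The function names `classify_pmids` and `fetch_esummary` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# NCBI E-utilities ESummary endpoint (public API).
ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def classify_pmids(pmids, esummary):
    """Label each requested PMID as 'ok', 'retracted', or 'missing'.

    `esummary` is the parsed JSON body of an ESummary response.
    ASSUMPTION: records live under 'result' keyed by PMID string, a PMID
    that PubMed cannot resolve carries an 'error' field, and retracted
    papers list 'Retracted Publication' in 'pubtype'.
    """
    result = esummary.get("result", {})
    statuses = {}
    for pmid in pmids:
        key = str(pmid)
        record = result.get(key)
        if record is None or "error" in record:
            statuses[key] = "missing"
        elif "Retracted Publication" in record.get("pubtype", []):
            statuses[key] = "retracted"
        else:
            statuses[key] = "ok"
    return statuses


def fetch_esummary(pmids, timeout=30):
    """Fetch ESummary JSON for one batch of PMIDs (makes a network call)."""
    query = urllib.parse.urlencode(
        {
            "db": "pubmed",
            "id": ",".join(str(p) for p in pmids),
            "retmode": "json",
        }
    )
    with urllib.request.urlopen(f"{ESUMMARY_URL}?{query}", timeout=timeout) as resp:
        return json.load(resp)
```

Usage would be something like `classify_pmids(pmids, fetch_esummary(pmids))` on a batch of PMIDs pulled from a KP response. Keeping the classification separate from the HTTP call makes it easy to test offline and to drop behind a small web endpoint later if the microservice idea goes ahead.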