glygener / glygen-issues

Repository for public GlyGen tickets
GNU General Public License v3.0

PMID Discrepancy for human_proteoform_phosphorylation_sites_iptmnet.csv #1841

Closed ubhuiyan closed 22 hours ago

ubhuiyan commented 1 week ago

I ran check-citation-datasets.py and it flagged PMID 26513018 in human_proteoform_phosphorylation_sites_iptmnet.csv.

[sbhuiyan28@glygen-vm-dev unreviewed]$ grep "26513018" human_proteoform_phosphorylation_sites_iptmnet.csv
"P38398-1","1552","Tyr","P42685-1","FRK","protein_xref_pubmed","26513018","protein_xref_iptmnet","P38398"

[sbhuiyan28@glygen-vm-dev unreviewed]$ grep "26513018" human_proteoform_phosphorylation_ciations_sites_iptmnet.csv
grep: human_proteoform_phosphorylation_ciations_sites_iptmnet.csv: No such file or directory

This PMID does not appear in the corresponding citations CSV, so I could not trace it back to a title or any other identifier. A Google search for the PMID turns up a paper titled "Local generation of fumarate promotes DNA repair through inhibition of histone H3 demethylation", whose PMID is 26237645. That PMID is also present in human_proteoform_phosphorylation_sites_iptmnet.csv, but likewise not in the corresponding citations CSV.

[sbhuiyan28@glygen-vm-dev unreviewed]$ grep "26237645" human_proteoform_phosphorylation_sites_iptmnet.csv
"P07954-1","236","Thr","P78527-1","PRKDC","protein_xref_pubmed","26237645","protein_xref_iptmnet","P07954"
"P07954-1","236","Thr","","","protein_xref_pubmed","26237645","protein_xref_iptmnet","P07954"

[sbhuiyan28@glygen-vm-dev unreviewed]$ grep "26237645" human_proteoform_phosphorylation_ciations_sites_iptmnet.csv
grep: human_proteoform_phosphorylation_ciations_sites_iptmnet.csv: No such file or directory

rykahsay commented 2 days ago

[sbhuiyan28@glygen-vm-dev unreviewed]$ grep "26237645" human_proteoform_phosphorylation_ciations_sites_iptmnet.csv
grep: human_proteoform_phosphorylation_ciations_sites_iptmnet.csv: No such file or directory

I think you need to be more mindful when you are doing these checks. The output above is grep reporting "No such file or directory" for a mistyped file name, so I don't see how it can be interpreted as 26237645 missing from the citations dataset.
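To make the distinction concrete, here is a minimal Python sketch that reports a missing file separately from a missing PMID. The helper name `pmid_status` is hypothetical, the demo row is a shortened placeholder rather than real dataset content, and the files are created in a temporary folder; only the two file names are taken from this thread.

```python
import os
import tempfile


def pmid_status(path, pmid):
    """Return 'file_not_found', 'present', or 'absent' for a PMID in a file."""
    if not os.path.isfile(path):
        # Mirrors grep's "No such file or directory": nothing was searched.
        return "file_not_found"
    needle = '"' + pmid + '"'  # match the PMID as a whole quoted field
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if needle in line:
                return "present"
    return "absent"


with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(
        folder, "human_proteoform_citations_phosphorylation_sites_iptmnet.csv"
    )
    # The mistyped name used in the first comment; this file is never created.
    typo = os.path.join(
        folder, "human_proteoform_phosphorylation_ciations_sites_iptmnet.csv"
    )
    with open(good, "w", encoding="utf-8") as handle:
        # Placeholder row for the demo, not a real record.
        handle.write('"P07954-1","protein_xref_pubmed","26237645"\n')
    results = (
        pmid_status(typo, "26237645"),
        pmid_status(good, "26237645"),
        pmid_status(good, "26513018"),
    )

print(results)  # ('file_not_found', 'present', 'absent')
```

Only the "absent" result says anything about the dataset; "file_not_found" means the check itself never ran.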

It looks like 26237645 is in both human_proteoform_phosphorylation_sites_iptmnet.csv and human_proteoform_citations_phosphorylation_sites_iptmnet.csv:

$ grep 26237645 unreviewed/human_proteoform_*phosphorylation_sites_iptmnet.csv
unreviewed/human_proteoform_citations_phosphorylation_sites_iptmnet.csv:"P07954-1","Local generation of fumarate promotes DNA repair through inhibition of histone H3 demethylation.","Nature cell biology","2015 Sep","Jiang Y, Qian X, Shen J, Wang Y, Li X, Liu R, Xia Y, Chen Q, Peng G, Lin SY, Lu Z","protein_xref_pubmed","26237645","protein_xref_iptmnet","P07954"
unreviewed/human_proteoform_phosphorylation_sites_iptmnet.csv:"P07954-1","236","Thr","P78527-1","PRKDC","protein_xref_pubmed","26237645","protein_xref_iptmnet","P07954"
unreviewed/human_proteoform_phosphorylation_sites_iptmnet.csv:"P07954-1","236","Thr","","","protein_xref_pubmed","26237645","protein_xref_iptmnet","P07954"

On the other hand, 26513018 is only in human_proteoform_phosphorylation_sites_iptmnet.csv, and we need to find out why it is missing from human_proteoform_citations_phosphorylation_sites_iptmnet.csv:

$ grep 26513018 unreviewed/human_proteoform_*phosphorylation_sites_iptmnet.csv
unreviewed/human_proteoform_phosphorylation_sites_iptmnet.csv:"P38398-1","1552","Tyr","P42685-1","FRK","protein_xref_pubmed","26513018","protein_xref_iptmnet","P38398"
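
The same comparison can be done for every PMID at once rather than one grep at a time. The following is a rough sketch only, not how check-citation-datasets.py works (its internals are not shown in this thread). It assumes the PMID is always the field immediately after a `protein_xref_pubmed` field, as in the rows above, and the function names are hypothetical.

```python
import csv
import io

# Assumption: the PMID is the field right after this tag, as in the grep output.
PMID_TAG = "protein_xref_pubmed"


def pmids_in(csv_text):
    """Collect every PMID found in the given CSV text."""
    pmids = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for i, field in enumerate(row[:-1]):
            if field == PMID_TAG:
                pmids.add(row[i + 1])
    return pmids


def uncited_pmids(sites_text, citations_text):
    """PMIDs used in the sites dataset that have no row in the citations dataset."""
    return sorted(pmids_in(sites_text) - pmids_in(citations_text))


# Rows copied from the grep output in this thread.
sites = '''"P38398-1","1552","Tyr","P42685-1","FRK","protein_xref_pubmed","26513018","protein_xref_iptmnet","P38398"
"P07954-1","236","Thr","P78527-1","PRKDC","protein_xref_pubmed","26237645","protein_xref_iptmnet","P07954"
"P07954-1","236","Thr","","","protein_xref_pubmed","26237645","protein_xref_iptmnet","P07954"
'''
citations = '''"P07954-1","Local generation of fumarate promotes DNA repair through inhibition of histone H3 demethylation.","Nature cell biology","2015 Sep","Jiang Y, Qian X, Shen J, Wang Y, Li X, Liu R, Xia Y, Chen Q, Peng G, Lin SY, Lu Z","protein_xref_pubmed","26237645","protein_xref_iptmnet","P07954"
'''

print(uncited_pmids(sites, citations))  # ['26513018']
```

On the real files, the two strings would be replaced by the contents of the sites and citations CSVs; a set difference keeps the check independent of row order and duplicate rows.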

ubhuiyan commented 2 days ago

There is no paper associated with PMID 26513018. As I mentioned earlier, I believe it was previously the PMID for "Local generation of fumarate promotes DNA repair through inhibition of histone H3 demethylation" and that the PMID has since been changed to 26237645.

rykahsay commented 22 hours ago

We are moving forward without 26513018.