Open cannin opened 4 years ago
@PritiShaw can you grab: MeSH, publication year (PubDate), journal (ISOAbbreviation), PMC ID (ArticleId IdType="pmc", if exists)
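For reference, a minimal sketch of pulling those four fields out of a PubMed XML record with Python's standard library. The `SAMPLE_XML` string is a hypothetical, trimmed-down record in the general shape of PubMed efetch output, and `extract_fields` is an illustrative helper name, not code from this thread. Real records may carry a `MedlineDate` instead of a `Year` under `PubDate`, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down record shaped like PubMed efetch XML (assumption).
SAMPLE_XML = """
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>10022829</PMID>
      <Article>
        <Journal>
          <JournalIssue><PubDate><Year>1999</Year></PubDate></JournalIssue>
          <ISOAbbreviation>EMBO J.</ISOAbbreviation>
        </Journal>
      </Article>
      <MeshHeadingList>
        <MeshHeading><DescriptorName>Mice</DescriptorName></MeshHeading>
        <MeshHeading><DescriptorName>Protein Binding</DescriptorName></MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
    <PubmedData>
      <ArticleIdList>
        <ArticleId IdType="pubmed">10022829</ArticleId>
        <ArticleId IdType="pmc">PMC1171179</ArticleId>
      </ArticleIdList>
    </PubmedData>
  </PubmedArticle>
</PubmedArticleSet>
"""


def extract_fields(article):
    """Return PMID, journal, year, PMC ID and MeSH descriptors for one article."""
    citation = article.find("MedlineCitation")
    # The PMC ID is optional, so look it up separately and allow it to be missing.
    pmc = article.find(".//ArticleId[@IdType='pmc']")
    return {
        "PMID": citation.findtext("PMID"),
        "JOURNAL_TITLE": citation.findtext(".//ISOAbbreviation", default=""),
        "YEAR": citation.findtext(".//PubDate/Year", default=""),
        "PMCID": pmc.text if pmc is not None else "",
        "MESH_TERMS": [d.text for d in citation.findall(".//MeshHeading/DescriptorName")],
    }


root = ET.fromstring(SAMPLE_XML)
records = [extract_fields(a) for a in root.findall("PubmedArticle")]
```

An article without a `pmc` entry simply gets an empty `PMCID`, which matches the "if exists" condition in the request.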
Hi Mentor, I have implemented the suggestions you gave regarding file format and headers. The complete result is at All PMID output; it has around 21,000 PMIDs. I have also made a Truncated output (~3000 PMIDs) so that GitHub can present the data.
Adding 5 PMIDs for your reference:

| PMID | JOURNAL_TITLE | YEAR | PMCID | MESH_TERMS |
|---|---|---|---|---|
| 10021361 | Curr. Biol. | 1999 | | Humans,SLP-76 signal Transducing adaptor proteins,Phosphoproteins,Signal Transduction,GRB2 protein, human,GRB2 Adaptor Protein,SH3 Domains,Receptors, Antigen, T-Cell,Carrier Proteins,Phosphorylation,Nuclear Proteins,DNA-Binding Proteins,Membrane Proteins,Jurkat Cells,NFATC Transcription Factors,Binding Sites,*Hematopoietic System,Amino Acid Sequence,Tyrosine |
| 10022829 | EMBO J. | 1999 | PMC1171179 | Mice,Animals,laminin A,Laminin,perlecan,Dystroglycans,Heparin,Sulfoglycosphingolipids,Heparan Sulfate Proteoglycans,fibulin 2,Extracellular Matrix Proteins,nidogen,Membrane Glycoproteins,gephyrin,Calcium-Binding Proteins,Heparitin Sulfate,Protein Binding,Binding Sites,Recombinant Proteins,Basement Membrane |
| 10022833 | EMBO J. | 1999 | PMC1171183 | Stem Cell Factor,GRB2 protein, human,GRB2 Adaptor Protein,SH3 Domains,Signal Transduction,Suppressor of Cytokine Signaling Proteins,Phosphorylation,Receptor Protein-Tyrosine Kinases,Proto-Oncogene Proteins c-kit,Proto-Oncogene Proteins c-vav,Protein Binding,Tyrosine,Cell Proliferation |
| 10022860 | Mol. Cell. Biol. | 1999 | PMC83966 | Rats,Mice,Animals,ral Guanine Nucleotide Exchange Factor,PC12 Cells,ras Proteins,Signal Transduction,PI3-Kinase,Guanine Nucleotide Exchange Factors,ral GTP-Binding Proteins,RTN4 protein, human,*Nogo Proteins,NGF protein, human,Nerve Growth Factor,rho GTPases,ras Guanine Nucleotide Exchange Factors,Proto-Oncogene Proteins c-raf,Cell Differentiation,Neurite Outgrowth |
| 10022869 | Mol. Cell. Biol. | 1999 | PMC83975 | Transforming Growth Factor beta,Transcriptional Activation,Transcription Factor AP-1,SMAD3 protein, human,Smad3 Protein,SMAD4 protein, human,Smad4 Protein,Promoter Regions, Genetic,Trans-Activators,Proto-Oncogene Proteins c-jun,Gene Expression Regulation,Transcription, Genetic,Binding Site,*Cell Nucleus,Genes, Reporter,Protein Binding,Transfection,Luciferases |
Thanks
@PritiShaw I can make use of them as is to write some code, but can you re-run them and put a "|" between MeSH terms? For example, this one has a comma that will confuse a split:
IGFBP3 protein, human https://meshb.nlm.nih.gov/record/ui?ui=C515497
this would be safer: Transforming Growth Factor beta|Transcriptional Activation|Humans
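A small illustration of why the comma is a problem, using terms quoted in this thread. It assumes that no MeSH term itself contains a "|" character, which is the premise of the suggestion above rather than something verified here.

```python
# Terms taken from the examples in this thread; the first one contains a comma.
terms = ["IGFBP3 protein, human", "Transforming Growth Factor beta", "Humans"]

comma_joined = ",".join(terms)
pipe_joined = "|".join(terms)

# Splitting on "," breaks the first term into two pieces.
comma_split = comma_joined.split(",")

# Splitting on "|" recovers the original list (assuming no term contains "|").
pipe_split = pipe_joined.split("|")

print(len(comma_split), comma_split)
print(len(pipe_split), pipe_split)
```

With the comma separator the three terms come back as four fragments, while the pipe separator round-trips them unchanged.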
I only get 2821 rows for the file: https://gist.github.com/PritiShaw/9ad43241c99f727afd04efbe0bdb77e8. Is it truncated?
wc -l all.tsv
2821 all.tsv
I checked the revision history. I think it was truncated because I used the GitHub UI.
I have pushed the complete version with `|` as the separator for MeSH terms.
You can find the complete result here https://gist.github.com/PritiShaw/9ad43241c99f727afd04efbe0bdb77e8
There are 20667 PMIDs in total.
Thanks
@PritiShaw can you grab the MeSH terms for the PMIDs in this file?
https://reactome.org/download/current/ReactionPMIDS.txt
There are many duplicates, so make a unique list. Do the same exercise as before: MTI and also the PubMed API.
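A rough sketch of the deduplication step. It assumes each line of the file ends with the PMID as its last whitespace-separated field; that layout is a guess and should be checked against the actual file. `unique_pmids` and the sample lines (including the `R-HSA-…` identifiers) are hypothetical and only for illustration.

```python
def unique_pmids(lines):
    """Return PMIDs in first-seen order, dropping duplicates and non-numeric fields."""
    seen = set()
    ordered = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        pmid = fields[-1]  # assumption: the PMID is the last field on the line
        if pmid.isdigit() and pmid not in seen:
            seen.add(pmid)
            ordered.append(pmid)
    return ordered


# Hypothetical sample lines; the real file layout may differ.
sample = ["R-HSA-1\t10021361", "R-HSA-2\t10022829", "R-HSA-3\t10021361", ""]
result = unique_pmids(sample)
print(result)
```

For the real file, the same function could be applied to the lines read from a local copy of `ReactionPMIDS.txt`, and the resulting list fed to MTI and the PubMed API as before.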