PubMed IDs and content for genes, retrieved with the indra.literature module.
Scripts: get_pubmedstuff4genes.ipynb
and pub2date.ipynb
They include some nasty BeautifulSoup parsing plus some indirect ways of also getting the publication dates and additional fields.
Fetching all articles takes a long time because their servers impose some restrictions on requests.
So there might be a better and easier way of doing it.
Check out: https://biopython.org/DIST/docs/api/Bio.Entrez-module.html
Check out: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6821292/
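
One possible shape for the "better and easier" route is sketched below. It is not what the notebooks do. Bio.Entrez is a wrapper around the NCBI E-utilities, and this sketch calls the same esearch and efetch endpoints directly with the Python standard library, so the XML can be parsed without BeautifulSoup. The function names (search_pmids, chunked, parse_articles, fetch_articles), the batch size and the delay are all made up for illustration. The delay assumes NCBI's guideline of at most three requests per second without an API key. Using the gene symbol as the raw search term is also an assumption and will likely need refining.

```python
import json
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Base URL of the NCBI E-utilities (the service Bio.Entrez wraps).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def search_pmids(gene, retmax=100):
    """Return PubMed IDs for a search term (hypothetical helper, network call)."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "term": gene, "retmax": retmax, "retmode": "json"}
    )
    with urllib.request.urlopen(f"{EUTILS}/esearch.fcgi?{query}") as resp:
        return json.load(resp)["esearchresult"]["idlist"]


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def parse_articles(xml_text):
    """Extract PMID, title and publication date fields from efetch XML."""
    records = []
    for article in ET.fromstring(xml_text).iter("PubmedArticle"):
        title = article.find(".//ArticleTitle")
        # Some records carry a free-text MedlineDate instead of Year/Month;
        # those come back here with year and month set to None.
        pub_date = article.find(".//JournalIssue/PubDate")
        records.append({
            "pmid": article.findtext("MedlineCitation/PMID"),
            # itertext() keeps text inside inline markup such as <i> or <sub>.
            "title": "".join(title.itertext()) if title is not None else None,
            "year": pub_date.findtext("Year") if pub_date is not None else None,
            "month": pub_date.findtext("Month") if pub_date is not None else None,
        })
    return records


def fetch_articles(pmids, batch_size=200, delay=0.34):
    """Fetch records in batches, pausing between requests (network call)."""
    records = []
    for batch in chunked(pmids, batch_size):
        query = urllib.parse.urlencode(
            {"db": "pubmed", "id": ",".join(batch), "retmode": "xml"}
        )
        with urllib.request.urlopen(f"{EUTILS}/efetch.fcgi?{query}") as resp:
            records.extend(parse_articles(resp.read()))
        time.sleep(delay)  # assumed limit: about 3 requests/second without an API key
    return records
```

Usage would look like fetch_articles(search_pmids("TP53")), where "TP53" is only an example symbol. Fetching many IDs per request is what should cut the run time, since the request limit applies per call rather than per article.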