titipata / pubmed_parser

:clipboard: A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset
http://titipata.github.io/pubmed_parser/
MIT License

Question for extracting text #121

Closed shrimonmuke0202 closed 4 months ago

shrimonmuke0202 commented 1 year ago

Hi team, how can I use parse_medline_xml to extract the full text of a paper?

titipata commented 1 year ago

parse_medline_xml only extracts the abstract and metadata. To parse the full text, you have to download the PubMed Open Access corpus.
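For anyone landing here later, a minimal sketch of the full-text route, assuming you already have an Open Access `.nxml` file on disk. It relies on `parse_pubmed_paragraph` returning a list of dicts with `section` and `text` keys; `group_by_section` and `extract_full_text` are hypothetical helper names used only for illustration, and the file path in the usage note is a placeholder.

```python
def group_by_section(paragraphs):
    """Join paragraph texts per section, keeping first-seen section order."""
    sections = {}
    for par in paragraphs:
        # Each item is expected to be a dict with "section" and "text" keys.
        sections.setdefault(par.get("section", ""), []).append(par.get("text", ""))
    return {name: "\n\n".join(texts) for name, texts in sections.items()}


def extract_full_text(nxml_path):
    """Parse one PubMed Open Access .nxml file into {section: text}."""
    # Imported lazily so the helper above works without pubmed_parser installed.
    import pubmed_parser as pp

    # all_paragraph=True also keeps paragraphs that cite no references;
    # by default only paragraphs containing a citation are returned.
    paragraphs = pp.parse_pubmed_paragraph(nxml_path, all_paragraph=True)
    return group_by_section(paragraphs)
```

Usage would look something like `extract_full_text("path/to/article.nxml")`. The abstract and metadata for the same file can be read separately with `pp.parse_pubmed_xml`.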