titipata / pubmed_parser

:clipboard: A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset
http://titipata.github.io/pubmed_parser/
MIT License

Question about using pp.parse_medline_xml #123

Closed. scm1210 closed this issue 1 month ago

scm1210 commented 1 year ago

Hi team,

I am trying to parse some MEDLINE files using the demo code from the repo. However, when I run the code below:

import os
import pubmed_parser as pp
import pandas as pd

parsed_articles = pp.parse_medline_xml('/....Dropbox/Collected Texts/test_files/pubmed23n1138.xml',
                                       year_info_only=True,
                                       nlm_category=False,
                                       author_list=False)

I get the following error: Error: it was not able to read a path, a file-like object, or a string as an XML

I've checked that the file exists, but I'm still not sure what the issue could be. Any advice would be helpful, thank you!

Michael-E-Rose commented 5 months ago

Can you open the XML file in a normal text editor? I had these errors frequently and discovered they were caused by broken downloads.

PS: You format code with three backticks (```) on separate lines before and after the code.
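
For anyone who would rather check for a broken download in code than in a text editor, here is a minimal sketch using only the Python standard library. It is not part of pubmed_parser: `check_xml_file` is a hypothetical helper name, and the sketch assumes a truncated or corrupted download shows up either as a gzip/zlib error or as an XML parse error.

```python
import gzip
import xml.etree.ElementTree as ET
import zlib


def check_xml_file(path):
    """Hypothetical helper: return (True, None) if the file parses as XML,
    otherwise (False, reason)."""
    # Assumption: files ending in .gz are gzip-compressed XML
    opener = gzip.open if str(path).endswith(".gz") else open
    try:
        with opener(path, "rb") as handle:
            # iterparse streams the file instead of reading it all at once
            for _event, element in ET.iterparse(handle):
                element.clear()  # drop the element's content once it is parsed
    except (OSError, EOFError, zlib.error, ET.ParseError) as exc:
        # OSError: missing/unreadable file or bad gzip header
        # EOFError / zlib.error: truncated or corrupted gzip stream
        # ParseError: XML that is cut off or otherwise malformed
        return False, f"{type(exc).__name__}: {exc}"
    return True, None
```

Calling `ok, reason = check_xml_file(path_to_your_file)` before `pp.parse_medline_xml` would then tell you whether the file itself is the problem; if `ok` is `False`, downloading the file again is probably the first thing to try.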

Michael-E-Rose commented 1 month ago

Issue seems to be solved.