titipata / pubmed_parser

A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset
http://titipata.github.io/pubmed_parser/
MIT License

AttributeError: 'NoneType' object has no attribute 'text' when using parse_pubmed_caption #150

Open qm-intel opened 1 month ago

qm-intel commented 1 month ago

@Michael-E-Rose @titipata Thanks for your tool.

When I call parse_pubmed_caption() from Pubmed Parser on the document attached here (rna-9-860.zip), I get the following error:

  File "/home/user/myprojects/1-parse-xml-image-caption-inline.py", line 79, in <module>
    caption_dict = pp.parse_pubmed_caption(xml_file_path[0])  # dict_keys(['pmid', 'pmc', 'fig_caption', 'fig_id', 'fig_label', 'graphic_ref'])
  File "/home/user/anaconda3/envs/medline/lib/python3.10/site-packages/pubmed_parser/pubmed_oa_parser.py", line 427, in parse_pubmed_caption
    fig_label = stringify_children(fig.find("label"))
  File "/home/user/anaconda3/envs/medline/lib/python3.10/site-packages/pubmed_parser/utils.py", line 51, in stringify_children
    [node.text]
AttributeError: 'NoneType' object has no attribute 'text'

How can I resolve this issue?
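
Reading the traceback: fig.find("label") is passed straight to stringify_children, which then accesses node.text. The error therefore suggests that at least one <fig> element in the file has no <label> child, so find() returns None. The sketch below illustrates a None-safe guard under that assumption. It uses the standard library's xml.etree.ElementTree rather than pubmed_parser itself, and SAMPLE_XML, safe_stringify, and extract_captions are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second <fig> has no <label>, which is the
# assumed cause of the AttributeError in the traceback.
SAMPLE_XML = """
<article>
  <fig id="F1"><label>Figure 1</label><caption><p>First <italic>caption</italic>.</p></caption></fig>
  <fig id="F2"><caption><p>Second caption.</p></caption></fig>
</article>
"""


def safe_stringify(node):
    """Return all text inside node, or an empty string if node is None."""
    if node is None:
        return ""
    return "".join(node.itertext()).strip()


def extract_captions(xml_text):
    """Collect id, label, and caption text for every <fig> element."""
    root = ET.fromstring(xml_text)
    results = []
    for fig in root.iter("fig"):
        results.append({
            "fig_id": fig.attrib.get("id"),
            # find() returns None when the child is missing; the guard
            # in safe_stringify turns that into an empty string.
            "fig_label": safe_stringify(fig.find("label")),
            "fig_caption": safe_stringify(fig.find("caption")),
        })
    return results


captions = extract_captions(SAMPLE_XML)
for row in captions:
    print(row)
```

If the assumption holds, the same kind of guard around the fig.find("label") call in parse_pubmed_caption (or a None check inside stringify_children) would avoid the exception. Whether the attached file really contains unlabeled figures has not been verified here.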