Open mjafin opened 11 months ago
Hi @suqingdong, Thank you for this fantastic package, I'm finding it super useful for my research. I'm going through some newly released abstracts and am hitting an error:
ParserError: Unknown string format: 2023-None-1
When I traced this back to the code, it's coming from
pdat = util.check_date(Article.find('ArticleDate') if Article.find('ArticleDate') is not None else Article.find('Journal/JournalIssue/PubDate'))
and further
def check_date(element):
    year = element.findtext('Year')
    month = element.findtext('Month')
    day = element.findtext('Day') or '1'
    return parse_date(f'{year}-{month}-{day}')
The issue here is that the article (PMID 36911757) currently has no ArticleDate and PubDate only has year in it, so the month doesn't parse. Any thoughts on how to address this?
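One possible way to tolerate a year-only PubDate is to default the missing month the same way the day is already defaulted. The sketch below is only an illustration under assumptions: `check_date_safe` and `MONTHS` are hypothetical names, it returns a standard-library `date` instead of calling the package's `parse_date`, and it is not necessarily how the package itself handles this case.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Three-letter month abbreviations mapped to month numbers, since a Month
# element may hold either a number ("03") or an abbreviation ("Mar").
MONTHS = {name: i for i, name in enumerate(
    ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
     'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'], start=1)}


def check_date_safe(element):
    """Hypothetical variant of check_date: missing Month/Day default to 1."""
    if element is None:
        return None
    year_text = (element.findtext('Year') or '').strip()
    if not year_text.isdigit():
        return None  # no usable year, so no date can be built

    month_text = (element.findtext('Month') or '').strip()
    day_text = (element.findtext('Day') or '').strip()

    if month_text.isdigit():
        month = int(month_text)
    else:
        # Unknown or missing month names fall back to January.
        month = MONTHS.get(month_text[:3].title(), 1)
    day = int(day_text) if day_text.isdigit() else 1

    return date(int(year_text), month, day)


# Example: a PubDate that carries only a year, as described for PMID 36911757.
year_only = ET.fromstring('<PubDate><Year>2023</Year></PubDate>')
parsed = check_date_safe(year_only)
```

Defaulting to January 1 keeps the record parseable, but it does invent precision the source does not have, so callers that care may prefer to keep the year alone.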
The bug has been fixed in version v1.0.1.
@suqingdong cheers for the prompt fix, much appreciated. I found another issue in pubmed_xml/core/parser.py: if the article has no Journal/ISSN element, Article.find('Journal/ISSN') returns None, so Article.find('Journal/ISSN').attrib['IssnType'] raises an error. I made a dummy fix at:
https://github.com/suqingdong/pubmed_xml/compare/master...mjafin:pubmed_xml:master#diff-a27d47d226e680c1e795927eefc42106838f5022f6fd56a814b6949f93547d07R62 using
Article.find('Journal/ISSN').attrib['IssnType'] if Article.find('Journal/ISSN') else 'NA'
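A note on the guard above: in ElementTree-style APIs, the truth value of an element depends on whether it has child elements, so an ISSN element that holds only text evaluates as false and the conditional would fall back to 'NA' even when the element exists. Comparing against None avoids that. The following is a minimal sketch, assuming the standard-library `xml.etree.ElementTree`; `issn_type` is a hypothetical helper, not a function from the package.

```python
import xml.etree.ElementTree as ET


def issn_type(article, default='NA'):
    """Hypothetical helper: return the IssnType attribute or a default."""
    issn = article.find('Journal/ISSN')
    # Test for None explicitly: an element without children is falsy,
    # so "if issn" would wrongly treat a present ISSN as missing.
    if issn is None:
        return default
    return issn.attrib.get('IssnType', default)


with_issn = ET.fromstring(
    '<Article><Journal>'
    '<ISSN IssnType="Electronic">1234-5678</ISSN>'
    '</Journal></Article>')
without_issn = ET.fromstring('<Article><Journal></Journal></Article>')

found = issn_type(with_issn)       # attribute read from the element
missing = issn_type(without_issn)  # falls back to the default
```

Using `attrib.get` also covers the case where the ISSN element is present but lacks the IssnType attribute.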