manusimidt / py-xbrl

Python-based parser for parsing XBRL and iXBRL files
https://py-xbrl.readthedocs.io/en/latest/
GNU General Public License v3.0

undefined entity exception #58

Closed mrx23dot closed 2 years ago

mrx23dot commented 2 years ago

When parsing: https://www.sec.gov/Archives/edgar/data/0001764013/000121390019023173/f10q0919_healthsciencesacq.htm

I get exception: undefined entity: line 15, column 172

via inst = XbrlParser(cache).parse_instance(url)

py-xbrl==2.0.7

manusimidt commented 2 years ago


The file f10q0919_healthsciencesacq.htm is just a normal HTML document, so py-xbrl cannot parse it. Please use the corresponding instance document for this filing: https://www.sec.gov/Archives/edgar/data/1764013/000121390019023173/hsac-20190930.xml
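
For context, here is a minimal sketch of how an "undefined entity" error can arise when HTML is fed to an XML parser. It assumes the page contains a named HTML entity such as `&nbsp;`; the thread does not say which entity sits at line 15, so that part is a guess. XML predefines only five named entities (`&lt;`, `&gt;`, `&amp;`, `&apos;`, `&quot;`), so the standard-library parser rejects the rest. The helper name `xml_parse_error` and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def xml_parse_error(text: str) -> Optional[str]:
    """Return the parser's error message, or None if text is well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Hypothetical samples: the first uses a named HTML entity that XML does not
# define, the second uses the equivalent numeric character reference.
html_like = "<p>Health&nbsp;Sciences</p>"
xml_like = "<p>Health&#160;Sciences</p>"

print(xml_parse_error(html_like))  # an "undefined entity" message
print(xml_parse_error(xml_like))   # None
```

The numeric reference parses because it needs no entity declaration, which is one reason well-formed XBRL and iXBRL documents do not trigger this error.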

mrx23dot commented 2 years ago

Oh yeah, I forgot to handle that:

import xml.etree.ElementTree

try:
    inst = XbrlParser(cache).parse_instance(url)
except xml.etree.ElementTree.ParseError:
    # The URL points at plain HTML rather than an XBRL/iXBRL document
    print('not ixbrl/xml')

Cheers!
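
When looping over many filing URLs, the same guard could be wrapped so that unparseable documents are collected instead of just printed. This is only a sketch: `parse_all` and `fake_parse` are hypothetical names, and `fake_parse` is a stand-in that fails on `.htm` purely for demonstration (real iXBRL files can also end in `.htm`).

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def parse_all(
    parse: Callable[[str], T], urls: Iterable[str]
) -> Tuple[List[T], List[str]]:
    """Apply parse to each URL; return (parsed results, URLs that were not XML)."""
    parsed: List[T] = []
    skipped: List[str] = []
    for url in urls:
        try:
            parsed.append(parse(url))
        except ET.ParseError:
            # Same condition as in the thread: the document is not XML/iXBRL.
            skipped.append(url)
    return parsed, skipped


def fake_parse(url: str) -> str:
    """Stand-in parser (assumption): raises ParseError for '.htm' URLs."""
    if url.endswith(".htm"):
        raise ET.ParseError("undefined entity: line 15, column 172")
    return "instance:" + url


parsed, skipped = parse_all(fake_parse, ["a.xml", "b.htm"])
print(parsed, skipped)
```

With py-xbrl, the bound method `XbrlParser(cache).parse_instance` from the snippet above could be passed as `parse` in place of the stand-in.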