manusimidt / py-xbrl

Python-based parser for parsing XBRL and iXBRL files
https://py-xbrl.readthedocs.io/en/latest/
GNU General Public License v3.0

Failed namespace-uri parsing #48

Closed: mrx23dot closed this issue 3 years ago

mrx23dot commented 3 years ago

I found a few rare cases where parsing these zipped filings gave "namespace couldn't be found" errors. The library worked great on 2000 other symbols.

See the SEC's response to this at the end. They seem to suggest that the library doesn't follow the 301 permanent redirect, but I don't think that's the case, because the code says: _session.get(url, headers=headers, allow_redirects=True). Or maybe the redirect target is in a special HTTP header (Location).
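Location is a standard HTTP response header, and most clients follow it automatically on a 301. A minimal sketch of that behaviour, using only the Python standard library and a throwaway local server rather than py-xbrl's own session code or the SEC's servers; the handler, paths, and function names are all hypothetical:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

SCHEMA_BODY = b"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'/>"


class RedirectingHandler(BaseHTTPRequestHandler):
    """Stand-in server: the /old path answers 301, the /new path serves a schema."""

    def do_GET(self):
        if self.path == "/old/stpr-2018-01-31.xsd":
            # Same shape as the reported response: 301 plus a Location header.
            self.send_response(301)
            self.send_header("Location", "/new/stpr-2018-01-31.xsd")
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == "/new/stpr-2018-01-31.xsd":
            self.send_response(200)
            self.send_header("Content-Length", str(len(SCHEMA_BODY)))
            self.end_headers()
            self.wfile.write(SCHEMA_BODY)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_following_redirects(url):
    """Fetch url; urllib follows 301 responses by default."""
    # An empty ProxyHandler keeps the request local even if proxy variables are set.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as resp:
        return resp.status, resp.geturl(), resp.read()


def demo():
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch_following_redirects(
            f"http://127.0.0.1:{port}/old/stpr-2018-01-31.xsd"
        )
    finally:
        server.shutdown()
        server.server_close()
```

If the final URL returned by demo() ends in the /new path and the status is 200, the redirect was followed, which would point to the failure being somewhere other than redirect handling.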

symbol LIVX
   https://www.sec.gov/Archives/edgar/data/0001491419/000121390021036869/0001213900-21-036869-xbrl.zip
   LIVX The taxonomy with namespace https://protect2.fireeye.com/v1/url?k=ae6c5932-f1f761c4-ae6cbd84-8681010e5614-13f54919d528bfc9&q=1&e=be10b465-8bbe-4d86-b910-e99fa6de80f7&u=http%3A%2F%2Ffasb.org%2Fus-gaap%2F2021-01-31 could not be found.

symbol REX
   https://www.sec.gov/Archives/edgar/data/0000744187/000093041321001146/0000930413-21-001146-xbrl.zip
   REX The taxonomy with namespace https://protect2.fireeye.com/v1/url?k=dee7a9e9-817c911f-dee74d5f-8681010e5614-259af4ed9b5caf8c&q=1&e=be10b465-8bbe-4d86-b910-e99fa6de80f7&u=http%3A%2F%2Ffasb.org%2Fus-gaap%2F2021-01-31 could not be found.

symbol MILE
   https://www.sec.gov/Archives/edgar/data/0001819035/000121390021027739/0001213900-21-027739-xbrl.zip
   MILE The taxonomy with namespace http://xbrl.sec.gov/stpr/2018-01-31 could not be found.

symbol MOTS
   https://www.sec.gov/Archives/edgar/data/0001686850/000121390021026081/0001213900-21-026081-xbrl.zip
   MOTS The taxonomy with namespace http://xbrl.sec.gov/stpr/2018-01-31 could not be found.
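
The protect2.fireeye.com addresses in the LIVX and REX messages carry a percent-encoded address in their u= query parameter. Assuming these are link-protection wrappers around the original namespace (an assumption, not something the messages confirm), the wrapped value can be recovered with a small helper. The function name is hypothetical:

```python
from urllib.parse import parse_qs, urlparse


def unwrap_protected_url(url):
    """Return the address held in the 'u' query parameter of a
    protect2.fireeye.com link; return any other URL unchanged."""
    parsed = urlparse(url)
    if parsed.hostname != "protect2.fireeye.com":
        return url
    # parse_qs also percent-decodes the value (%3A -> ':', %2F -> '/').
    targets = parse_qs(parsed.query).get("u")
    return targets[0] if targets else url


wrapped = (
    "https://protect2.fireeye.com/v1/url"
    "?k=ae6c5932-f1f761c4-ae6cbd84-8681010e5614-13f54919d528bfc9"
    "&q=1&e=be10b465-8bbe-4d86-b910-e99fa6de80f7"
    "&u=http%3A%2F%2Ffasb.org%2Fus-gaap%2F2021-01-31"
)
print(unwrap_protected_url(wrapped))  # http://fasb.org/us-gaap/2021-01-31
```

The MILE and MOTS namespaces are not wrapped, so the helper returns them as they are.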

SEC's response:

We assume you are aware that the namespace-uri is not the URL, but rather, that the namespace-uri designates a location (URL). So, we suspect that your software is trying to retrieve the taxonomy file http://xbrl.sec.gov/stpr/2018/stpr-2018-01-31.xsd. Please note that URL will return a 301 response header:

HTTP/1.1 301 Moved Permanently
Date: Mon, 19 Jul 2021 19:52:38 GMT
Server: AkamaiGHost
Location: https://xbrl.sec.gov/stpr/2018/stpr-2018-01-31.xsd
Connection: Keep-Alive
Content-Length: 0

Not all software products will interpret this response correctly (although we haven't seen this particular problem in a couple of years).
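
The distinction the SEC draws, that a namespace URI names a taxonomy while a separate URL locates its schema file, can be sketched as a lookup table. This is only an illustration and not py-xbrl's actual API: the table, the exception class, and the function name are hypothetical, and the single entry pairs the stpr namespace from the error messages with the Location target quoted in the SEC's response.

```python
# Hypothetical lookup table: namespace URI -> schema file location.
NAMESPACE_TO_SCHEMA = {
    "http://xbrl.sec.gov/stpr/2018-01-31":
        "https://xbrl.sec.gov/stpr/2018/stpr-2018-01-31.xsd",
}


class TaxonomyNotFound(Exception):
    """Raised when no schema location is known for a namespace URI."""


def resolve_schema_url(namespace):
    """Map a namespace URI to the URL of its schema file."""
    try:
        return NAMESPACE_TO_SCHEMA[namespace]
    except KeyError:
        raise TaxonomyNotFound(
            f"The taxonomy with namespace {namespace} could not be found."
        ) from None


print(resolve_schema_url("http://xbrl.sec.gov/stpr/2018-01-31"))
```

With a table like this, a namespace that has no entry fails with a message of the same shape as the ones reported, regardless of how redirects are handled.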

manusimidt commented 3 years ago

@mrx23dot Both issues #47 and #48 should now be fixed. I will publish a new version of the library in the next few days; I have to fix another issue first.

mrx23dot commented 3 years ago

I will test the changes. This seems to be the best parsing library so far, thank you!

manusimidt commented 2 years ago

@mrx23dot The fixes are now included in the new release V2.0.5.