manusimidt / py-xbrl

Python-based parser for parsing XBRL and iXBRL files
https://py-xbrl.readthedocs.io/en/latest/
GNU General Public License v3.0

Explicitly match protocol #118

Closed by ajmedeio 9 months ago

ajmedeio commented 9 months ago

Hi @manusimidt,

It's a shame that every time we talk, it's about failed filings! I hope you're doing well and that your studies are challenging but rewarding.

I encountered an error when processing the filing with SEC EDGAR accession number 0001010549-18-000409, which contains filenames that start with "http", including the custom taxonomy. Unfortunately, this means the check that decides whether to import a remote file or a local file makes the wrong determination, as you'll see in the PR's changes.
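To illustrate the failure mode, here is a minimal sketch. It is not the PR's actual diff or py-xbrl's real code: the function names `is_remote_loose` and `is_remote_explicit` are hypothetical, and the sketch assumes the original check was a bare prefix test on "http". It shows how a local file name such as `http-20180930.xsd` can be mistaken for a URL, and how matching the full protocol prefix avoids that.

```python
def is_remote_loose(path: str) -> bool:
    # Hypothetical loose check: anything starting with "http" counts as remote.
    return path.startswith("http")


def is_remote_explicit(path: str) -> bool:
    # Explicit check: only a full protocol prefix marks the path as remote.
    return path.startswith(("http://", "https://"))


local_schema = "http-20180930.xsd"  # local file name used by the filing
remote_schema = "https://example.com/taxonomy/schema.xsd"  # placeholder URL

print(is_remote_loose(local_schema))       # True: misclassified as remote
print(is_remote_explicit(local_schema))    # False: treated as a local file
print(is_remote_explicit(remote_schema))   # True
```

Under these assumptions, the explicit variant changes the result only for names that begin with "http" but lack the "://" separator, so ordinary absolute URLs are classified exactly as before.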

ajmedeio commented 9 months ago

Hi @manusimidt,

Wondering if you had a chance to look at this. Lemme know if I can elaborate on anything or help otherwise.

manusimidt commented 9 months ago

Hey, sorry for the late reply. Thank you for your pull request! That's a very strange filing you found there.

They have the following line in their taxonomy extension schema file:

<schema ... targetNamespace="http://http/20180930">

and the following line in their instance file:

<link:schemaRef xlink:href="http-20180930.xsd" xlink:type="simple"/>

Usually the schemaRef and the targetNamespace are either relative or absolute URLs... I'm almost surprised that the changes you suggested will solve the problem. But either way, they look like a reasonable change, so I will test it and implement it.

Thanks for the pull!

ajmedeio commented 9 months ago

I totally agree. Out of millions of filings, this is the only case like this I've seen. It's as if they had a bug on their side while producing it. We'll test again with the new fix, and I'll keep you posted.

Thanks again!