manusimidt / py-xbrl

Python-based parser for parsing XBRL and iXBRL files
https://py-xbrl.readthedocs.io/en/latest/
GNU General Public License v3.0

No support for embedded/inline schemas #124

Open mariuszlipinski opened 7 months ago

mariuszlipinski commented 7 months ago

The library does not work when the presentation and calculation schemas are embedded inline in the main XSD file, e.g. for:

https://www.sec.gov/Archives/edgar/data/789019/000095017024008814/0000950170-24-008814-index.html
https://www.sec.gov/Archives/edgar/data/910606/000095017024016260/0000950170-24-016260-index.html

it will not find/parse taxonomies.

mariuszlipinski commented 2 months ago

Could you please at least suggest how the code could be extended to support this type of XSD? I would be happy to implement it myself and share it, but I find it difficult to find the right place to start.

manusimidt commented 2 months ago

Hey, this is unfortunately a bigger change. Also, not many XBRL documents are affected by this (at least in my experience).

My gut feeling is that the way the parser functions are currently written has to change, specifically this function: https://github.com/manusimidt/py-xbrl/blob/1b60644185363769c56415b8d993acde47e2ca56/xbrl/linkbase.py#L429

Instead of taking a path or URL to a file where the linkbase is stored, the function has to receive a string as input, which it will then parse. To achieve backwards compatibility, I would create a new function like parse_linkbase_from_str() and also use this function internally in the parse_linkbase() function.
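A rough sketch of that split, to make the idea concrete. This is not py-xbrl's actual code: the signatures and return values below are simplified assumptions (the real parse_linkbase() takes different arguments and builds linkbase objects), and the sample XML is made up. The point is only that the string-based function does all the parsing, and the path-based one just reads the file and delegates.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"


def parse_linkbase_from_str(xml_text: str) -> list[str]:
    """Hypothetical string-based entry point.

    Simplified stand-in: returns only the role URI of every extended link
    found in the linkbase instead of building full linkbase objects.
    """
    root = ET.fromstring(xml_text)
    roles = []
    for elem in root.iter():
        # Extended links (presentationLink, calculationLink, ...) are
        # marked with xlink:type="extended".
        if elem.get(f"{{{XLINK_NS}}}type") == "extended":
            roles.append(elem.get(f"{{{XLINK_NS}}}role"))
    return roles


def parse_linkbase(path: str) -> list[str]:
    """Path-based wrapper kept for backwards compatibility.

    Assumption: a local file path only; URL and cache handling are omitted.
    """
    with open(path, encoding="utf-8") as fh:
        return parse_linkbase_from_str(fh.read())


# Made-up linkbase used only for illustration.
SAMPLE_LINKBASE = """
<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
               xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:presentationLink xlink:type="extended"
                         xlink:role="http://example.com/role/Cover"/>
  <link:calculationLink xlink:type="extended"
                        xlink:role="http://example.com/role/BalanceSheet"/>
</link:linkbase>
"""

if __name__ == "__main__":
    print(parse_linkbase_from_str(SAMPLE_LINKBASE))
```

With this shape, existing callers of the path-based function keep working, and the taxonomy module can pass an inline linkbase to the string-based function directly.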

Then in the taxonomy module the code has to distinguish between linkbase imports <link:linkbaseRef .... /> and schema inline linkbases <link:linkbase ... />.
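To illustrate the distinction, here is a minimal standalone sketch using only the standard library. The function name split_linkbases() and the sample schema are hypothetical, not part of py-xbrl, and it assumes both kinds of elements can appear anywhere below the schema root (typically inside xs:annotation/xs:appinfo). It collects the href of every linkbaseRef and serializes every inline linkbase back to a string, so that each one could be handed to a string-based parser.

```python
import xml.etree.ElementTree as ET

LINK_NS = "http://www.xbrl.org/2003/linkbase"
XLINK_NS = "http://www.w3.org/1999/xlink"


def split_linkbases(schema_text: str) -> tuple[list[str], list[str]]:
    """Hypothetical helper: separate referenced and inline linkbases.

    Returns (hrefs of external linkbases, XML strings of inline linkbases).
    """
    root = ET.fromstring(schema_text)

    # <link:linkbaseRef xlink:href="..."/> points to a separate file.
    external = [
        ref.get(f"{{{XLINK_NS}}}href")
        for ref in root.iter(f"{{{LINK_NS}}}linkbaseRef")
    ]

    # <link:linkbase>...</link:linkbase> is embedded in the schema itself;
    # serialize it so it can be parsed on its own later.
    inline = [
        ET.tostring(lb, encoding="unicode").strip()
        for lb in root.iter(f"{{{LINK_NS}}}linkbase")
    ]
    return external, inline


# Made-up schema containing one referenced and one inline linkbase.
SAMPLE_SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:link="http://www.xbrl.org/2003/linkbase"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <xs:annotation>
    <xs:appinfo>
      <link:linkbaseRef xlink:type="simple" xlink:href="example_cal.xml"/>
      <link:linkbase>
        <link:presentationLink xlink:type="extended"
                               xlink:role="http://example.com/role/Cover"/>
      </link:linkbase>
    </xs:appinfo>
  </xs:annotation>
</xs:schema>
"""

if __name__ == "__main__":
    hrefs, embedded = split_linkbases(SAMPLE_SCHEMA)
    print("external:", hrefs)
    print("inline linkbases found:", len(embedded))
```

Note that ElementTree rewrites namespace prefixes when serializing (e.g. ns0 instead of link), which is harmless as long as the downstream parser matches on namespace URIs rather than on prefixes.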

If you can propose a change, I would be happy to merge it. Please just try to comment it as well as possible, especially since this would probably be a bigger change. I currently struggle to find time for maintaining py-xbrl alongside a full-time job and some other projects. I hope to have some time for it during the summer vacation break.