manusimidt / py-xbrl

Python-based parser for parsing XBRL and iXBRL files
https://py-xbrl.readthedocs.io/en/latest/
GNU General Public License v3.0

Parsing Failures for Empty Fact Values and 'nil' Text in XBRL Documents #113

Open rahulsinghal11 opened 1 year ago

rahulsinghal11 commented 1 year ago

Hi, I would like to draw your attention to two issues I encountered while parsing XBRL documents. I have identified a simple fix for each of them.

Firstly, I noticed that when the fact value within an XBRL document is an empty string, parsing fails, e.g. ></ix:nonFraction>. This raises an error when the string is converted to float. To overcome this, I propose a minor addition above line 575 of the instance.py file in the py-xbrl repository:
https://github.com/manusimidt/py-xbrl/blob/03b40e3d64221782a294b15c89be060bce34ad90/xbrl/instance.py#L575
fact_value = '0' if fact_value == '' else fact_value
With this change, the parser handles empty fact values gracefully.
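For illustration, here is a minimal standalone sketch of that first fix. The helper name `normalize_fact_value` is hypothetical and not part of py-xbrl; in the proposal the check is a single line placed before the existing float conversion. Note that mapping an empty value to '0' is the behaviour proposed here, not something the specification prescribes.

```python
def normalize_fact_value(fact_value: str) -> str:
    """Hypothetical helper: map an empty fact value to '0', as proposed above."""
    return '0' if fact_value == '' else fact_value


# Without the check, an empty string cannot be converted to a number.
try:
    float('')
except ValueError as err:
    print(f"unpatched conversion fails: {err}")

# With the check, the empty value is treated as zero; other values pass through.
print(float(normalize_fact_value('')))
print(float(normalize_fact_value('12.5')))
```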

Secondly, I encountered another problem related to text values within the XBRL document. In some cases the string 'nil' is used in place of a number, which causes parsing failures for numerous XBRL documents, e.g. >Nil</ix:nonFraction>. This raises an error when Nil is converted to a number. To address this, I suggest the following change to line 314 of the __init__.py file within the transformations module:
https://github.com/manusimidt/py-xbrl/blob/03b40e3d64221782a294b15c89be060bce34ad90/xbrl/transformations/__init__.py#L314
if arg == 'no' or arg == 'none' or arg == 'nil':
With this adjustment, 'nil' is recognized and handled correctly during parsing.
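The second check can be sketched in isolation as follows. The function name `zero_word_to_number` and the `ZERO_WORDS` set are hypothetical and do not reproduce the library's transformation code. The sketch also assumes the text is lower-cased before the comparison, since the failing example above is written as 'Nil' while the proposed condition compares against 'nil'.

```python
# Words that should be read as the number zero (the third entry is the proposed addition).
ZERO_WORDS = {'no', 'none', 'nil'}


def zero_word_to_number(arg: str) -> str:
    """Hypothetical helper: return '0' for zero-words, otherwise the input unchanged."""
    # Assumption: normalise case and surrounding whitespace before comparing.
    if arg.strip().lower() in ZERO_WORDS:
        return '0'
    return arg


print(zero_word_to_number('Nil'))
print(zero_word_to_number('none'))
print(zero_word_to_number('42'))
```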

I have thoroughly tested these two fixes with various XBRL documents and successfully parsed all of them. Therefore, I believe these changes will greatly improve the reliability and robustness of the py-xbrl library.

I kindly request your review and consideration of these proposed solutions. Your feedback and thoughts on these modifications would be greatly appreciated.

Thank you for your attention and dedication to maintaining the py-xbrl library.

manusimidt commented 1 year ago

Hey @rahulsinghal11, sorry for the late answer. Thank you for the proposal!

According to the XBRL Specification, an ix:nonFraction element must have a value.

That's probably the reason why I never encountered issues with this on submissions from USA and UK.

However, since this is really easy to fix (as you also stated) I am happy to implement it in the next release. Could you attach the XBRL file you encountered issues with (or send it to me via email)? I always find it helpful to test the change on the failing XBRL document.

Thanks!

rahulsinghal11 commented 1 year ago

Hi @manusimidt, thank you for your willingness to implement the fix in the next release. I'm unable to share the exact document due to its confidential nature. However, I have tested the changes on a large number of documents.
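In place of the confidential file, a synthetic fragment along these lines reproduces both cases. This is only a sketch: the fact names, unit and context identifiers are made up, the fragment is not a complete iXBRL document, and a plain `float()` call stands in for the library's numeric conversion rather than invoking py-xbrl itself.

```python
import xml.etree.ElementTree as ET

# Inline XBRL 1.1 namespace.
IX_NS = 'http://www.xbrl.org/2013/inlineXBRL'

# Synthetic fragment with one empty fact, one 'Nil' fact and one ordinary fact.
SAMPLE = f'''<div xmlns:ix="{IX_NS}">
  <ix:nonFraction name="ex:Revenue" unitRef="USD" contextRef="c1"></ix:nonFraction>
  <ix:nonFraction name="ex:Debt" unitRef="USD" contextRef="c1">Nil</ix:nonFraction>
  <ix:nonFraction name="ex:Cash" unitRef="USD" contextRef="c1">125</ix:nonFraction>
</div>'''


def failing_facts(xml_text: str) -> list:
    """Return (name, text) pairs for facts whose text float() cannot convert."""
    root = ET.fromstring(xml_text)
    failures = []
    for elem in root.iter(f'{{{IX_NS}}}nonFraction'):
        text = elem.text or ''  # an empty element has text None
        try:
            float(text)
        except ValueError:
            failures.append((elem.get('name'), text))
    return failures


print(failing_facts(SAMPLE))
```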

These changes are non-disruptive and should not affect existing functionality: they only add checks for inputs that would otherwise have raised an error. The modifications are small and should make the library more stable on such documents.

If there's anything else I can do to assist you or contribute to further improving this library, please don't hesitate to let me know. Thanks