Open manusimidt opened 2 years ago
This would mainly affect the instance module, where both XBRL and iXBRL parsing are affected, since they are implemented as separate functions.
Hi,
Thanks for the idea. I made some code changes: I added a try/except block as you suggested and used Beautiful Soup to fetch the wrongly filed data directly from the XML file.
Here is my code:
try:
    concept: Concept = tax.concepts[tax.name_id_map[concept_name]]
    context: AbstractContext = context_dir[fact_elem.attrib['contextRef'].strip()]
except KeyError:
    print(f"\nAll facts with concept \t{concept_name}\t will be ignored, due to invalid concept definition\n")
    # print(f"this is the path \n", instance_path)
    from bs4 import BeautifulSoup
    # Re-read the raw instance file and look for the offending element directly
    with open(instance_path, "r", encoding="utf-8") as file:
        contents = file.read()
    soup = BeautifulSoup(contents, 'xml')
    tag_list = soup.find_all()
    for tag in tag_list:
        if tag.name == concept_name:
            print("This is the wrongly filed concept :\n" + concept_name + "\nThis is its data:\n" + tag.text)
    # Skip this fact and move on to the next one
    continue
Now I am getting the values in the terminal. But the final result is a DataFrame; how do I append this result to the final DataFrame? Can you please help?
(This is my terminal result; I hope you are able to see it.)
Thanks and regards.
Is there a way to integrate these results for wrongly filed concept names and add them to the "facts"?
There seem to be many changes in the function parameters, so I would rather wait for you to give an update on this.
Thanks
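One possible way to get from printed values to something that can be appended to a table is to collect the raw elements as plain records instead of printing them. The following is only a sketch under assumptions: it uses the standard library's ElementTree rather than Beautiful Soup, the instance document is made up, and the function name collect_raw_facts and the record keys (concept, context_ref, value, valid_concept) are hypothetical, since the thread does not show how the final DataFrame is built.

```python
import xml.etree.ElementTree as ET

# Made-up instance document; element and context names are illustrative only.
INSTANCE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:demo="http://example.com/taxonomy/demo">
  <demo:Revenue contextRef="c1">100</demo:Revenue>
  <demo:NotInTaxonomy contextRef="c1">Yes</demo:NotInTaxonomy>
</xbrl>"""


def collect_raw_facts(instance_xml, concept_name):
    """Return one dict per element whose local name equals concept_name."""
    rows = []
    for elem in ET.fromstring(instance_xml).iter():
        # ElementTree tags look like '{namespace}LocalName'
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == concept_name:
            rows.append({
                "concept": concept_name,
                "context_ref": elem.get("contextRef", "").strip(),
                "value": (elem.text or "").strip(),
                "valid_concept": False,  # marks rows that bypassed the taxonomy
            })
    return rows


rows = collect_raw_facts(INSTANCE, "NotInTaxonomy")
```

A list of dicts like this can be passed to pandas.DataFrame(rows) and joined to an existing frame with pandas.concat, provided the column names are adjusted to match that frame.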
This change is now live in version 2.2.0
This could also apply to context IDs (see #86).
This could also apply to missing or unlocatable taxonomies (#112, #76).
Implement functionality that also allows parsing XBRL reports that violate the XBRL standard. Maybe just issue a warning and continue parsing instead of crashing completely.
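The warn-and-continue behaviour could look roughly like the sketch below. It is not py-xbrl's implementation: the two dictionaries are hypothetical stand-ins for the taxonomy lookup tables, and the strict flag and both function names are assumptions made for illustration.

```python
import warnings

# Hypothetical stand-ins for the taxonomy lookup tables; not py-xbrl's real objects.
name_id_map = {"Revenue": "us-gaap_Revenue"}
concepts = {"us-gaap_Revenue": "Concept(Revenue)"}


def lookup_concept(concept_name, strict=False):
    """Return the concept for concept_name, or None if the taxonomy lacks it."""
    try:
        return concepts[name_id_map[concept_name]]
    except KeyError:
        if strict:
            raise  # keep the old behaviour: fail on the first invalid concept
        warnings.warn(
            f"Concept '{concept_name}' is not defined in the taxonomy; skipping its facts"
        )
        return None


def parse_facts(raw_facts, strict=False):
    """Keep facts with a known concept and skip the rest."""
    parsed = []
    for concept_name, value in raw_facts:
        concept = lookup_concept(concept_name, strict=strict)
        if concept is None:
            continue
        parsed.append((concept, value))
    return parsed
```

With strict=False an invalid concept produces one warning and the remaining facts are still returned; with strict=True the KeyError propagates as before.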
(from discussion:) Hey,
the concepts are defined in the different taxonomy schemas imported by the instance document.
For example, the first submission you provided failed at the concept
"in-ca:WhetherApprovalTakenFromBoardForMaterialContractsorArrangementsorTransactionsWithRelatedParty",
which is prefixed by the xmlns "in-ca". This XML namespace refers to the taxonomy with the namespace "http://www.icai.org/xbrl/taxonomy/2016-03-31/in-ca", which is linked to the schema file located at https://www.mca.gov.in/XBRL/2016/07/26/Taxonomy/CnI/IN-CA/in-ca-2016-03-31.xsd. There you can check that the above-mentioned concept is really not defined.
=> Thus the creator of this filing incorrectly used this non-existent concept, which is why py-xbrl crashes.
The problematic line is the following: https://github.com/manusimidt/py-xbrl/blob/7be61f7dfe19491ef29ca917be0876c4da98284e/xbrl/instance.py#L336
Here I just expect the tax.name_id_map to have the given concept (which it also should according to the XBRL standard).
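The manual check described above (looking in the schema file for the concept) can be sketched in a few lines. This is an illustration only: the schema below is a tiny made-up example, not the in-ca taxonomy, and the helper names defined_concepts and is_defined are hypothetical. The sketch compares local names only and does not verify that the prefix maps to the schema's target namespace.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Tiny made-up schema for illustration; a real check would load the linked .xsd file.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.com/taxonomy/demo">
  <xs:element name="Revenue" id="demo_Revenue"/>
  <xs:element name="Assets" id="demo_Assets"/>
</xs:schema>"""


def defined_concepts(schema_xml):
    """Collect the names of all top-level element declarations in a schema."""
    root = ET.fromstring(schema_xml)
    return {el.get("name") for el in root.findall(f"{XS}element")}


def is_defined(qname, schema_xml):
    """Check a prefixed name such as 'demo:Revenue' against the schema."""
    local_name = qname.split(":", 1)[-1]
    return local_name in defined_concepts(schema_xml)
```

Calling is_defined("demo:Revenue", SCHEMA) finds the declaration, while a name that the schema does not declare is reported as undefined, which is the situation that leads to the KeyError.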
There were several discussions before about "How to treat incorrect XBRL", because many users of py-xbrl just want to get data out of the reports and do not care whether the report could be parsed 100%.
I plan to implement functionality that would allow you to parse submissions that are incorrect (and maybe just issue a warning). But I am not able to work on py-xbrl until mid-July (due to university stuff).
So in the meantime I would suggest just putting a "try-catch" block around the line where it's failing. Like the following (untested):
Originally posted by @manusimidt in https://github.com/manusimidt/py-xbrl/discussions/83#discussioncomment-3020257