Parsing filings with empty imports

ajmedeio commented 10 months ago

Hi @manusimidt,

Been a while, hope you're doing well!

I've started processing some SEC submissions from 2018 and found several hundred filings that have empty imports in their XSD. See this example and ctrl-f <import namespace="" schemaLocation=""/>.

I was hoping we could add an empty-string check in xbrl/taxonomy.py (lines 625-641):

import_elements: List[ET.Element] = root.findall('xsd:import', NAME_SPACES)

for import_element in import_elements:
    import_uri = import_element.attrib['schemaLocation']
    if import_uri is None or import_uri == "":
        continue

Parsing that example filing and running our test suite went smoothly after adding the conditional continue statement.
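
For anyone who wants to reproduce the behaviour outside the library, here is a minimal standalone sketch of the same check. It is not the py-xbrl implementation: the contents of NAME_SPACES, the sample schema, the example.com URL, and the helper name collect_import_uris are assumptions made for illustration. It also uses attrib.get() so that an import with no schemaLocation attribute at all is skipped too, which goes slightly beyond the change proposed above.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumed namespace map; the real NAME_SPACES in py-xbrl may contain more entries.
NAME_SPACES = {'xsd': 'http://www.w3.org/2001/XMLSchema'}

# Hypothetical schema with one usable import and one empty import,
# mimicking the pattern seen in the 2018 filings.
SAMPLE_XSD = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:import namespace="http://example.com/taxonomy"
              schemaLocation="http://example.com/taxonomy.xsd"/>
  <xsd:import namespace="" schemaLocation=""/>
</xsd:schema>
"""


def collect_import_uris(xsd_text: str) -> List[str]:
    """Return the schemaLocation of every import that actually has one."""
    root = ET.fromstring(xsd_text.strip())
    import_elements: List[ET.Element] = root.findall('xsd:import', NAME_SPACES)

    uris: List[str] = []
    for import_element in import_elements:
        # .get() returns None when the attribute is missing instead of raising KeyError
        import_uri = import_element.attrib.get('schemaLocation')
        if not import_uri:
            # Covers both a missing attribute and an empty string
            continue
        uris.append(import_uri)
    return uris


if __name__ == '__main__':
    print(collect_import_uris(SAMPLE_XSD))
```

The empty import is dropped and only the usable schema location is returned, so the caller never tries to resolve an empty URI.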

manusimidt commented 10 months ago

Hey @ajmedeio,

Thanks for the detailed description and the example! Looks reasonable, I will include this change in the next release. I am currently on vacation but will implement it in about two weeks.

ajmedeio commented 10 months ago

Awesome, thank you so much, your work is really appreciated!

ajmedeio commented 9 months ago

Good morning @manusimidt! Thanks again for implementing this so quickly. If you get the chance, could you publish a new release so we can pull in the updates?

Lemme know if I can help in any way.

manusimidt / py-xbrl

Parsing filings with empty imports #115