dgunning / edgartools

Python library for working with SEC Edgar
MIT License
NVDA 10-K "Item 1" does not provide the full item text #23

Closed: gchinna closed this issue 4 months ago

gchinna commented 6 months ago

The NVDA 10-K "Item 1" does not seem to return the full item text. The filing in question is the 10-K filed February 24, 2023 (annual report for the year ending January 29, 2023).

In the filing, the "Item 1" text spans pages 4 to 15. However, the code below returns only part of it: page 4 through the middle of page 11.

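from edgar import Company

# Load the latest NVDA 10-K as a data object, then print its "Item 1" section
tenk = Company("NVDA").get_filings(form="10-K").latest(1).obj()
print(f"NVDA item 1 text:\n{tenk['Item 1']}")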

Is there another way to get the full "Item 1" text correctly?

dgunning commented 6 months ago

Hi,

I have found the cause: unstructured, the library used for parsing the HTML, does not handle inline XBRL tags. I am working on a fix and will add this 10-K to the test cases. I will update you here once it is done.

Thanks
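
For readers hitting the same truncation with their own parsing code, here is a minimal sketch of one possible workaround: removing the `ix:` wrapper tags from the HTML before handing it to a text extractor. This is not the fix that went into the library. The function name `unwrap_inline_xbrl`, the regex, and the sample HTML are all hypothetical and only illustrate the idea.

```python
import re

# Matches an opening or closing tag in the "ix:" namespace, for example
# <ix:nonNumeric name="..."> or </ix:nonNumeric>. Only the tag itself is
# matched, so the text between the tags is preserved.
IX_TAG = re.compile(r"</?ix:[A-Za-z]+\b[^>]*>")


def unwrap_inline_xbrl(html: str) -> str:
    """Remove ix:* wrapper tags while keeping the content they enclose."""
    return IX_TAG.sub("", html)


# Hypothetical fragment resembling inline XBRL markup in a 10-K
sample = (
    "<p>Our fiscal year ended "
    '<ix:nonNumeric name="dei:DocumentPeriodEndDate" contextRef="c1">'
    "January 29, 2023</ix:nonNumeric>.</p>"
)

cleaned = unwrap_inline_xbrl(sample)
print(cleaned)
```

A regex is enough for a sketch, but it assumes attribute values never contain `>` and it keeps everything inside the tags, including hidden content such as an `ix:header` block, which a real implementation would likely drop with a proper HTML parser.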

dgunning commented 4 months ago

Sorry for the long wait; the fix required a lot of rework. This is fixed in 2.10.1.
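
To confirm that an environment has a release containing the fix, a small check like the one below may help. It is only a sketch: it assumes the distribution is installed under the name `edgartools` and that the version is a plain dotted number such as `2.10.1`. The helpers `parse_version` and `has_fix` are hypothetical names, not part of the library.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '2.10.1' into (2, 10, 1)."""
    return tuple(int(part) for part in text.split("."))


def has_fix(installed: str, minimum: str = "2.10.1") -> bool:
    """True if `installed` is at least `minimum`, compared numerically."""
    return parse_version(installed) >= parse_version(minimum)


try:
    # Assumption: the package is installed under the name "edgartools"
    installed = version("edgartools")
    print(installed, "includes the fix:", has_fix(installed))
except PackageNotFoundError:
    print("edgartools is not installed in this environment")
except ValueError:
    # Raised by parse_version for versions like '2.10.1rc1'
    print("installed version is not a plain dotted number")
```

Comparing tuples of integers rather than raw strings matters here, because as strings "2.9.0" would sort after "2.10.1".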