dgunning / edgartools

Python library for working with SEC Edgar
MIT License

How do I extract a specific section of a form? #66

Open gopi-tookitaki opened 1 week ago

gopi-tookitaki commented 1 week ago

For example, in a Form 10-K, how do I extract only section 5?

dgunning commented 1 week ago

Convert the 10-K filing into a TenK data object, then index it by item name:


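# `filing` is a 10-K Filing object; obj() converts it into a TenK data object
tenk = filing.obj()
# Look up a section by its item name
tenk["Item 5"]

baskargopinath commented 1 week ago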

Thanks. What are you using to extract the sections?

dgunning commented 1 week ago

It's pretty complicated code. I can explain, but it would take a blog article.
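
For readers who only want the general idea, here is a naive sketch of splitting plain filing text on its "Item N" headings. It is not edgartools' implementation, which the maintainer describes as complicated; `split_by_item` and `ITEM_HEADING` are hypothetical names, and the sketch assumes the filing is already plain text with each heading at the start of a line.

```python
import re

# Hypothetical pattern, not part of edgartools: an "Item N" heading at the
# start of a line, e.g. "Item 5.", "ITEM 7A." or "Item 5.02".
ITEM_HEADING = re.compile(
    r"^[ \t]*Item\s+(\d+[A-Z]?(?:\.\d+)?)\b",
    re.IGNORECASE | re.MULTILINE,
)


def split_by_item(text: str) -> dict:
    """Split plain filing text into sections keyed by "Item N"."""
    matches = list(ITEM_HEADING.finditer(text))
    sections = {}
    # Pair each heading with the one after it; the last heading runs to the end.
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following else len(text)
        key = f"Item {current.group(1).upper()}"
        # If a heading repeats (e.g. in a table of contents), the later one wins.
        sections[key] = text[current.start():end].strip()
    return sections
```

Real filings are usually HTML with tables of contents, inconsistent heading text, and cross-references that mention item numbers, so a production approach needs far more care than this.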

gopi-tookitaki commented 1 week ago

@dgunning can you describe the high-level approach, i.e. how you chunk the filing into sections?

gopi-tookitaki commented 1 week ago

Also, if the document's item is 5.02, tenk["Item 5"] doesn't work; it has to be exactly 5.02. Is there any workaround for this? For example, if I want to take 100 companies and extract all of their section 5s from the 10-K.

dgunning commented 6 days ago

Try

# Keep every item name that begins with "Item 5" (this also matches "Item 5.02")
item_5s = [item
           for item in tenk.items
           if item.startswith("Item 5")]
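
To run this over many companies, the same filter can be wrapped in a helper. The sketch below assumes only what this thread shows: `tenk.items` yields item-name strings and `tenk[name]` returns that section. `matching_items` and `FakeTenK` are hypothetical; `FakeTenK` is a stand-in used purely so the example runs without the library. Unlike a plain `startswith`, the pattern accepts "Item 5" and "Item 5.02" but rejects "Item 50" and "Item 5A".

```python
import re


def matching_items(tenk, number: str) -> dict:
    """Return {item name: section} for "Item <number>" and sub-items like 5.02."""
    pattern = re.compile(
        rf"^Item\s+{re.escape(number)}(?:\.\d+)?(?![0-9A-Za-z])",
        re.IGNORECASE,
    )
    return {name: tenk[name] for name in tenk.items if pattern.match(name.strip())}


class FakeTenK:
    """Stand-in for demonstration only; mimics the two calls used in this thread."""

    def __init__(self, sections):
        self._sections = sections
        self.items = list(sections)

    def __getitem__(self, name):
        return self._sections[name]


# Placeholder data; with the real library each value would come from filing.obj()
filings = {
    "AAA": FakeTenK({"Item 1": "text of item 1",
                     "Item 5.02": "text of item 5.02",
                     "Item 50": "not a real item"}),
    "BBB": FakeTenK({"Item 5": "text of item 5",
                     "Item 5A": "text of item 5A"}),
}

item_5s = {ticker: matching_items(tenk, "5") for ticker, tenk in filings.items()}
```

Drop the negative lookahead if lettered items such as "Item 5A" should be included as well.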

baskargopinath commented 52 minutes ago

@dgunning in Item 7 (MD&A) of the 10-K there is a section called Revenue Recognition, but I can't seem to extract it. Is that possible with your library?
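
Nothing in this thread confirms that the library exposes sub-headings inside an item. As a possible workaround, assuming the Item 7 section is available as plain text with each sub-heading on its own line, a heuristic like the one below could pull out one subsection. `extract_subsection` is a hypothetical helper, and the rule for detecting the next heading is a guess that will need tuning for each filing.

```python
from typing import Optional


def extract_subsection(section_text: str, heading: str) -> Optional[str]:
    """Return the text under `heading` up to the next heading-like line, or None."""
    lines = section_text.splitlines()
    # Find the line that consists of the heading alone (case-insensitive)
    start = next(
        (i for i, line in enumerate(lines)
         if line.strip().lower() == heading.lower()),
        None,
    )
    if start is None:
        return None
    body = []
    for line in lines[start + 1:]:
        stripped = line.strip()
        # Heuristic: once some body text exists, a short line without closing
        # punctuation is treated as the next sub-heading.
        if (body and stripped and len(stripped) < 60
                and not stripped.endswith((".", ":", ";", ","))):
            break
        body.append(line)
    return "\n".join(body).strip()
```

For example, `extract_subsection(item_7_text, "Revenue recognition")` returns the paragraphs under that heading, or `None` when the heading is not found; `item_7_text` here is a placeholder for the plain text of Item 7.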