nlpaueb / edgar-crawler

The only open-source toolkit that can download EDGAR financial reports and extract textual data from specific item sections into nice and clean JSON files.
GNU General Public License v3.0
294 stars 80 forks

Any plans for parsing the tables into a readable format? #16

Open wj210 opened 1 year ago

wj210 commented 1 year ago

Hi, thanks for this useful codebase!

Do you have any plans or ideas for how the tables could be parsed into a readable format, say using pandas?

eloukas commented 11 months ago

Hi @wj210. Early in development we had such a method, but its results were unstable because of the wide variety of table structures in the filings, so we decided to remove it. However, there is a recent pull request that adds this same functionality. You may want to check it out and perhaps integrate it into your code: https://github.com/nlpaueb/edgar-crawler/pull/17 We will review it too and may merge it if the results are good enough.
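For anyone who wants to experiment in the meantime, here is a minimal sketch of pulling table rows out of a filing's raw HTML using only the Python standard library. It is not the method from the pull request above, nor the one that was removed from edgar-crawler. The names `TableExtractor` and `extract_tables` are made up for illustration, and the sketch assumes simple, non-nested tables with no `rowspan`/`colspan` handling, which is exactly where real filings tend to get messy.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Hypothetical helper: collect each <table> as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished tables
        self._table = None  # rows of the table currently being read
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []  # assumption: nested tables are not supported
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace (including non-breaking spaces).
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if any(self._row):  # skip rows whose cells are all empty
                self._table.append(self._row)
            self._row = None
        elif tag == "table" and self._table is not None:
            self.tables.append(self._table)
            self._table = self._row = self._cell = None


def extract_tables(html):
    """Return every table found in `html` as a list of rows (lists of strings)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


# Made-up sample input, not taken from a real filing.
sample = """
<table>
  <tr><th>Year</th><th>Revenue</th></tr>
  <tr><td>2021</td><td>$ 1,200</td></tr>
  <tr><td>2022</td><td>$ 1,500</td></tr>
</table>
"""

for row in extract_tables(sample)[0]:
    print(row)
```

If the first row of a table really is its header, the rows can be handed to pandas with something like `pandas.DataFrame(rows[1:], columns=rows[0])`. Many tables in filings will not be that regular (multi-row headers, spacer columns, split currency symbols), so treat this as a starting point rather than a robust parser.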