LexPredict / openedgar

OpenEDGAR (openedgar.io)
MIT License
What's expected as outcome of processing? #13

Open abolotnov opened 5 years ago

abolotnov commented 5 years ago

Hi,

I've installed it in local mode per the instructions, downloaded the filing index for 2018, and ran process_all_filing_index(year=2018, form_type_list=["10-Q"]).

Celery picked up the tasks and, after some time, I ended up with a lot of txt content (it looks like a mixture of plain text and HTML) in the edgar/data folder, plus records in _companyinfo, _filing, and _filingdata. But there is no actual content broken down into sections or individual pieces. Is this the expected outcome? Do I need to do additional processing to extract the actual content?

Also, the Django app: is this just a skeleton that isn't supposed to do anything other than user registration and login/logout?

Thanks!

johllmichael commented 5 years ago

Last I checked, this script was written for 10-Ks. Try "10-K" instead. I know C++, but I am learning Python so I can figure out how to get this to work. Do you use Visual Studio for coding?

abolotnov commented 5 years ago

I gave up on this one. It does work, but I don't know how to validate that it completed everything properly, because I don't understand what its outcome should ultimately be. Tika keeps dying with out-of-memory errors regardless of the configurations I tried, including large instances. Besides, it looks like the developers have abandoned the project.

jcrben commented 4 years ago

@abolotnov Did you find anything better?