yya518 / FinBERT

A Pretrained BERT Model for Financial Communications. https://arxiv.org/abs/2006.08097
Apache License 2.0
560 stars, 128 forks

script to clean and preprocess SEC filings? #9

Closed by vr25 4 years ago

vr25 commented 4 years ago

Hi,

Can you please explain the pre-processing steps for the data, given that the company filings contain tables, etc.?

Thanks!

yya518 commented 4 years ago

We only use Section 1, Section 1A, and Section 7 from the company filings. Those sections do not contain tables.
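The reply does not include a script. As a rough sketch only, and not the authors' actual pipeline, the selection step might look like the following. It assumes the filing is a 10-K that has already been converted to plain text and that its sections appear under headings such as "Item 1.", "Item 1A.", and "Item 7." at the start of a line. The names `ITEM_HEADING` and `extract_items` are hypothetical, and the heading pattern is a guess that real filings will often violate.

```python
import re

# Assumption: each section starts on a line of the form "Item <number>[letter]",
# e.g. "Item 1.", "ITEM 1A.", "Item 7:". Real filings vary and may need a looser pattern.
ITEM_HEADING = re.compile(
    r"^[ \t]*item[ \t]+(\d{1,2}[a-z]?)\b[.:]?",
    re.IGNORECASE | re.MULTILINE,
)


def extract_items(filing_text, wanted=("1", "1A", "7")):
    """Return {item_id: section_text} for the wanted items of a plain-text filing."""
    # Record every heading as (normalized item id, character offset).
    headings = [
        (m.group(1).upper(), m.start())
        for m in ITEM_HEADING.finditer(filing_text)
    ]

    sections = {}
    for idx, (item_id, start) in enumerate(headings):
        if item_id not in wanted:
            continue
        # A section runs until the next heading of any item, or the end of the text.
        end = headings[idx + 1][1] if idx + 1 < len(headings) else len(filing_text)
        body = filing_text[start:end].strip()
        # The table of contents usually repeats each heading with no body,
        # so keep the longest candidate seen for each item.
        if len(body) > len(sections.get(item_id, "")):
            sections[item_id] = body
    return sections
```

Calling `extract_items(text)` on the text of one filing returns a dictionary keyed by `"1"`, `"1A"`, and `"7"` for whichever of those headings were found. Keeping the longest match per item is a heuristic for skipping table-of-contents entries; it does not strip any tables that happen to appear inside a selected section.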