alexanderpei / DeepEarnings

prevent lookahead bias #1

Open zohaad opened 4 years ago

zohaad commented 4 years ago

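In trn_NetCompustat.py, you need to pass shuffle=False to sklearn.model_selection.train_test_split. By default it shuffles the rows before splitting, so later observations can end up in the training set.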
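As a rough illustration of the suggestion, here is a minimal sketch on toy data. The arrays are made up for the example and are not taken from trn_NetCompustat.py; the only assumption is that the rows are already sorted oldest first.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 8 samples, assumed to be sorted by time, oldest first.
X = np.arange(8).reshape(-1, 1)
y = np.arange(8)

# shuffle=False preserves row order: the first 75% of rows become the
# training set and the last 25% become the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=False
)

print(y_train)  # [0 1 2 3 4 5]
print(y_test)   # [6 7]
```

This only avoids lookahead if the row order is chronological across the whole data set.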

alexanderpei commented 4 years ago

Hey, thank you for the suggestion. Actually, removing shuffling alone won't be enough: each company is loaded into the data set one after the other, so an unshuffled split would divide the data by company rather than by date. I'll look into it and let you know.

zohaad commented 4 years ago

You also need to add a date < filingDate restriction to the split
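One way to read this, combined with the point about companies being loaded one after another, is to split on a date cutoff instead of on row position. The sketch below is only an illustration under stated assumptions: the column names (ticker, date, feature), the toy values, and the cutoff are all hypothetical and do not come from the repository, and the cutoff stands in for whatever filing date the restriction refers to.

```python
import pandas as pd

# Hypothetical layout: rows are stacked company by company, so row order
# is NOT chronological across the whole frame.
df = pd.DataFrame({
    "ticker": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "date": pd.to_datetime([
        "2015-03-31", "2016-03-31", "2017-03-31",
        "2015-06-30", "2016-06-30", "2017-06-30",
    ]),
    "feature": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Assumed cutoff (hypothetical): everything strictly before it is training
# data, everything on or after it is test data.
cutoff = pd.Timestamp("2017-01-01")

train = df[df["date"] < cutoff]
test = df[df["date"] >= cutoff]

# Sanity check: no training row is dated on or after any test row.
assert train["date"].max() < test["date"].min()

print(len(train), len(test))  # 4 2
```

Because the split is by date rather than by position, both companies contribute rows to the training and test sets, and no training row postdates a test row.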