alexwalterbos / nlp_fake_news

Group repository for the IN4325 NLP project group NLP_Fake_News

Chained feature generation #15

Closed alexwalterbos closed 6 years ago

alexwalterbos commented 6 years ago

Updated main.py: I've separated reading the CLI arguments from the actual preprocessing; the parsed values are now passed in as function arguments. I've also created helper functions for the test and train set sizes, so we can cleanly change the sampling size without having to change a lot of hardcoded values.
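
A minimal sketch of this kind of separation, for illustration only. The flag names (`--sample-size`, `--test-fraction`), the function names, and the placeholder data are all assumptions and may not match what main.py actually does:

```python
import argparse


def parse_args(argv=None):
    """Read CLI arguments only; no preprocessing happens here."""
    # Flag names and defaults are hypothetical.
    parser = argparse.ArgumentParser(description="Preprocess the data set")
    parser.add_argument("--sample-size", type=int, default=1000)
    parser.add_argument("--test-fraction", type=float, default=0.2)
    return parser.parse_args(argv)


def get_test_size(sample_size, test_fraction):
    """Number of sampled rows reserved for the test set."""
    return int(round(sample_size * test_fraction))


def get_train_size(sample_size, test_fraction):
    """Number of sampled rows left for the training set."""
    return sample_size - get_test_size(sample_size, test_fraction)


def preprocess(rows, sample_size, test_fraction):
    """Take the first `sample_size` rows and split them into train and test.

    The function receives plain values, so it can be called from tests or
    a notebook without going through the command line.
    """
    sample = rows[:sample_size]
    n_train = get_train_size(len(sample), test_fraction)
    return sample[:n_train], sample[n_train:]


def main(argv=None):
    args = parse_args(argv)
    rows = list(range(5000))  # placeholder standing in for the real data
    return preprocess(rows, args.sample_size, args.test_fraction)
```

With this layout, changing the sampling size only means passing a different `--sample-size`; the split sizes follow from the two helpers instead of hardcoded numbers.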

I've removed the .pkl files from the repo, along with the write logic in the features; the pickled files cannot be used anymore because of the sampling logic. We could replace them with a single .pkl that stores the data after the features have been generated, but I haven't done that yet.
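
The single-file idea could look roughly like the sketch below. It is not implemented in this PR; `load_or_generate`, `cache_path`, and `generate` are hypothetical names, and the cache would only be valid as long as the same sample is used:

```python
import os
import pickle


def load_or_generate(cache_path, generate):
    """Return cached feature data if present, otherwise build and store it.

    `generate` is any zero-argument callable that returns the data after
    all features have been generated (hypothetical interface).
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    data = generate()
    with open(cache_path, "wb") as f:
        pickle.dump(data, f)
    return data
```

Because the features depend on the sample, the file name would probably have to encode the sample size (and the random seed, if there is one), or the file would have to be deleted whenever the sampling changes.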

@TomBrunner @MichielvdBerg please look this through; I've had to change some logic here and there to make the data types match up.

MichielvdBerg commented 6 years ago

Looks good