Open diana-xie opened 4 years ago
Hi Diana,
Data preprocessing can be tedious and complicated, and it differs from one application to another. We are glad to share some insights; feel free to choose whatever works best for yours.
For text preprocessing, as described in our paper, we basically:
For timeseries transformation,
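As a rough illustration only, and not necessarily the exact procedure used in the paper: assuming the "GAF" in GAFMTF stands for Gramian Angular Field, a single 1-D time series can be turned into a 2-D array along the following lines. The function name `gramian_angular_field` and the toy input are hypothetical, and the Markov Transition Field part is not covered here.

```python
import numpy as np

def gramian_angular_field(series):
    """Sketch of a Gramian Angular Summation Field for one 1-D series.

    Assumption: this follows the commonly described GASF recipe, which may
    differ from the preprocessing actually used for the repository's data.
    """
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # A constant series carries no range information; map it to zeros.
        scaled = np.zeros_like(x)
    else:
        # Rescale to [-1, 1] so that arccos is defined for every value.
        scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    # Guard against tiny floating-point overshoot outside [-1, 1].
    scaled = np.clip(scaled, -1.0, 1.0)
    # Polar encoding: each value becomes an angle.
    phi = np.arccos(scaled)
    # Entry (i, j) is cos(phi_i + phi_j), giving an n x n symmetric matrix.
    return np.cos(phi[:, None] + phi[None, :])

# Toy example with made-up values (not Botometer data).
gaf = gramian_angular_field([1.0, 2.0, 3.0, 4.0])
```

An array like this can then be written to disk with `np.save`, which produces a `.npy` file; the file names and array shapes the pipeline expects are not stated here, so they would need to be checked against the repository's data loader.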
Best, Guanyi
Hi,
Thanks so much for posting the code to your paper. I'm interested in training the model on my own dataset and have downloaded the Botometer datasets. However, I'm not sure how to preprocess the data so that the .npy files in ".../Data/" are ready to feed into the pipeline.
Would you have some code or examples showing how objects such as the WordEmb and GAFMTF are produced from the Botometer CSVs? Or could you describe what the table used to generate these objects would look like?
Thanks so much!
Best, Diana