Create a completely preprocessed document file

Now that our preprocessing steps (lemmatization, punctuation removal, etc.) are implemented, we need to apply them to all of our input data. The code for this is simply data['reviewText'] = data['reviewText'].apply(<preprocessor_function>). However, this takes a lot of memory and a long time, so it cannot be safely done in Colab, which might time out before it completes. This will probably need to be done locally, and the results can then be saved and uploaded (data.to_pickle?).
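A minimal sketch of what the local run could look like, assuming pandas is installed and the reviews are already loaded into a DataFrame with a reviewText column. The preprocess function here is a hypothetical stand-in (it only lowercases and collapses whitespace), not the project's actual lemmatizing preprocessor, and the output file name is also just a placeholder.

```python
import os
import tempfile

import pandas as pd


def preprocess(text: str) -> str:
    """Hypothetical stand-in for the real preprocessor (lemmatization, etc.).

    It only lowercases the text and collapses runs of whitespace.
    """
    return " ".join(text.lower().split())


def preprocess_and_save(data: pd.DataFrame, out_path: str) -> pd.DataFrame:
    """Apply the preprocessor to every review and pickle the result."""
    data = data.copy()  # leave the caller's DataFrame untouched
    data["reviewText"] = data["reviewText"].apply(preprocess)
    data.to_pickle(out_path)  # can be reloaded later with pd.read_pickle
    return data


if __name__ == "__main__":
    # Tiny made-up sample; the real input would be the full review dataset.
    sample = pd.DataFrame({"reviewText": ["Great  Product", "Would NOT buy again"]})
    out_path = os.path.join(tempfile.mkdtemp(), "preprocessed_reviews.pkl")
    preprocess_and_save(sample, out_path)
    print(pd.read_pickle(out_path)["reviewText"].tolist())
```

Once the pickle file is uploaded, it could be read back in Colab with pd.read_pickle so that the expensive apply step does not have to be repeated there.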

mattfredericksen / CSCE-4205-ML-Project #4