Port from Python 2 to Python 3.

Changes:

There was a single non-ASCII Unicode character (the accented i in "naïve") in the list of negative words. I replaced this character with the corresponding escape sequence. To parse the escape sequence properly, this file is now opened with codecs.open instead of the built-in open.
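
For reviewers unfamiliar with this approach, here is a minimal sketch of reading a word list that stores the character as an escape sequence. The function name `load_word_list`, the `\u00ef` form of the escape, and the use of the `unicode_escape` codec are assumptions for illustration; the actual code in this pull request may differ.

```python
import codecs


def load_word_list(path):
    """Read one word per line, decoding escapes such as \\u00ef.

    Hypothetical helper: assumes the file is plain ASCII apart from
    backslash escape sequences, which the 'unicode_escape' codec decodes.
    """
    with codecs.open(path, "r", encoding="unicode_escape") as handle:
        text = handle.read()
    # Drop blank lines and surrounding whitespace.
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Reading the whole file at once avoids an escape sequence being split across read chunks, which matters little for a small word list but keeps the sketch simple.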

Using the updated pickle and scikit-learn libraries caused quite a hassle:
- Classification models created with scikit-learn versions prior to 0.18.1 did not embed the scikit-learn version in the model. Since 0.18.1 does embed its version number, loading the old model produced a number of warnings about loading an outdated model.
- I tried many approaches, but could not get Python 3's pickle to deserialize the object that was serialized using Python 2's cPickle.
In response to these combined problems, I decided to retrain the classification model using the original dataset. Since this required generating dependency parses for each request in the dataset, I chose to dump the annotated dataset, with dependency parses, as a JSON dictionary in the format expected by /scripts/train_model.py. See the README for information about how to download these datasets (they are too large for version control).
The retrained model is included in this pull request. Since the JSON-formatted training datasets I have provided are so large, retraining the model will require a significant amount of memory.
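
For anyone who wants to reproduce the loading problem, here is a rough sketch of the kind of attempt involved. The helper name `try_load_py2_pickle` is hypothetical, and I am only assuming that `encoding="latin1"` (the option the pickle documentation suggests for Python 2 pickles that contain NumPy arrays) is representative of the approaches tried. Whether it succeeds depends on the pickled object and the installed library versions.

```python
import pickle


def try_load_py2_pickle(path):
    """Attempt to load a pickle written by Python 2; return None on failure.

    Hypothetical helper for illustration only.
    """
    try:
        with open(path, "rb") as handle:
            # 'latin1' maps Python 2 byte strings to str without decode errors.
            return pickle.load(handle, encoding="latin1")
    except (pickle.UnpicklingError, UnicodeDecodeError, AttributeError,
            ImportError, EOFError) as error:
        print("Could not load {}: {!r}".format(path, error))
        return None
```

A `None` result here is the point at which retraining from the original dataset becomes the practical option.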

I have created an additional file, corpora/download.py, which automates downloading and extracting the JSON-formatted training data.
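
As a rough illustration of what a download-and-extract step looks like, here is a minimal sketch using only the standard library. It is not the contents of corpora/download.py: the function name `download_and_extract`, the archive format (anything `shutil.unpack_archive` recognizes from the file extension, such as .zip or .tar.gz), and the URL passed in are all assumptions.

```python
import os
import shutil
import urllib.request


def download_and_extract(url, dest_dir):
    """Download an archive from url and unpack it into dest_dir.

    Hypothetical sketch: the archive type is inferred from the file
    extension at the end of the URL.
    """
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive_path)
    shutil.unpack_archive(archive_path, dest_dir)
    # Remove the archive once its contents have been extracted.
    os.remove(archive_path)
    return sorted(os.listdir(dest_dir))
```

The function returns the names of the entries now present in the destination directory, so a caller can check that the expected JSON files arrived.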