Python, Keras, scikit-learn, Matplotlib: Machine learning research project on classifying the intent behind tech support emails in order to enable automatic follow-up.
More meaningful alternatives to the 'Bag of Words' approach:
I will probably implement an improvement on raw occurrence counts, such as TF-IDF. I would also like to experiment with phrase embeddings in KNNs, although I have not found much advice on this so far and need to do more research. I would welcome your advice on this if you have any suggestions.
Probably TF-IDF, but word embeddings would be ideal.
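
To make the TF-IDF option concrete, here is a minimal sketch of how TF-IDF weighting differs from plain counts, and how the weighted vectors could feed a nearest-neighbour vote. It assumes that "KNN" above means k-nearest neighbours. The example emails, the labels ("account", "billing") and all function names are hypothetical and invented for illustration; they are not taken from the project. The sketch uses only the standard library so the weighting is visible; in practice scikit-learn's TfidfVectorizer and KNeighborsClassifier would likely be the more convenient route.

```python
import math
from collections import Counter

# Hypothetical toy emails and intent labels, invented for illustration only.
TRAIN = [
    ("I forgot my password and cannot log in", "account"),
    ("please reset my password", "account"),
    ("the invoice shows the wrong amount", "billing"),
    ("I was charged twice on my invoice", "billing"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every training term.

    Uses ln((1 + n) / (1 + df)) + 1, the smoothed form that scikit-learn
    applies by default, so rare terms get a higher weight than common ones.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """L2-normalised TF-IDF weights; terms unseen in training are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity; both vectors are unit length, so a dot product suffices."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(text, train_vectors, labels, idf, k=3):
    """Majority vote among the k training emails most similar to the query."""
    query = tfidf_vector(tokenize(text), idf)
    ranked = sorted(
        zip(train_vectors, labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


docs = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(docs)
train_vectors = [tfidf_vector(doc, idf) for doc in docs]

print(knn_predict("how do I reset my password", train_vectors, labels, idf))
```

The same nearest-neighbour vote would carry over to embeddings: only tfidf_vector would need to be replaced by a function that maps an email to a dense vector, for example an average of pretrained word vectors. Whether that actually beats TF-IDF on short support emails is something to test rather than assume.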