tanussingh / Big-Data-Management-Analytics-Project

Final Project for CS 6350.001 - Large Scale Data Collection and preprocessing in Spark

Figure out how to do Doc2Vec with either Gensim or Spacy #7

Open ishansharma opened 5 years ago

ishansharma commented 5 years ago

We will be using Spacy for NER already. If it can do Doc2Vec as well, that should save some resources.

ishansharma commented 5 years ago

Libraries:

  1. Spacy
  2. Gensim

In my experience, Gensim is much easier to work with. I haven't used Spacy for this purpose, but it is worth a look if it saves resources.
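For reference, a minimal sketch of the Gensim route, assuming Gensim is installed. The `tokenize` and `train_doc2vec` helpers, the regex tokenizer, and the hyperparameter values are placeholders for illustration, not something we have settled on for the project.

```python
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train_doc2vec(texts, vector_size=50, epochs=40):
    """Train a small Doc2Vec model on a list of strings.

    Each document is tagged with its position in `texts`.
    The default hyperparameters are illustrative assumptions only.
    """
    # Imported inside the function so tokenize() stays usable without Gensim installed.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    corpus = [
        TaggedDocument(words=tokenize(text), tags=[index])
        for index, text in enumerate(texts)
    ]
    # min_count=1 keeps every word, which only makes sense for tiny test corpora.
    model = Doc2Vec(vector_size=vector_size, min_count=1, epochs=epochs)
    model.build_vocab(corpus)
    model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)
    return model
```

With a trained model, `model.infer_vector(tokenize(new_text))` returns an embedding of length `vector_size` for a document the model has not seen. Whether Spacy's document vectors are good enough to replace this step is still the open question in this issue.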