tanussingh / Big-Data-Management-Analytics-Project

Final Project for CS 6350.001 - Large Scale Data Collection and preprocessing in Spark

Figure out how to store UDPipe output #2

Open ishansharma opened 5 years ago

ishansharma commented 5 years ago

Details to be figured out:

  1. Do we feed the entire article to UDPipe, or split it into sentences first?
  2. The output is a table. We should store it as a list or a dictionary so it is easy to search.
  3. Once the output is in the DataFrame, how do we compare?
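
For point 2, here is a rough sketch of one option, assuming UDPipe is emitting CoNLL-U (ten tab-separated columns per token, blank line between sentences, `#` lines as comments). The helper name `parse_conllu`, the lowercase field names, and the sample text are made up for illustration and are not from this repo. It ignores edge cases such as multiword-token ranges.

```python
# CoNLL-U column names, in the order defined by the format.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text):
    """Parse CoNLL-U text into a list of sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
        elif line.startswith("#"):
            # Skip sentence-level comments such as "# text = ...".
            continue
        else:
            values = line.split("\t")
            # Keep only well-formed rows with all ten columns.
            if len(values) == len(CONLLU_FIELDS):
                current.append(dict(zip(CONLLU_FIELDS, values)))
    if current:
        sentences.append(current)
    return sentences


# Hand-written sample for illustration only (not real UDPipe output).
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n"
    "\n"
)

sentences = parse_conllu(SAMPLE)
lemmas = [tok["lemma"] for tok in sentences[0]]
```

A list of dicts per sentence keeps each token searchable by column name, and it should flatten into DataFrame rows fairly easily if we add a sentence index to each dict.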