kwhitehall / scored


Started work on NLP Stuff. #28

Open elaksana opened 8 years ago

elaksana commented 8 years ago

  1. Performed the first two NLP tasks with Stanford's CoreNLP library.
  2. Top-N word count with stemming, case normalization, etc. is complete.
  3. Finding datasets works, though the heuristic is not very good.
  4. There are some issues with CoreNLP timing out on the works cited page.
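
For readers following the thread, item 2 could be sketched roughly as below. This is a minimal illustration, not the code in the PR: the function names, the `SUFFIXES` list, and the crude suffix-stripping rule are all assumptions standing in for whatever stemmer the PR actually uses.

```python
import re
from collections import Counter

# Hypothetical suffix list: a crude stand-in for a real stemmer
# (e.g. Porter's); it is NOT what the PR is known to use.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def top_n_words(text, n=10):
    """Return the n most frequent stems in text as (stem, count) pairs."""
    # Lowercasing first makes "Model" and "MODEL" count as the same word.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(crude_stem(token) for token in tokens)
    return counts.most_common(n)
```

Calling `top_n_words(page_text, 20)` on the extracted text of a paper would give the 20 most frequent stems. A proper stemmer would merge more word forms than this suffix rule does.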

chrismattmann commented 8 years ago

how are you doing NLP? Are you using Tika CoreNLP with Tika-Python?

elaksana commented 8 years ago

No, we're using a Python wrapper over Stanford's CoreNLP: https://github.com/dasmith/stanford-corenlp-python

chrismattmann commented 8 years ago

darn, that's too bad. would be a great opportunity for coordination.

chrismattmann commented 8 years ago

cc @kwhitehall

elaksana commented 8 years ago

What do you think, Kim?

chrismattmann commented 8 years ago

please note too that the Python library referenced above is GPLv2, rather than permissively licensed like Tika.

kwhitehall commented 8 years ago

Yip, kool feedback @chrismattmann. @elaksana, let's take a look at the PR, then we can go from there.