FYI: this will clone and build Parsey McParseface in Docker, and compiling can take more than 60 minutes.
If your intention is simply to tokenize sentences, you may get a lot of mileage out of NLTK.
http://www.nltk.org/
After faffing around for weeks getting this to compile, I think a smarter way forward is to spit out a binary and include it in this repo, as opposed to building all the sources.
If someone can help script this download, pegged to a git commit, I'd welcome it as a PR.
https://docs.docker.com/engine/installation/
make start
make rebuild-all
This builds a version of Parsey McParseface with patches that expose a protocol buffers API on http://0.0.0.0:9000
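Since the build is long, it can help to confirm that something is actually listening on port 9000 before pointing a client at it. The following is a minimal sketch using only the Python standard library; the helper name `is_port_open` is hypothetical, and `localhost` assumes the container's port is published to the host machine.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9000 is the port named above; "localhost" is an assumption about
    # how the container port is mapped on your machine.
    print("parsey reachable:", is_port_open("localhost", 9000))
```

This only checks that the TCP port accepts connections. It does not verify that the parser itself has finished loading its models.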
syntax = "proto3";

package cali.nlp;

import "syntaxnet/sentence.proto";

message ParseyRequest {
  repeated string text = 1;
};

message ParseyResponse {
  repeated syntaxnet.Sentence result = 1;
};

service ParseyService {
  rpc Parse(ParseyRequest) returns (ParseyResponse);
}
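To make the request shape in the definition above concrete, here is a rough sketch of how a `ParseyRequest` is laid out on the wire under the standard protocol buffers encoding: each entry of `repeated string text = 1` becomes a tag byte (field number 1, wire type 2), a varint length, and the UTF-8 bytes. The function names `encode_varint` and `encode_parsey_request` are hypothetical and exist only for illustration. A real client would normally use stubs generated from the .proto file, as the clients linked below presumably do, rather than hand-rolled encoding.

```python
from typing import Iterable


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_parsey_request(texts: Iterable[str]) -> bytes:
    """Serialize the `repeated string text = 1` field of ParseyRequest."""
    # Tag = (field number << 3) | wire type; wire type 2 is length-delimited.
    tag = encode_varint((1 << 3) | 2)
    out = bytearray()
    for text in texts:
        payload = text.encode("utf-8")
        out += tag + encode_varint(len(payload)) + payload
    return bytes(out)
```

For example, `encode_parsey_request(["hi"])` yields the four bytes `b'\n\x02hi'`: the tag `0x0A`, the length `2`, then the text itself.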
https://github.com/johndpope/DockerParseyAPI/tree/master/clients/node_client
https://github.com/johndpope/DockerParseyAPI/tree/master/clients/ios_client
Dockerfile for Myungchul Shin's patches to SyntaxNet: https://github.com/dsindex/syntaxnet/blob/master/README_api.md
Original API work by David Mansfield: https://github.com/dmansfield/parsey-mcparseface-api
https://github.com/tensorflow/tensorflow/
https://github.com/tensorflow/models/tree/master/syntaxnet
https://developers.google.com/protocol-buffers/