Closed GoogleCodeExporter closed 9 years ago
I'd be surprised if the GAE allowed you to run native binaries. Are you sure
this is allowed?
Original comment by richard.eckart
on 25 Nov 2014 at 11:32
When I try the example from the command line (inside GCE)
$ echo 'Hello world!' | cmd/tree-tagger-english-utf8
(see: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/)
it works.
Original comment by steffen...@web.de
on 25 Nov 2014 at 11:48
Can you reproduce the original problem, or did you just copy the Stack Overflow
report here?
If you can reproduce it, what version of TT4J are you using?
Original comment by richard.eckart
on 25 Nov 2014 at 11:50
I'm the author of the Stack Overflow report =)
1.2.0
Original comment by steffen...@web.de
on 25 Nov 2014 at 11:56
I see :) Great!
The problematic line in 1.2.0 is this one:
boolean isUnicode = "UTF-8".equals(_model.getEncoding().toUpperCase(Locale.US));
I can see potential for an NPE here, but I wonder why it works locally but not
on GCE.
Do you provide an encoding for the model?
Does GCE have a problem with Locale.US?
Are you using tt4j directly or within another framework, e.g. DKPro Core? If
you are using it directly, you might want to give version 1.2.1 a try, which
offers a way of setting a model without using a model resolver.
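A minimal sketch of how the quoted line can throw, assuming the encoding returned by the model is null (the class and method names here are hypothetical and not part of tt4j; in the real code the NPE could equally come from `_model` itself being null):

```java
import java.util.Locale;

// Hypothetical helper class for illustration only; not part of tt4j.
class EncodingCheck {
    // Same shape as the 1.2.0 line quoted above:
    // calling toUpperCase on a null encoding throws NullPointerException.
    static boolean isUnicodeUnsafe(String encoding) {
        return "UTF-8".equals(encoding.toUpperCase(Locale.US));
    }

    // Null-safe variant: String.equalsIgnoreCase accepts a null argument
    // and simply returns false.
    static boolean isUnicodeSafe(String encoding) {
        return "UTF-8".equalsIgnoreCase(encoding);
    }

    public static void main(String[] args) {
        System.out.println(isUnicodeSafe("utf-8"));
        System.out.println(isUnicodeSafe(null));
        try {
            isUnicodeUnsafe(null);
        } catch (NullPointerException e) {
            System.out.println("NPE for null encoding");
        }
    }
}
```

This only shows the null-encoding path; it does not explain why the encoding would be missing on one machine and not on another.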
Original comment by richard.eckart
on 25 Nov 2014 at 12:03
My local machine runs Win8, the GCE instance runs Debian (so I use different
treetagger packages). My setup is nearly the same as the one in your example:
System.setProperty("treetagger.home", "/home/spark/resources/treetagger");
try {
    //tt.setModel("c:/treetagger/lib/german-utf8.par"); //local
    tt.setModel("/home/spark/resources/treetagger/lib/german-utf8.par"); //gce
    tt.setPerformanceMode(true);
    tt.setHandler(new TokenHandler<String>() {
        public void token(String token, String pos, String lemma) {
            output.put(token, lemma.toLowerCase().replace("_", " "));
        }
    });
}
So I use it directly.
I just tried v1.2.1 (with no changes to my source code) and it produces the
same error. Should I change my setup? How?
Original comment by steffen...@web.de
on 25 Nov 2014 at 12:28
When loading a model, you should specify an encoding. This can be done in two
ways:
1)
    treetagger.setModel(modelFile.getPath() + ":" + encoding);
2) (works only with 1.2.1+)
    DefaultModel model = new DefaultModel(
        modelFile.getPath() + ":" + encoding,
        modelFile, encoding, DefaultModel.DEFAULT_FLUSH_SEQUENCE);
    treetagger.setModel(model);
Original comment by richard.eckart
on 25 Nov 2014 at 1:07
I just found out something I should have tested much earlier:
When I run my app on my local machine, I set an option to run it only on that
one machine. When I run it in GCE, I set an option for a "parallel run", which
means the task is submitted to multiple worker instances so that it can be
processed in parallel.
Now I set the option for "local run" in GCE, and it succeeded!
Original comment by steffen...@web.de
on 25 Nov 2014 at 2:47
OK, sounds like this issue can be closed then :)
Original comment by richard.eckart
on 25 Nov 2014 at 3:54
@Steffen: one more question: what does it mean to set the option for "local
run", and how do you think that could indirectly trigger the NPE?
Original comment by richard.eckart
on 25 Nov 2014 at 5:58
No, it was my fault all along: some time ago I deleted and re-installed my
worker instances but didn't install treetagger on the new worker instances! I
forgot about that =/
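For anyone hitting the same thing, a preflight check along these lines could be run on each worker before tagging. This is only a sketch: the class name is hypothetical, it is not part of tt4j, and it assumes the usual TreeTagger layout with `bin/` and `lib/` directories under the install directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of tt4j.
class TreeTaggerPreflight {
    // Returns a list of problems found; an empty list means the
    // assumed layout (home/bin and home/lib/<model>) is present.
    static List<String> check(String home, String modelFileName) {
        List<String> problems = new ArrayList<>();
        Path homeDir = Paths.get(home);
        if (!Files.isDirectory(homeDir)) {
            problems.add("TreeTagger home is not a directory: " + homeDir);
            return problems;
        }
        // Assumption: binaries live in <home>/bin.
        Path binDir = homeDir.resolve("bin");
        if (!Files.isDirectory(binDir)) {
            problems.add("missing bin directory: " + binDir);
        }
        // Assumption: model files live in <home>/lib.
        Path model = homeDir.resolve("lib").resolve(modelFileName);
        if (!Files.isReadable(model)) {
            problems.add("model file not readable: " + model);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Paths taken from the setup posted earlier in this thread.
        List<String> problems =
            check("/home/spark/resources/treetagger", "german-utf8.par");
        if (problems.isEmpty()) {
            System.out.println("TreeTagger installation looks complete");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Failing early with a message like this on a worker is easier to diagnose than an NPE that only shows up in the parallel run.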
Original comment by steffen...@web.de
on 26 Nov 2014 at 8:32
Original issue reported on code.google.com by
steffen...@web.de
on 25 Nov 2014 at 11:24