Hello again,
there have already been a few discussions here about performance issues caused by the hard-coded external dependencies, and by reloading the language models and restarting the Java Runtime Environment for every document that is processed.
A simple but powerful solution to all of that would be to let HeidelTime run as a daemon. It would start the Java Runtime Environment once and load the necessary language models into memory once. It would then keep running, waiting for input text to be passed to it (possibly via pipes, named pipes, or stdin?), and write the result to stdout(?).
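To make the idea a bit more concrete, here is a rough sketch of what such a loop could look like. It is not based on HeidelTime's actual code: the class name `TaggerDaemon`, the `serve` method, and the one-document-per-line protocol are all my own assumptions, and the `tagger` function is just a placeholder for wherever the real annotation call would go.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

// Hypothetical sketch, not part of HeidelTime.
public class TaggerDaemon {

    /**
     * Reads one document per input line until EOF and writes one result
     * line per document. Returns the number of documents processed.
     * Assumption: documents contain no raw newlines (they would need escaping).
     */
    public static int serve(BufferedReader in, PrintWriter out,
                            UnaryOperator<String> tagger) throws IOException {
        int processed = 0;
        String line;
        while ((line = in.readLine()) != null) {
            out.write(tagger.apply(line));
            out.write('\n');
            out.flush(); // flush per document so the caller is not left waiting
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder for the expensive one-time setup (JVM start, model loading).
        // A real daemon would build the actual annotator here, exactly once.
        UnaryOperator<String> tagger = text -> "<doc>" + text + "</doc>";

        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(System.out, StandardCharsets.UTF_8));
        serve(in, out, tagger);
    }
}
```

The point is only that the setup happens once, before the loop, and each document then costs a single call. A named pipe or a socket could replace stdin/stdout without changing the loop itself.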
I've seen that something similar exists for TreeTagger itself.
I can't really say how much effort it would take to implement, but I'm sure it would be a good fit for addressing those problems.