Closed marijnschraagen closed 4 years ago
I can indeed replicate this. It seems related to LanguageMachines/ucto#72.
The problem here is that frog uses the 'language' nld-vnn, which refers to the configuration in /usr/local/share/frog/nld-vnn/
Ucto is then initialized from /usr/local/share/frog/nld-vnn/frog.cfg using:
[[tokenizer]]
rulesFile=tokconfig-nld-historical
So for ucto the language is nld-historical, while frog itself was told nld-vnn.
This is confusing for us as well as for the software.
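To check which tokenizer rules a given frog configuration actually hands to ucto, something like the following might help. It is only a rough sketch: the function name tokenizer_rules is made up for illustration, and it assumes the config has section headers and a rulesFile= key at the start of a line, as in the fragment quoted above.

```shell
# Hypothetical helper (not part of frog or ucto): print the rulesFile value
# from the [[tokenizer]] section of a frog.cfg-style file.
# Assumes section headers and keys start in column 1, as in the quoted fragment.
tokenizer_rules() {
  awk '
    # Any [[section]] header: remember whether it is the tokenizer section.
    /^\[\[/ { in_tok = ($0 ~ /^\[\[tokenizer\]\]/); next }
    # Inside the tokenizer section: strip the key and print the value.
    in_tok && /^rulesFile=/ { sub(/^rulesFile=/, ""); print; exit }
  ' "$1"
}
```

Going by the fragment quoted above, running tokenizer_rules /usr/local/share/frog/nld-vnn/frog.cfg should print tokconfig-nld-historical, which makes the mismatch with --language=nld-vnn visible.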
When I run Frog like this:
frog -c /usr/local/share/frog/nld-vnn/frog.cfg -X uit.xml -t txt
all seems well.
So that might be a quick workaround.
As a matter of fact, I am inclined to think that this is an abuse of the --language parameter.
It is meant to give frog a hint about the languages to detect, and NOT to tell which configuration to use.
When using --language, frog should ignore the rulesFile information from the frog config file. This was so until @proycon "fixed" it in #80. That was probably putting the cart before the horse.
We need to rethink this.
Fixed according to #80.
Maybe related to https://github.com/LanguageMachines/frog/issues/80?
When using XML output with non-standard rules, there is a token-annotation error. Command:
frog -t myfile.txt -X myresult.xml --language=nld-vnn
Output:
The regular column-based output works without any problems.