wolfe-pack / wolfe

Wolfe Language and Engine
https://wolfe-pack.github.io/wolfe
Apache License 2.0

Working practical and competitive linear chain application #26

Closed riedelcastro closed 10 years ago

rockt commented 10 years ago

There was indeed a bug in reading the Genia training corpus: I accidentally used the sample corpus as the training corpus. The correct training corpus contains over 20k sentences. Wolfe seems to be quite memory-hungry at the moment; I needed to give it 12 GB of RAM. Results after 20 epochs of averaged perceptron learning (~1h for training and testing) look quite good, although there is still much room for improvement. Test performance is on par with the winner of the 2004 task (72.5 F₁): http://acl.ldc.upenn.edu/coling2004/W1/pdf/19.pdf

Train:
Total Gold:  109588
Total Guess: 120081
Precision:   0.852466
Recall:      0.934089
F1:          0.891413

Test:
Total Gold:  19392
Total Guess: 22570
Precision:   0.673859
Recall:      0.784292
F1:          0.724894
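
As a sanity check on how the figures above relate, here is a minimal sketch that recomputes F1 as the harmonic mean of the reported precision and recall. The function name `f1_score` is made up for illustration and is not part of Wolfe; the inputs are the six-decimal values printed above, so the result can differ from the reported F1 in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall copied from the report above (already rounded).
train_f1 = f1_score(0.852466, 0.934089)
test_f1 = f1_score(0.673859, 0.784292)

print(f"Train F1: {train_f1:.6f}")
print(f"Test F1:  {test_f1:.6f}")
```

In both splits the guess count exceeds the gold count, which is why recall is higher than precision.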

rockt commented 10 years ago

What could improve the performance:

riedelcastro commented 10 years ago

Excellent! I think there is quite a bit of room to reduce memory usage, particularly in the MPGraph generation step. But this is a great start...