chokkan / crfsuite

CRFsuite: a fast implementation of Conditional Random Fields (CRFs)
http://www.chokkan.org/software/crfsuite/
Other
648 stars 208 forks source link

Improve example/template.py for speed, functionality, and readability #100

Closed tianjianjiang closed 5 years ago

tianjianjiang commented 6 years ago

This revision trades memory footprint and multicore for speed. For example, when processing 50k sentences of some Japanese corpus, if the original version spent 3 minutes and 37 seconds, the revised version only needed 20 seconds with 4 processes.

cynthia commented 5 years ago

I think this patch has good parts and unnecessary parts.

Redundant calls, Py3 compatibility and cleaner/idiomatic code is something I think is needed. That said, the parts I don't agree is the performance patches. I strongly believe that example code should have a strong focus on it's pedagogical properties; nothing more, nothing less. Introducing extra complexity for the sake of performance in example code increases the learning curve, and seems like it would bring in more harm than gain.

cynthia commented 5 years ago

See also: https://github.com/chokkan/crfsuite/issues/109

tianjianjiang commented 5 years ago

@cynthia I don't have strong opinions about this PR. It is indeed nice to keep one and only one strong focus for each example.