scrapinghub / python-crfsuite

A python binding for crfsuite
MIT License

Are there any plans to expose cross validation and other training performance outside of stdout? #42

Open samgalen opened 8 years ago

samgalen commented 8 years ago

As far as I can tell, when you use the holdout feature on a Trainer object with verbose set to True, you get a printout at each step showing how training is going and how the model is performing on the holdout group. Aside from creating a Tagger object and essentially repeating the validation that has already happened, the printout appears to be the only way that information is exposed.

It would be really handy to be able to access training information in a less clunky way! Unless, of course, there's some other way that I haven't seen yet.

kmike commented 8 years ago

Hey @samgalen,

It is not documented, but there is a way to access this training log: it is parsed by https://github.com/tpeng/python-crfsuite/blob/master/pycrfsuite/_logparser.py, and the logparser object is available as trainer.logparser. You may find trainer.logparser.iterations or trainer.logparser.last_iteration attributes useful.
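A minimal sketch of how that could look, in case it helps. The helper names `loss_curve` and `train_and_collect` are made up for illustration, and the "num" and "loss" keys are an assumption about what each parsed iteration dict contains, so inspect trainer.logparser.last_iteration on your own run before relying on them.

```python
def loss_curve(iterations):
    """Return (iteration number, loss) pairs from parsed log entries.

    `iterations` is expected to be a list of dicts such as
    trainer.logparser.iterations. The "num" and "loss" keys are an
    assumption; entries that lack either key are skipped.
    """
    return [
        (entry["num"], entry["loss"])
        for entry in iterations
        if "num" in entry and "loss" in entry
    ]


def train_and_collect(X_train, y_train, X_holdout, y_holdout, model_path):
    """Train with a holdout group and return the parsed training log.

    Hypothetical helper: sequences in group 0 are used for training,
    sequences in group 1 are held out for evaluation.
    """
    import pycrfsuite  # imported lazily so loss_curve works without it

    trainer = pycrfsuite.Trainer(verbose=False)
    for xseq, yseq in zip(X_train, y_train):
        trainer.append(xseq, yseq, group=0)
    for xseq, yseq in zip(X_holdout, y_holdout):
        trainer.append(xseq, yseq, group=1)

    # holdout=1 asks crfsuite to evaluate on group 1 instead of training on it
    trainer.train(model_path, holdout=1)

    # The log is parsed as training runs, so nothing has to be read from stdout
    return trainer.logparser.iterations, trainer.logparser.last_iteration
```

With the list returned by `train_and_collect`, something like `loss_curve(iterations)` gives you pairs you can plot or log, and the last-iteration dict is where the holdout scores should show up if the trainer reported them.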

samgalen commented 8 years ago

Oh that's exactly what I was looking for!

Would you be averse to a pull request to add these features to the example notebook?

kmike commented 8 years ago

Yeah, that'd be nice!