buruzaemon / natto

A Tasty Ruby Binding with MeCab
BSD 2-Clause "Simplified" License

Use new Model- and Lattice-based C APIs internally #40

Closed buruzaemon closed 9 years ago

buruzaemon commented 9 years ago

Where possible, use the new Model interface and Lattice API calls.

buruzaemon commented 9 years ago

Judging from example/example_lattice.c in the MeCab source distribution, Model, Tagger and Lattice all appear to be intended as long-lived objects.

Wrap these objects in Natto::MeCab and set them as instance variables.

Make sure to destroy them together, in the correct order: Tagger first, then Lattice, and finally Model.
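A minimal sketch of the lifetime handling described above, not the actual Natto::MeCab code. The C function names are the ones declared in mecab.h; the class name MeCabSketch, the FakeBinding module and the close method are hypothetical stand-ins so that the destroy order can be observed without libmecab installed.

```ruby
# Hypothetical stand-in for the FFI layer. It returns symbols instead of
# pointers and records which destroy function was called, in order.
module FakeBinding
  CALLS = []

  def self.mecab_model_new2(_args);          :model;   end
  def self.mecab_model_new_tagger(_model);   :tagger;  end
  def self.mecab_model_new_lattice(_model);  :lattice; end

  def self.mecab_destroy(_tagger);           CALLS << :tagger;  end
  def self.mecab_lattice_destroy(_lattice);  CALLS << :lattice; end
  def self.mecab_model_destroy(_model);      CALLS << :model;   end
end

# Hypothetical wrapper: Model, Tagger and Lattice are created once and kept
# as instance variables for the life of the wrapper.
class MeCabSketch
  def initialize(api, args = '')
    @api     = api
    @model   = api.mecab_model_new2(args)
    @tagger  = api.mecab_model_new_tagger(@model)
    @lattice = api.mecab_model_new_lattice(@model)
    @closed  = false
  end

  # Destroys all three handles together: Tagger, then Lattice, then Model.
  # Calling close a second time does nothing.
  def close
    return if @closed

    @api.mecab_destroy(@tagger)
    @api.mecab_lattice_destroy(@lattice)
    @api.mecab_model_destroy(@model)
    @closed = true
  end
end
```

In a real binding the same close logic would more likely be attached to a finalizer or a managed pointer; the explicit method here is only meant to make the ordering visible.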

buruzaemon commented 9 years ago

Add new support for these Model-based functions in mecab.h:

Remove support for these old C-API calls, and use mecab_lattice_set_request_type instead:

Remove support for mecab_set_theta, and use mecab_lattice_set_theta instead.
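As a rough illustration of the request-type change mentioned in this comment, the individual setters can be replaced by a single bitmask passed to mecab_lattice_set_request_type. The constant values below mirror the request-type enum in mecab.h and should be checked against the installed header; the RequestType module and the option keys (:nbest, :partial, :marginal, :all_morphs) are hypothetical names chosen for this sketch.

```ruby
# Request-type flags as declared in mecab.h (verify against your header).
module RequestType
  MECAB_ONE_BEST      = 1
  MECAB_NBEST         = 2
  MECAB_PARTIAL       = 4
  MECAB_MARGINAL_PROB = 8
  MECAB_ALL_MORPHS    = 32

  # Builds the integer that would be handed to mecab_lattice_set_request_type.
  # The option keys are assumptions made for this sketch only.
  def self.from_options(opts = {})
    flags = opts.fetch(:nbest, 1) > 1 ? MECAB_NBEST : MECAB_ONE_BEST
    flags |= MECAB_PARTIAL       if opts[:partial]
    flags |= MECAB_MARGINAL_PROB if opts[:marginal]
    flags |= MECAB_ALL_MORPHS    if opts[:all_morphs]
    flags
  end
end
```

Keeping the mapping in one place means the wrapper needs only one call into the Lattice API per parse, rather than one call per option.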

Remove support for these old C-API calls for tostr, and use mecab_lattice_set_request_type with MECAB_NBEST or MECAB_ONE_BEST; use mecab_lattice_tostr; and use mecab_lattice_next for N-best parsing to string:

Remove support for these old C-API calls for tonode, and use the existing logic as per boundary constraint parsing via Lattice:

buruzaemon commented 9 years ago

For all calls to Lattice, use mecab_lattice_strerror to obtain error messages.
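One way this could look in the Ruby layer is a small helper that turns a failed Lattice call into an exception carrying the mecab_lattice_strerror message. This is a sketch only: LatticeError and check_lattice! are hypothetical names, and api is assumed to be any object that exposes mecab_lattice_strerror(lattice), such as the FFI binding module.

```ruby
# Hypothetical error class for failures reported through the Lattice API.
class LatticeError < StandardError; end

# Returns the lattice when ok is truthy; otherwise raises LatticeError with
# the message obtained from mecab_lattice_strerror.
def check_lattice!(api, lattice, ok)
  return lattice if ok

  raise LatticeError, api.mecab_lattice_strerror(lattice)
end
```

A caller would pass in the result of the Lattice-related call it just made, so that every failure path reports the message held by the lattice itself.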

buruzaemon commented 9 years ago

OK, so if we also want to support feature constraint parsing, I think we will have to refactor the internals to use the Lattice-based APIs right now.

This will be done as part of the 1.0.0 release.

buruzaemon commented 9 years ago

Done.