dasher-project / dasher-web

Dasher text entry in HTML, CSS, JavaScript, and SVG
https://dasher-project.github.io/dasher-web/browser/
MIT License

Working prediction model #16

Closed agutkin closed 4 years ago

agutkin commented 4 years ago

This PR has several notable changes:

  1. Simplified the bookkeeping logic (for now) so that the context is always filled afresh. This ensures that we retrieve predictions for the exact text passed to the callback.
  2. Added debugging functions to check that the interface to PPM is sane (these are enabled by setting the verbosity flag).
  3. Replaced the tiny set of Enron sentences with two larger corpora from Project Gutenberg: Alice's Adventures in Wonderland and The Adventures of Sherlock Holmes. For now these are stored as const strings under third_party/gutenberg; the LICENSE file points here.
  4. The predictor now exposes two APIs: the predictor itself (ppmModelPredict) and a function that retrains the model from scratch using the static training data and the text supplied by the caller (ppmModelReset).
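
To make the shape of items 1 and 4 concrete, here is a minimal sketch, not the code in this PR. It swaps PPM for a toy order-1 character model with add-one smoothing. The names `toyModelReset`, `toyModelPredict`, `STATIC_TRAINING_TEXT`, and `ALPHABET`, as well as the signatures and the return shape, are assumptions made for illustration. The point is only that the reset function retrains from scratch on static plus caller-supplied text, and that the predict function rebuilds its context from exactly the text it is given rather than from state left by earlier calls.

```javascript
// Toy stand-in for the PPM model: an order-1 character model with add-one
// smoothing. Everything here is hypothetical and only illustrates the flow.
const STATIC_TRAINING_TEXT = "the cat sat on the mat. the hat sat on the cat.";
const ALPHABET = "abcdefghijklmnopqrstuvwxyz .".split("");

// previous symbol -> Map(next symbol -> count)
let counts = new Map();

// Add the symbol-pair counts of one text to the model.
function train(text) {
  for (let i = 1; i < text.length; i++) {
    const prev = text[i - 1];
    const next = text[i];
    if (!counts.has(prev)) counts.set(prev, new Map());
    const row = counts.get(prev);
    row.set(next, (row.get(next) || 0) + 1);
  }
}

// Hypothetical analogue of ppmModelReset: discard the old model and retrain
// from scratch on the static data and the caller's text.
function toyModelReset(userText = "") {
  counts = new Map();
  train(STATIC_TRAINING_TEXT);
  train(userText);
}

// Hypothetical analogue of ppmModelPredict: the context is filled afresh
// from the text passed in, so no state carries over between calls.
function toyModelPredict(text) {
  const context = text.length > 0 ? text[text.length - 1] : null;
  const row = (context !== null && counts.get(context)) || new Map();
  // Add-one smoothing over the fixed alphabet so probabilities sum to 1.
  const smoothed = ALPHABET.map((s) => (row.get(s) || 0) + 1);
  const total = smoothed.reduce((a, b) => a + b, 0);
  return ALPHABET.map((s, i) => ({
    symbol: s,
    probability: smoothed[i] / total,
  }));
}

// Usage: retrain with some caller text, then ask what follows "th".
toyModelReset("hello there");
const predictions = toyModelPredict("th");
const best = predictions.reduce((a, b) =>
  b.probability > a.probability ? b : a
);
console.log(best.symbol, best.probability.toFixed(3));
```

Rebuilding the context on every call does redundant work, but it rules out bookkeeping bugs while the predictor's output is being validated. An incremental context could be reintroduced during the later speed optimization.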

Once we are in agreement that this predictor functions as expected, I'll optimize for speed and possibly memory.