christian-westbrook / interactive-sentence-predictor

An interactive sentence predictor that generates likely continuations of partial sentences.

Lrgfile builder #9

Closed rfishe02 closed 6 years ago

rfishe02 commented 6 years ago

I'm concerned that we won't be able to process very large amounts of data unless we take a two-phase approach, so I split the Builder class into NGramBuilder and MapBuilder. The NGramBuilder writes n-grams to disk. I then used Linux commands to consolidate the output. The MapBuilder then creates maps and writes them to disk.
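For readers who want to see the shape of the two-phase idea, here is a minimal Python sketch. It is not the code in this pull request: the function names (`write_ngrams`, `consolidate`, `build_map`, `run_demo`), the one-n-gram-per-line file layout, and the prefix-to-next-word map are all assumptions made for illustration. The `consolidate` step is an in-memory stand-in for the Linux commands mentioned above, which the comment does not name.

```python
import itertools
import os
import tempfile


def write_ngrams(lines, n, out_path):
    """Phase 1: stream every n-gram to disk, one per line, keeping none in memory."""
    with open(out_path, "w", encoding="utf-8") as out:
        for line in lines:
            tokens = line.lower().split()
            for i in range(len(tokens) - n + 1):
                out.write(" ".join(tokens[i:i + n]) + "\n")


def consolidate(in_path, out_path):
    """Sort the n-grams so that duplicates become adjacent.

    Assumption: this in-memory sort only illustrates the step. For files that
    do not fit in memory, an external sort would take its place.
    """
    with open(in_path, encoding="utf-8") as src:
        ngrams = sorted(src)
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.writelines(ngrams)


def build_map(sorted_path):
    """Phase 2: read sorted n-grams and map each prefix to {next_word: count}."""
    result = {}
    with open(sorted_path, encoding="utf-8") as src:
        stripped = (line.rstrip("\n") for line in src)
        # groupby yields one group per run of identical adjacent lines,
        # so only one group needs to be counted at a time.
        for ngram, group in itertools.groupby(stripped):
            *prefix, last = ngram.split(" ")
            count = sum(1 for _ in group)
            result.setdefault(" ".join(prefix), {})[last] = count
    return result


def run_demo():
    """Run both phases on a tiny hypothetical corpus and return the map."""
    corpus = ["the cat sat", "the cat ran", "the cat sat"]
    with tempfile.TemporaryDirectory() as tmp:
        raw = os.path.join(tmp, "ngrams.txt")
        merged = os.path.join(tmp, "ngrams.sorted.txt")
        write_ngrams(corpus, 3, raw)
        consolidate(raw, merged)
        return build_map(merged)
```

Because phase 1 only appends to a file and phase 2 only looks at one run of identical lines at a time, neither phase has to hold every n-gram in memory. In this sketch only the stand-in sort and the final map are memory-bound.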