GoogleCodeExporter closed 9 years ago
Sorry, I don't know how I missed this bug report for so long! Not sure what
happened.
Are you actually talking about running on the full Google n-grams corpus? If
so, then you need a substantial amount of memory, much more than the 10GB needed
to store the n-grams once the binary is built. I haven't actually figured out
what the minimum is, but I would think you need at least 50GB of
memory, which is available on large EC2 instances.
However, I have already pre-built binaries of these, so you
can just download them (instructions are on the web page).
Original comment by adpa...@gmail.com
on 19 Feb 2012 at 5:48
Original comment by adpa...@gmail.com
on 9 Aug 2012 at 5:30
How long does it take to build the LM on the full n-grams corpus?
Original comment by tur...@gmail.com
on 19 Aug 2012 at 9:33
I think it takes something on the order of 24 hours, maybe a little less. It's
not something I've optimized heavily, so sorry about that. Let me know if you
have any trouble building it yourself (other than time and memory issues...).
Original comment by adpa...@gmail.com
on 20 Aug 2012 at 6:51
Original issue reported on code.google.com by
tur...@gmail.com
on 24 Nov 2011 at 2:59