limit memory usage by dynamically recreating only the map needed for each operation, or by binary-searching the file directly on disk (likely very slow, but using almost no memory)
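a minimal sketch of the binary-search-on-disk idea, assuming a hypothetical fixed-width layout (8-byte key, 4-byte value, records sorted by key -- an assumption, not the actual NIF layout):

```python
import io
import struct

# assumed layout (not the real NIF format): sorted fixed-width records,
# each an 8-byte big-endian key followed by a 4-byte value
RECORD = 12

def lookup(f, key, n_records):
    """Binary-search a sorted fixed-width record file without loading it."""
    lo, hi = 0, n_records - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        f.seek(mid * RECORD)            # jump straight to the middle record
        k, v = struct.unpack(">QI", f.read(RECORD))
        if k == key:
            return v
        if k < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

# tiny in-memory stand-in for the on-disk file
index = io.BytesIO(b"".join(struct.pack(">QI", k, k * 10) for k in (2, 5, 9, 13)))
```

each probe is one seek plus one small read, so memory stays O(1) at the cost of O(log n) seeks per lookup.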
compress the FEN before processing and decompress it afterwards?
this would enlarge the alphabet from FEN's small character set to arbitrary bytes, increasing the number of possible 3-grams and making the file larger
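a quick illustration of why, using zlib as a stand-in compressor: the compressed stream uses the full byte range, so its 3-grams fall outside the small FEN alphabet:

```python
import zlib

# an arbitrary sample FEN, just for illustration
fen = "r1bqkbnr/pppp1ppp/2n5/4p3/2B1P3/5N2/PPPP1PPP/RNBQK1NR b KQkq - 0 1"

def trigrams(b):
    """All distinct 3-byte grams of a byte string."""
    return {b[i:i + 3] for i in range(len(b) - 2)}

raw = trigrams(fen.encode())               # grams over ~32 FEN symbols
packed = trigrams(zlib.compress(fen.encode()))  # grams over arbitrary bytes
```

plain FEN draws from a few dozen symbols, so the possible 3-gram space is small; compressed output can hit all 256^3 combinations, and the grams no longer match across puzzles.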
can we index the individual positions, assign each one a numeric id, and associate arrays of those ids with a puzzle id?
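the idea in miniature (puzzle ids and position strings below are placeholders): deduplicate positions into a numbered table, then store each puzzle as an array of small integer ids:

```python
pos_ids = {}   # position text -> numeric id
puzzles = {}   # puzzle id -> list of position ids

def add_puzzle(puzzle_id, positions):
    """Register a puzzle's positions, reusing ids for positions seen before."""
    ids = []
    for p in positions:
        if p not in pos_ids:
            pos_ids[p] = len(pos_ids)  # next free id
        ids.append(pos_ids[p])
    puzzles[puzzle_id] = ids

add_puzzle("puzzle_1", ["pos_a", "pos_b"])
add_puzzle("puzzle_2", ["pos_b", "pos_c"])  # pos_b reuses its existing id
```

shared positions are stored once; each puzzle then costs only a few integers instead of full FEN strings.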
test using variable N for the N-grams: 3-grams for most entries, 4-grams and longer to split up the 3-grams that are too common
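one possible reading of this, sketched (the escalation rule and `threshold` are assumptions): count 3-gram document frequencies, then index overly common 3-grams as the 4-grams that extend them:

```python
from collections import Counter

def ngrams(text, n):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def variable_ngrams(corpus, threshold):
    """Index docs by 3-grams, escalating too-common 3-grams to 4-grams."""
    freq = Counter()
    for doc in corpus:
        freq.update(set(ngrams(doc, 3)))   # document frequency per 3-gram
    common = {g for g, c in freq.items() if c > threshold}

    index = {}
    for i, doc in enumerate(corpus):
        for j in range(len(doc) - 2):
            g3 = doc[j:j + 3]
            # common 3-grams are replaced by the longer gram that extends them
            key = doc[j:j + 4] if g3 in common and j + 4 <= len(doc) else g3
            index.setdefault(key, set()).add(i)
    return index

idx = variable_ngrams(["abcd", "abce", "xyz"], threshold=1)
```

here "abc" appears in two documents, so it is indexed as "abcd"/"abce" instead, keeping each posting list short.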
this implies upgrading the NIF format
which should also be formalized and written up in a blog post
also add the possibility of storing multiple types of data for the same element, e.g. checksums alongside n-grams, or n-grams computed over different text representations of the same thing.
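one way a format upgrade could carry several typed payloads per element (a hypothetical tagged-section layout, not the current NIF spec): each section is a type tag plus a length-prefixed blob:

```python
import struct

TYPE_NGRAMS = 1    # example tag: n-gram postings
TYPE_CHECKSUM = 2  # example tag: checksum of the element

def pack_sections(sections):
    """sections: list of (type_tag, payload_bytes) -> one serialized blob."""
    out = b""
    for tag, payload in sections:
        out += struct.pack(">BI", tag, len(payload)) + payload
    return out

def unpack_sections(blob):
    """Inverse of pack_sections: walk tag/length headers, slice payloads."""
    sections, i = [], 0
    while i < len(blob):
        tag, length = struct.unpack_from(">BI", blob, i)
        i += 5  # 1-byte tag + 4-byte length
        sections.append((tag, blob[i:i + length]))
        i += length
    return sections
```

unknown tags can simply be skipped via their length field, which is what makes the format extensible without breaking old readers.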