The "pattern" mode used too much GPU memory, and I also didn't like that it required binarization. This change instead creates one factor per bigram rather than a single factor for the whole rule.
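A rough back-of-the-envelope sketch of why this should help with memory, not the code in this PR. It assumes each position in a rule is a variable with the same domain size and that factors are stored as dense tables; the function names and the numbers are made up for illustration.

```python
from typing import List, Sequence, Tuple


def bigrams(symbols: Sequence[str]) -> List[Tuple[str, str]]:
    """Return adjacent pairs of a rule's symbols (hypothetical helper)."""
    return list(zip(symbols, symbols[1:]))


def whole_rule_entries(domain_size: int, rule_len: int) -> int:
    """Entries in one dense factor covering every position of the rule."""
    return domain_size ** rule_len


def bigram_entries(domain_size: int, rule_len: int) -> int:
    """Total entries when each adjacent pair gets its own dense factor."""
    return max(rule_len - 1, 0) * domain_size ** 2


if __name__ == "__main__":
    rule = ["A", "B", "C", "D"]  # illustrative rule of length 4
    print(bigrams(rule))
    print(whole_rule_entries(100, len(rule)))
    print(bigram_entries(100, len(rule)))
```

Under these assumptions the whole-rule factor grows exponentially with rule length, while the bigram factors grow linearly, which is also why binarization is no longer needed to keep factor sizes bounded.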
[x] Haven't tested yet whether it fits into GPU memory