redpony / cdec

Decoder, aligner, and model optimizer for statistical machine translation and other structured prediction models based on (mostly) context-free formalisms
http://cdec-decoder.org/
Apache License 2.0

arithmetic overflow in word-aligner/da.h #69

Open froiss opened 9 years ago

froiss commented 9 years ago

In the functions ComputeZ and ComputeDLogZ, at lines 33 and 50:

const unsigned num_top = n - floor;

floor may be greater than n. Since num_top is unsigned, the subtraction then wraps around and num_top ends up with a value close to 2^32.
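To make the wraparound concrete, here is a minimal sketch that isolates the subtraction. The helper names NumTopUnchecked and NumTopClamped are hypothetical and do not exist in da.h, and std::uint32_t stands in for unsigned so the result is the same on every platform. The clamped variant is only one possible guard, not necessarily the right fix for the aligner.

```cpp
#include <cassert>
#include <cstdint>

// Same expression as in the issue (n - floor), using a fixed-width
// unsigned type. When floor > n the result wraps modulo 2^32.
inline std::uint32_t NumTopUnchecked(std::uint32_t n, std::uint32_t floor) {
  return static_cast<std::uint32_t>(n - floor);
}

// Hypothetical guard (an assumption, not code from da.h):
// return 0 instead of wrapping when floor >= n.
inline std::uint32_t NumTopClamped(std::uint32_t n, std::uint32_t floor) {
  return floor < n ? static_cast<std::uint32_t>(n - floor) : 0u;
}
```

With n = 5 and floor = 7, NumTopUnchecked returns 4294967294 (2^32 - 2), while NumTopClamped returns 0.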

It happens in real life when the following conditions are met:

In that situation, main in fast_align.cc sometimes invokes ComputeDLogZ with i > n. ComputeDLogZ then calls ComputeZ with i > m (which triggers an assertion failure unless the asserts are commented out).

Note: This is obviously related to these two commits from the clab/fast_align repo: https://github.com/clab/fast_align/commit/5fe669ed08617d54f57577e75944f2e25c68d466 and https://github.com/clab/fast_align/commit/adfadde4c129026790224b04a67ba5b8c0c89840, although I am not sure why the second reverted the first.