I used this package to compute the so-called concatenated minimum-permutation word error rate.
This package is faster than kaldialign, but kaldialign also reports the individual ins/del/sub counts.
I compared the distance values from the two packages for long texts (several thousand words) and they were not the same.
It turned out that this package switches to a different implementation (edit_distance_dp) when the number of words is larger than 640.
The code of edit_distance_dp has a bug: the first value in the vector is not initialized.
This PR contains a fix for the bug and adds tests for edit_distance_dp.
I had to expose edit_distance_dp to write a test.
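
To illustrate why the first value matters, here is a minimal Python sketch of a rolling-row edit distance. It is not the package's C++ code: the function name edit_distance_rows and the two-row layout are assumptions made for illustration, and the real edit_distance_dp may be organized differently.

```python
def edit_distance_rows(a, b):
    """Levenshtein distance between two sequences using two rolling rows.

    Illustrative sketch only; not the package's edit_distance_dp.
    """
    # Row 0: turning an empty prefix of `a` into b[:j] costs j insertions.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [0] * (len(b) + 1)
        # First cell of every row: turning a[:i] into an empty sequence
        # costs i deletions. If this cell is left unset, every later cell
        # in the row is computed from a wrong (in C++, indeterminate) value.
        curr[0] = i
        for j, y in enumerate(b, start=1):
            curr[j] = min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (x != y),  # substitution, or match at cost 0
            )
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    # Works on any sequences of comparable items, e.g. lists of words.
    ref = "the quick brown fox".split()
    hyp = "the quick fox".split()
    print(edit_distance_rows(ref, hyp))
```

A test in this spirit compares such a reference against the package on inputs longer than 640 words, so that the edit_distance_dp path is the one being exercised.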