Closed TomAugspurger closed 6 years ago
This fixes an issue in the old factorization method, which didn't properly account for missing values. Basically
[B, B, NA, NA, A, B]
Should factorize as [0, 0, -1, -1, 1, 0]. Previously, we didn't handle NA so it was [0, 0, 1, 1, 2, 0].
[0, 0, -1, -1, 1, 0]
[0, 0, 1, 1, 2, 0]
Numba gave a 285x speedup (after JIT warmup) on a benchmark with 10,000 values.
This fixes an issue in the old factorization method, which didn't properly account for missing values. Basically
Should factorize as
[0, 0, -1, -1, 1, 0]
. Previously, we didn't handle NA so it was[0, 0, 1, 1, 2, 0]
.Numba gave a 285x speedup (after JIT warmup) on a benchmark with 10,000 values.