bamler-lab / constriction

Entropy coders for research and production in Python and Rust.
https://bamler-lab.github.io/constriction/
Apache License 2.0
77 stars 5 forks

Creating a categorical distribution sometimes fails to converge #20

Closed by robamler 1 year ago

robamler commented 1 year ago

The following Python code

import constriction
import numpy as np
model = constriction.stream.model.Categorical(np.array([0.15, 0.69, 0.15]))

gets stuck in an infinite loop. This is probably caused by a bug in the function optimize_leaky_categorical.
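For context, here is a minimal sketch of the kind of task such a function has to solve. It assumes (judging from the name only, not from the source code) that the goal is to approximate floating-point probabilities by fixed-point weights that sum to a power of two while keeping every symbol's weight nonzero ("leaky"). The helper `quantize_leaky` and its `precision` parameter are hypothetical, and this is deliberately not constriction's algorithm: it runs in a single pass, so it cannot fail to converge, but it makes no attempt to optimize the approximation.

```python
import numpy as np


def quantize_leaky(probs, precision=24):
    """Hypothetical helper (NOT constriction's algorithm).

    Returns positive integer weights that sum to exactly 2**precision and
    are roughly proportional to `probs`.
    """
    probs = np.asarray(probs, dtype=np.float64)
    total = 1 << precision
    n = len(probs)
    if n == 0 or n > total:
        raise ValueError("need between 1 and 2**precision symbols")
    if np.any(probs < 0) or not np.isfinite(probs).all() or probs.sum() <= 0:
        raise ValueError("probabilities must be finite, nonnegative, and not all zero")

    # Reserve one unit per symbol so that no weight can end up at zero,
    # then distribute the remaining units proportionally.
    free = total - n
    scaled = probs / probs.sum() * free
    weights = np.floor(scaled).astype(np.int64)

    # Flooring leaves a few units unassigned; hand them to the symbols
    # with the largest fractional parts.
    remainder = free - int(weights.sum())
    if not 0 <= remainder <= n:
        raise ArithmeticError("rounding error too large for this precision")
    order = np.argsort(scaled - weights)[::-1]
    weights[order[:remainder]] += 1

    return weights + 1


weights = quantize_leaky([0.15, 0.69, 0.15])
print(weights, weights.sum() == 1 << 24)
```

An iterative refinement of such weights is where a convergence loop could come in; the sketch above only shows that a valid (if suboptimal) fixed-point approximation always exists for inputs like the one in this report.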

Fixing this might turn out to be a breaking change, so it might require a major version bump.

Additional example:

import constriction
import numpy as np
p = np.array([
    1.34673042e-04, 6.52306480e-04, 3.14999325e-03, 1.49921896e-02,
    6.67127371e-02, 2.26679876e-01, 3.75356406e-01, 2.26679876e-01,
    6.67127594e-02, 1.49922138e-02, 3.14990873e-03, 6.52299321e-04,
    1.34715927e-04,
])
# Normalize so that the probabilities sum to one before constructing the model.
constriction.stream.model.Categorical(p / p.sum())
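Since both examples hang on affected versions, it can be convenient to run them in a child process with a timeout rather than in an interactive session. The following is a rough sketch using only the standard library; the helper name `run_with_timeout` and the 10 second default are arbitrary choices for illustration.

```python
import subprocess
import sys


def run_with_timeout(code, timeout_s=10.0):
    """Run `code` in a fresh Python interpreter.

    Returns the child's exit code, or None if it had to be killed because
    it did not finish within `timeout_s` seconds.
    """
    try:
        result = subprocess.run([sys.executable, "-c", code], timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return None
    return result.returncode


# The first reproduction from this issue, as a string to pass to the helper.
REPRO = """
import constriction
import numpy as np
constriction.stream.model.Categorical(np.array([0.15, 0.69, 0.15]))
"""

# Example call (left commented out because it needs constriction installed):
# status = run_with_timeout(REPRO)
# print("hung" if status is None else f"exited with code {status}")
```

A return value of `None` indicates the hang described above, `0` indicates that the constructor returned, and any other exit code points to a different failure (for example, constriction not being installed).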

Many thanks to Grégoire Jauvion for reporting this issue by email.

robamler commented 1 year ago

It looks like a non-breaking fix is possible: the fix proposed in #21 changes observable behavior only in cases where the previous version didn't converge at all.