Closed wildug closed 11 months ago
Thanks for reporting! Please note that there's a bug in the following line:
dummy_entropy_model = constriction.stream.model.CustomModel(cdf ,inverse_cdf, -10, 10)
The argument -10
states that the CustomModel
should support symbols from the integer range {-10, -9, ..., 10}
. Therefore, constriction
correctly assumes that it should be allowed to evaluate the cdf at negative values.
Changing the argument from -10
to 0
fixes the encoding part, but now decoding fails. This is still technically correct behavior as CustomModel
does not guarantee to only probe the cdf within the range of supported symbols (being allowed to probe outside the range of supported symbols slightly simplifies the code that calculates the exact inverse cdf). But looking at this example, I now agree that this can cause confusion, and I'll look into fixing it. Also, I agree that the error messages should be better.
Assertions generally can't be used to tell an interpreter or compiler not to do something. They are a tool for (i) documenting invariants for human readers of the source code and for (ii) bailing in case a program or library contains a logic error or is used incorrectly.
This is nothing specific about constriction
(or even about python). For example, the code below throws an AssertionError
because it calls divide
with argument b=0
at some point even though the assert
statement inside divide
states otherwise. This is because an assert
statement never enforces the asserted condition. It just checks if the condition is met, and interrupts control flow if it isn't.
def divide(a, b):
assert b != 0
return a / b
print([divide(1, x) for x in range(-10, 11)])
Preventing callers from calling a function unless certain conditions are met cannot be achieved with assertions inside the function body. It would instead require either a more expressive type system that makes invalid function arguments inexpressible, or a more elaborate mechanism such as design by contract.
I tried to implement a custom entropy model that is only defined on positive numbers. To make sure only positive numbers are encoded I used an assert statement to check for non-negativity. To make my point I use a lognormal distribution in the following snippet.
thread '<unnamed>' panicked at 'TODO: PyErr { type: <class 'AssertionError'>, value: AssertionError(), traceback: Some(<traceback object at 0x7f22f0efac80>) }', src/pybindings/stream/model/internals.rs:380:14
PanicException: TODO: PyErr { type: , value: AssertionError(), traceback: Some() }
This problem is of course solved by deleting assertions from cdf.
A more permanent solution might be a change to the documentation or the error message.