Closed tjstum closed 6 months ago
It's easy to fix this, but it will come with a cost. Keyword argument dispatch will become slower for everyone. I am not sure if the cost outweighs the benefits.
This would probably make the code pretty unwieldy, but could keyword argument dispatch first try the address-based check, and then, before giving up on the overload (or, I guess after trying all overloads but before giving up completely), try the slower, equality-based approach?
If that's too much, do you have any other suggested workaround?
CPython itself (`initialize_locals` in `ceval.c`) does the id comparison first, and falls back to contents comparison if there's no match for a given kwarg. That would limit the penalty to functions that take var-keywords and also have some named keywords that aren't provided, which I'm guessing would be more palatable? I can work on a patch.
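For illustration, the two-pass lookup described above can be modeled in Python roughly as follows. This is only a sketch of the idea, not CPython's C code or nanobind's dispatch loop; `find_keyword` and its parameters are hypothetical names.

```python
def find_keyword(names, key):
    """Return the index of `key` in `names`, or -1 if it is absent.

    Hypothetical helper: models an identity-first lookup with an
    equality-based fallback.
    """
    # Fast path: identity comparison only (what `is` does).
    for i, name in enumerate(names):
        if name is key:
            return i
    # Slow path: reached only when no identical object was found,
    # so callers passing interned names never pay for it.
    for i, name in enumerate(names):
        if name == key:
            return i
    return -1


names = ("alpha", "beta")
# Built at run time, so on CPython this is equal to "beta" but is a
# different object; the fallback pass still finds it.
runtime_key = "".join(["be", "ta"])
print(find_keyword(names, runtime_key))
print(find_keyword(names, "gamma"))
```

The slow path only runs after the identity pass has failed for that keyword, which is why the extra cost stays limited to the uncommon case.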
Problem description
In CPython, strings can be interned (and this happens automatically based on functions' keyword names, it appears), but that doesn't mean that you can't get an equivalent but distinct string:
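A minimal Python-only sketch of this behavior, assuming CPython; the function `f` and the keyword name `verbose` are made up for illustration and are not taken from the original report.

```python
import pickle
import sys


def f(*, verbose=False):
    return verbose


# A literal that looks like an identifier; CPython interns these.
interned = "verbose"

# Built at run time: equal contents, but a separate object on CPython.
distinct = "".join(["ver", "bose"])

# Round-tripping through pickle also produces a new string object.
roundtrip = pickle.loads(pickle.dumps(interned))

print(interned == distinct, interned is distinct)
print(interned == roundtrip, interned is roundtrip)

# Interning maps equal strings back to one canonical object.
print(sys.intern(distinct) is sys.intern(interned))

# Pure-Python functions accept the non-interned key without trouble.
print(f(**{distinct: True}))
```

A pure-Python callee is unaffected because the interpreter falls back to comparing contents; the difference only matters to code that compares keyword names by address alone.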
The overload dispatch loop relies on the address (Python's `id`, what the `is` operator uses) being equivalent for the keyword argument used in the binding's `.def` call and that actually passed in as the keyword argument.

In the example code, you can see the issue that this causes. By changing the nb_func line I highlighted above to `PyUnicode_Compare` even on CPython, the example code stops reproducing. Therefore, a "simple" fix would be to use `PyUnicode_Compare` unconditionally.

While this particular example may seem contrived, it actually comes up in our application when the dictionary is received via unpickling (you can also get a different string object from `pickle.loads(pickle.dumps(<some string>))`).

Reproducible example code