Open jorisvandenbossche opened 10 months ago
Some more details on the Python interpreter internals that explain why this happens: when calling the function with `*args` or `**kwargs`, i.e. with a variable number of arguments, the opcode `CALL_FUNCTION_EX` is used.
When calling the function in the standard way with explicitly specified keywords, the opcode `CALL_FUNCTION_KW` was used prior to Python 3.11. In Python 3.11 this was optimized, and that opcode was replaced by the `KW_NAMES` + `CALL` opcodes. Apparently this optimized implementation results in fewer references to the object on which the method is called.
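The difference between the two call styles can be seen directly in the bytecode. The following is a minimal sketch using only the standard-library `dis` module; the helper name `call_opnames` and the expressions `obj.method(...)` are made up for illustration, and the exact opcodes printed for the explicit-keyword call depend on the Python version.

```python
import dis


def call_opnames(expr):
    """Return the call-related opcode names that `expr` compiles to."""
    # The names in `expr` are never looked up: we only compile, not evaluate.
    code = compile(expr, "<sketch>", "eval")
    return [
        ins.opname
        for ins in dis.get_instructions(code)
        if ins.opname.startswith(("CALL", "KW_NAMES", "PRECALL"))
    ]


# Explicit keyword: CALL_FUNCTION_KW before 3.11, KW_NAMES + CALL on 3.11.
explicit = call_opnames("obj.method(value=0)")
# Unpacked keywords: goes through CALL_FUNCTION_EX.
unpacked = call_opnames("obj.method(**kwargs)")

print(explicit)
print(unpacked)
```

Only the unpacked form should contain `CALL_FUNCTION_EX`, which is the execution path that ends up with a different reference count.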
We actually noticed this different refcount for the standard case, and therefore use a different threshold for Python 3.11+.
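For illustration, a version-dependent threshold of this kind could look roughly as follows. This is a sketch under assumptions: the names `PY311`, `REF_COUNT` and `looks_chained`, as well as the concrete values 2 and 3, are chosen for the example and should not be read as the exact pandas implementation.

```python
import sys

# Assumed values: one reference fewer is expected on Python 3.11+ for a
# standard (non-unpacked) method call on a temporary object.
PY311 = sys.version_info >= (3, 11)
REF_COUNT = 2 if PY311 else 3


def looks_chained(refcount):
    """Guess whether a method is being called on a temporary object.

    `refcount` is meant to be `sys.getrefcount(self)` as measured inside
    the inplace method. A low count suggests that no variable holds the
    object, i.e. the call is part of a chain.
    """
    return refcount <= REF_COUNT
```

With a single threshold per Python version, a call style that changes the reference count by one moves the measured value across the threshold.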
But that means that for the args/kwargs use case, we incorrectly don't trigger a warning.
If we want, we could probably inspect the stack to figure out whether the method was called with args/kwargs or not, if we are OK with paying for this in code complexity and runtime overhead. Short-term we probably won't include this, but if it turns out to be a more common case, we can always try that in the future. I experimented a little bit with this, so I am posting the code here for posterity.
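As a rough, independent sketch of what such stack inspection might look like (not the experiment mentioned above): look up the caller's frame and check which opcode it is currently executing. The names `calling_opname`, `inplace_method` and `demo` are hypothetical, and `sys._getframe` is a CPython implementation detail, so this is only meant to show the idea.

```python
import dis
import sys


def calling_opname(depth=1):
    """Return the opcode name the frame `depth` levels above our caller
    is currently executing (its pending call instruction)."""
    frame = sys._getframe(depth + 1)
    opname = None
    # Take the last instruction at or before f_lasti, so that an offset
    # pointing into an inline cache entry still resolves to its opcode.
    for ins in dis.get_instructions(frame.f_code):
        if ins.offset > frame.f_lasti:
            break
        opname = ins.opname
    return opname


def inplace_method(*args, **kwargs):
    # True if the caller used *args / **kwargs unpacking for this call.
    return calling_opname() == "CALL_FUNCTION_EX"


def demo():
    args, kwargs = (0,), {"inplace": True}
    explicit = inplace_method(0, inplace=True)
    unpacked = inplace_method(*args, **kwargs)
    return explicit, unpacked
```

A refcount check could then pick a different threshold when the call came through `CALL_FUNCTION_EX`, at the cost of disassembling the caller's code on every call.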
While working on https://github.com/pandas-dev/pandas/pull/56402 (ensuring full test coverage for the warning we raise when you do chained inplace methods), we ran into another corner case where the refcount differs and throws off the refcount-based inference about whether we are being called in a "chained" context and thus have to raise the ChainedAssignment warning. (A previous case we discovered was when the setitem call comes from cython code: https://github.com/pandas-dev/pandas/issues/51315)
Specifically, when passing `*args` or `**kwargs` to the inplace method call, it takes a different execution path in the Python interpreter compared to calling it with manually specified arguments. And starting from Python 3.11, this other execution path gives one reference less.

With CoW, `df` is never updated by such a chained method call, and the warning about this works for all tested Python versions. However, when passing `args` or `kwargs`, the warning doesn't work on Python >= 3.11.

For now we decided to punt on this specific corner case, but I am creating this issue so we keep track of what we learned and have something to refer to when this might come up later.