Keeping 'stack_pointer` in sync with frame->stacktop is possibly too tricky

ericsnowcurrently commented 2 years ago

While considering the situation of gh-94215, I realized there is a broader concern worth discussing, of which that issue is a specific instance. FWIW, this isn't some massive, pervasive, nor high-priority problem that needs immediate attention. It is worth consideration at some point, though, as it's bitten us already and I expect it will bite us again.

Context:

In the eval loop (in ceval.c), we copy some interpreter/thread/frame state into local variables, to reduce overhead a little. Of these variables, several get modified during execution. One of them, stack_pointer, must be kept in sync with the frame->stacktop, which can be modified via _PyFrame_SetStackPointer(), _PyFrame_StackPop(), and _PyFrame_StackPush(). We pass stack_pointer in various calls (e.g. vectorcall) throughout the eval loop, so it is important that it stay in sync, at least at the points where we use it. (I covered this a bit more in the other issue.)

The Concern:

From where I'm at, it's hard to have a lot of confidence that we are preserving the invaraiant (that stack_pointer is properly in sync with the frame state) in all cases. There doesn't seem to be any structural mechanism by which we get guarantees (relative to use of _PyFrame_SetStackPointer(), etc.). Instead it seems like knowing when to synchronize stack_pointer, and when we can avoid unnecessary synchronization, currently requires deep knowledge of a lot of pieces.

Consequently, it seems like it is relatively easy to break the invariant accidentally. The fact that it happened to someone as knowledgeable as @markshannon really says a lot.

Am I out of touch here? If not, is it worth doing something about all that?

CC @iritkatriel @tiran @markshannon @pablogsal

ericsnowcurrently commented 2 years ago

Some possible "solutions":

leave it alone (the performance benefit is enough to offset the very infrequent mistakes?)
add tests that catch the mistakes somehow
add asserts to every use of stack_pointer to check synchronization
add equivalent checks under Py_DEBUG
drop the local variable and use _PyFrame_GetStackPointer() directly instead
add stack_pointer = _PyFrame_SetStackPointer(frame); right before every call/use of stack_pointer (which makes the local irrelevant)
add some mechanism that ties _PyFrame_SetStackPointer(), etc. to the local variable somehow (unrealistic)

encukou commented 2 years ago

For starters, it would be nice to properly define the invariant. There are obviously cases when stack_pointer goes out of sync with frame->stacktop, and currently it's not clear if they are intentional – and if so, when the eventual sync is expected. Same goes for next_instr/frame->prev_instr (and anything else that'll be made local in the future).

markshannon commented 2 years ago

The invariant is well defined. For years, the invariant has been that if the stack-pointer stored in the frame was invalid (used to be NULL, now its -1) then the stack pointer was being handled in the interpreter, _PyEval_EvalFrameDefault.

https://github.com/python/cpython/blob/main/Python/ceval.c#L1706 explains why the in-memory stack offset is made invalid when the stack pointer is held in a register.

There are obviously cases when stack_pointer goes out of sync with frame->stacktop

When are they out of sync?

markshannon commented 2 years ago

I should say that the stack pointer is (along with the instruction pointer) the hottest variable int the entire VM and needs to be in a register. No harm in adding more asserts, though.

ericsnowcurrently commented 2 years ago

When are they out of sync?

See #94215 (and my comment at https://github.com/python/cpython/issues/94215#issuecomment-1178073348).

markshannon commented 2 years ago

Where in the code?

They can only be out of sync when both are valid and reachable. In other words, it can only happen in _PyEval_EvalFrameDefault when frame->stacktop >= 0. That should only happen in a few places dealing with tracing and the like.

brandtbucher commented 2 years ago

Looking at the code, it appears that we don't set frame->stacktop = -1; after TRACE_FUNCTION_ENTRY();. I don't know if this is intentional or not, but it seems incorrect to me.

python / cpython

Keeping 'stack_pointer` in sync with frame->stacktop is possibly too tricky #94666