Closed · colesbury closed this issue 2 months ago
Is throwing a lock on all modifications of the cache a valid solution? Or would that be too slow?
The other option is to "register" a cache in a linked list in thread state. Then have that be cleared on shutdown. I like this better.
A lock would mean that concurrent accesses to ContextVars would not scale well. I'd be more tempted to just disable the cache in the free-threaded build and pay the increased access cost, but still scale well.
Can you explain your linked list idea more?
Change

```c
struct _pycontextvarobject {
    PyObject_HEAD
    PyObject *var_name;
    PyObject *var_default;
    PyObject *var_cached;
    uint64_t var_cached_tsid;
    uint64_t var_cached_tsver;
    Py_hash_t var_hash;
};
```

to

```c
struct _pycontextvarobject {
    PyObject_HEAD
    PyObject *var_name;
    PyObject *var_default;
    _PyCtxVarCacheEntry *var_entry;
    Py_hash_t var_hash;
};
```
where `var_entry` points to an entry in an array (something like a freelist). The array is managed similarly to the freelists in `pycore_freelists.h`.
The array looks like this:
```
| Entry 1          | | Entry 2          | | Free Entry 1 | | Entry 3          | | Free Entry 2 |
| var_cached       | | var_cached       | | next ------> | | var_cached       | | next -> NULL |
| var_cached_tsid  | | var_cached_tsid  | |              | | var_cached_tsid  | |              |
| var_cached_tsver | | var_cached_tsver | |              | | var_cached_tsver | |              |
```
Under this scheme, most reads (cache hits) should scale well; only cache misses and writes (sets) would scale badly.
I'm not sure how much of the HAMT code is thread-safe, so I have no idea whether we need additional locks there too. A cursory read suggests we should at least lock `_PyHamt_Assoc` in `contextvar_set`, if I understand the code correctly. While I don't think it's easy to cause races in the HAMT implementation from Python, it seems easier with the C API introduced in PEP 567.
This seems like a complicated scheme and would still not scale well for typical usage patterns. The assumption that most accesses are cache hits is probably not true for a multi-threaded program. It's a plausible assumption with the GIL because you have coarse grained accesses: one thread runs at a time and may do lots of context var accesses within that time slice. Without the GIL, that's probably not true: a context var is likely to have many threads access it in a highly interleaved (or concurrent) pattern.
Contexts are per-thread. The lookup is on the current thread's context, and a context can only be active on a single thread at a time. HAMTs are persistent, in other words the contents are immutable, and mutations create a new data structure (with efficient sharing).
> HAMTs are persistent, in other words the contents are immutable, and mutations create a new data structure (with efficient sharing)
Oops, yeah, I forgot/didn't know that HAMTs are copy-on-write. Thanks for pointing that out. In that case I'll send a PR to disable the cache on free-threaded builds.
I think this was addressed by @Fidget-Spinner in https://github.com/python/cpython/pull/121740
Bug report
The ContextVar implementation is not thread-safe in the free-threaded build:
In particular, the caching is not thread-safe, although there may be other thread-safety issues as well:
https://github.com/python/cpython/blob/9c08f40a613d9aee78de4ce4ec3e125d1496d148/Python/context.c#L206-L212
Example test cases:

```
./python -m test test_context -m test_context_threads_1
./python -m test test_decimal -m test_threading
```