larryhastings / gilectomy

Gilectomy branch of CPython. Use "gilectomy" branch in git. Read the important, short README below!
Other
527 stars 43 forks source link

Ensure locks are always safely initialized #20

Open larryhastings opened 8 years ago

larryhastings commented 8 years ago

The OS X support for "primitivelock_t" (I'm renaming to "cheaplock_t") has this comment:

// An initialized flag is need because some futexes
// are 0-initialized
// e.g. list instances created via PyType_GenericNew

And in "futex_lock_primitive()" (I'm renaming to "cheaplock_lock()") there is this code:

if (!f->initialized) {
    futex_init_primitive(lock);
}

Q: Is it safe to initialize a mutex from inside the lock function? A: Uh, no.

What do we need to do so that list instances created via PyType_GenericNew (&c) are correctly, safely initialized?

larryhastings commented 8 years ago

@jpe @arp11 @Yhg1s

ericsnowcurrently commented 8 years ago

Considering that these fine-grained locks are becoming a rather ubiquitous feature, could we move initialization of the lock into PyType_GenericNew()?

Alternatively, types that currently have PyType_GenericNew in their tp_new slot could instead use a new _new func that wraps PyType_GenericNew() and initialized the lock. However, that does not help in the case that someone calls PyType_GenericNew() directly.

Presumably we're initializing the locks when we initialize (init) objects, which exposes the potential trouble. How likely is it for folks to create the object (new) but not initialize it? Regardless, I can certainly imagine it happening by accident, particularly when subclassing...

ericsnowcurrently commented 8 years ago

FTR, the problematic lazy lock init is in Include/lock_pthreads.h.

ericsnowcurrently commented 8 years ago

The following have tp_new set to PyType_GenericNew in at least one type:

Modules/_bz2module.c (2) Modules/_datetimemodule.c Modules/_decimal/_decimal.c Modules/_io/bufferedio.c (4) Modules/_io/iobase.c Modules/_io/textio.c (2) Modules/_lsprof.c Modules/_lzmamodule.c (2) Modules/_pickle.c (2) Modules/_testcapimodule.c (2) Modules/zipimport.c Objects/bytearrayobject.c Objects/descrobject.c Objects/funcobject.c (2) Objects/listobject.c Objects/moduleobject.c Objects/typeobject.c Python/Python-ast.c

It is also used ad-hoc in the following:

Doc/includes/noddy.c: noddy_NoddyType.tp_new = PyType_GenericNew; Modules/_ctypes/_ctypes.c: DictRemover_Type.tp_new = PyType_GenericNew; Modules/_json.c: PyScannerType.tp_new = PyType_GenericNew; Modules/_json.c: PyEncoderType.tp_new = PyType_GenericNew; Modules/selectmodule.c: kqueue_event_Type.tp_new = PyType_GenericNew; Modules/socketmodule.c: PyType_GenericNew(&sock_type, NULL, NULL); Modules/_sqlite/cache.c: pysqlite_NodeType.tp_new = PyType_GenericNew; Modules/_sqlite/cache.c: pysqlite_CacheType.tp_new = PyType_GenericNew; Modules/_sqlite/connection.c: pysqlite_ConnectionType.tp_new = PyType_GenericNew; Modules/_sqlite/cursor.c: pysqlite_CursorType.tp_new = PyType_GenericNew; Modules/_sqlite/prepare_protocol.c: pysqlite_PrepareProtocolType.tp_new = PyType_GenericNew; Modules/_sqlite/statement.c: pysqlite_StatementType.tp_new = PyType_GenericNew; Modules/xxlimited.c: Null_Type_slots[1].pfunc = PyType_GenericNew; Modules/xxmodule.c: Null_Type.tp_new = PyType_GenericNew;

jpe commented 8 years ago

Yes, the code in the pthread version of lock is wrong. The specific case that I ran into was that the list lock was not initialized because it relies on TyType_GenericNew 0'ing everything out. A minimal fix for this would be to move the if (!initialized) { init(); } lock initialization code into tp_init of the list. It might be worth doing it generally as Eric suggests because I suspect this may be an issue with other object types as well. I'll try to make a minimal fix later if no one beats me to it.

larryhastings commented 8 years ago

Eric: I'm trying to keep the lock mechanism 100% private. The only external API is ob_type->tp_lock/tp_unlock, and I have my fingers crossed it'll stay that way. That leaves us open to do crazy things inside the object, e.g. have some dicts support multiple-reader locking.

As I think we discussed in person, I'm hoping we can just add lock initialization support to tp_new. I'm not sure if we have to do it for literally every type with an internal lock, but even if that's what it takes, right now that's what sounds good to me. Obviously locks need to be initialized before they're used, and if some people sidestep tp_init and just call tp_new, then yes we need to always do it in tp_new.