python / cpython

The Python programming language
https://www.python.org
Other
62.16k stars 29.88k forks source link

PEP 205: weak references implementation #33699

Closed freddrake closed 23 years ago

freddrake commented 23 years ago
BPO 403203
Nosy @gvanrossum, @tim-one, @loewis, @freddrake, @nascheme
Files
  • None: None
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = 'https://github.com/freddrake' closed_at = created_at = labels = ['interpreter-core'] title = 'PEP 205: weak references implementation' updated_at = user = 'https://github.com/freddrake' ``` bugs.python.org fields: ```python activity = actor = 'fdrake' assignee = 'fdrake' closed = True closed_date = None closer = None components = ['Interpreter Core'] creation = creator = 'fdrake' dependencies = [] files = ['3013'] hgrepos = [] issue_num = 403203 keywords = ['patch'] message_count = 19.0 messages = ['35275', '35276', '35277', '35278', '35279', '35280', '35281', '35282', '35283', '35284', '35285', '35286', '35287', '35288', '35289', '35290', '35291', '35292', '35293'] nosy_count = 6.0 nosy_names = ['gvanrossum', 'tim.peters', 'nobody', 'loewis', 'fdrake', 'nascheme'] pr_nums = [] priority = 'normal' resolution = None stage = None status = 'closed' superseder = None type = None url = 'https://bugs.python.org/issue403203' versions = [] ```

    freddrake commented 23 years ago

    Patch to implement PEP-205.

    This patch contains an implementation of the basic machinery and makes instance weakly referencable (as a useful example), and documentation.

    The test suite is still being written.

    3772858d-27d8-44b0-a664-d68674859f36 commented 23 years ago

    Neil,

    you may want to have a look at mxProxy. This is a proxy implementation which already provides weak references (among other things) and has the advantages you talked about in your reply.

    -- Marc-Andre

    freddrake commented 23 years ago

    Assigned to Martin, since he's indicated he intends to go over this very carefully!

    freddrake commented 23 years ago

    Updated patch to also support proxy semantics using the same invalidation machinery as the basic weak reference type. The additional code is *heavily* borrowed from Neil Schemenauer.

    freddrake commented 23 years ago

    I'm not entirely sure why one would want weak keys rather than weak values, but I suspect it has to do with expectations of how the dict will be used in applications. I don't think it would be hard to support both flavors of dictionaries using the primitive reference objects.

    The constructor does not check that all incoming values are weak because that isn't required. It uses self.update() rather than self.data.update() to perform the inclusion, so any mapping that supports .keys() and .__getitem__() is sufficient. Some additional checking could be done to make sure that all added entries can be created before inserting any of them, offering more atomicity.

    .get() will be fixed in the next version of the patch; thanks!

    I've added GC support, but don't have it working quite right. I'm trying to isolate the failure now.

    freddrake commented 23 years ago

    Added a new (but broken!) version of the patch that fixes WeakDictionary.get() but doesn't interact properly with GC (appearantly) -- not usable as is:

    http://starship.python.net/crew/fdrake/patches/weakref.patch-3

    Neil made the mistake of volunteering to look at the current problem, so assigning to him. (Thanks!)

    freddrake commented 23 years ago

    New version of the patch that's still broken but doesn't conflict with the rich comparisons changes:

    http://starship.python.net/crew/fdrake/patches/weakref.patch-4

    freddrake commented 23 years ago

    Ok, I tihnk this is it!

    http://starship.python.net/crew/fdrake/patches/weakref.patch-5

    This patch fixes the compilation problems from the previoous patch related to the introduction of rich comparisons, and makes a change in the way new objects are initialized by PyObject_New() and PyObject_NEW() (and all their friends!) -- those functions now know to initialize the weak reference list to NULL for objects which support weak references.

    Without this change, more places in the code (especially extension modules) need to know more about initialization of weakly-referencable objects. For instance, the "cPickle" and "new" modules both can create instances without calling the code in classobject.c (using PyObject_New()), causing core dumps when such instances are deallocated if the weak-reference list for instances was not properly initialized. Since these modules (properly) tried to avoid call PyInstance_New(), core dumps ensued and the world fell apart. By moving the initialization of the weak reference list into the core object allocators, this problem is averted and extension code to construct new weakly referencable types is simplified.

    The documentation still needs work, and there are a few new entries for the Python/C API manual as well.

    A new constructor for instances, PyInstance_NewRaw(), will be added in a separate patch.

    freddrake commented 23 years ago

    Removed the .clear() method from reference objects.

    Made proxy objects unhashable -- the semantics of hashing something that can mutate in fundamental ways is pretty painful; just don't use these as dictionary keys!

    Added strong warnings to the documentation that proxy semantics may change a bit before the final release that the weak dictionaries are unclear -- additional feedback is requested. These are not thread-safe.

    freddrake commented 23 years ago

    Proxy objects still need support for rich comparisons.

    (PEP-205 has also been updated.)

    freddrake commented 23 years ago

    Checked in at Tim's urging. If this isn't quite right, file a bug report.

    gvanrossum commented 23 years ago

    I think callbacks are needed to implement various forms of weak dicts: when the object is deleted, whatever's in the weak dict for it should be deleted as well.

    Even though we may not provide weak dicts in 2.1a1, we should still support writing them using what we *do* provide.

    IMHO (I still haven't had the time to read the PEP or Fred's code. :-( )

    gvanrossum commented 23 years ago

    Grabbing this for code review (finally!)

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 23 years ago

    The weak dictionary data type needs more clarification in documentation and implementation. In java.util.WeakHashMap, they keys are weak, not the values (i.e. once the key is forgotten, the entry is cleared). In http://www.handshake.de/~dieter/weakdict.html, the values are weak, but it raises KeyErrors for entries that are collected.

    The initializer of a weak dict does not check the invariant (all values are weak).

    get() of a weakdict does not support an optional second parameter.

    weakref and friends need to participate in garbage collection, as the reference to the callback can create a cycle.

    61337411-43fc-4a9c-b8d5-4060aede66d0 commented 23 years ago

    The patch looks ok in general, pending resolution of the issues on the PEP itself. Some nits:

    Integration with GC is still fuzzy. If an instance is GC collected and had a weak ref to it and the weak ref had a callback and the callback choses to resurrect the object, semantics is unclear. This is equivalent to the object having an __del__; the GC would normally *not* collect such objects, but put them into gc.garbage. Proposed resolution: it is a fatal error to resurrect an object in a weakref callback.

    Hashing: it appears that the hash of a weakref can change when the underlying object goes away. If the weakref is used as key in a dictionary, it appears to be impossible to get the key out of the dictionary after the underlying object is gone. Proposed resolution: a proxy weak ref should cache the hash of the underlying object after its hash function was called for the first time.

    Comparing: why is the return value of tp_compare -1 when try_3way_compare expects -2 on error? Shouldn't there be some exception set when the underlying object goes away?

    tim-one commented 23 years ago

    Just mentioning a general scenario in which weak keys are the bee's knees: sometimes you want to add attributes to objects dynamically, in ways the original class author(s -- may cut across many classes! even non-instance objects; or objects from extension modules that don't support setattr) never anticipated. Like maybe a German translation of the object's docstrings; or a list of JPEG contents referenced by a URLish object; or a table of usage statistics on methods and functions; etc. Then it's natural to use the conceptual attr name as the name of a dict instead, and write

    attr[object] = whatever

    instead of

        object.attr = whatever

    There's no point keeping the "whatever" around when the object goes away, but a regular dict makes the object and its "whatever" immortal. A dict with weak keys is exactly on-target.

    nascheme commented 23 years ago

    Hmm, I think this can be done more cleanly with a proxy now the the coercion stuff has been cleaned up. I will probably give it a go tonight. The advantage of using a proxy is that you can have a weak reference to any object and you only pay memory for weak references if you use them.

    nascheme commented 23 years ago

    The magic test is:

    ./python Lib/test/regrtest.py test_weakref test_new

    Try adding this patch:

    --- Modules/newmodule.c 2000/11/13 20:29:20     2.29
    +++ Modules/newmodule.c 2001/01/17 19:55:46
    @@ -24,6 +24,7 @@
            Py_INCREF(dict);
            inst->in_class = (PyClassObject *)klass;
            inst->in_dict = dict;
    +      inst->in_weakreflist = NULL;
            PyObject_GC_Init(inst);
            return (PyObject *)inst;
     }
    nascheme commented 23 years ago

    Why do weak references need callbacks? I see nothing in the PEP to justify this feature. If there is no good reason for them, I propose that the feature be removed for 2.1. Allowing callbacks may make it harder to implement weak references more efficiently later.