python / cpython

The Python programming language
https://www.python.org
Other
63.46k stars 30.39k forks source link

Check that new heap types cannot be created uninitialised: add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag #88082

Closed pablogsal closed 2 years ago

pablogsal commented 3 years ago
BPO 43916
Nosy @vstinner, @tiran, @serhiy-storchaka, @pablogsal, @miss-islington, @erlend-aasland, @shreyanavigyan
PRs
  • python/cpython#25653
  • python/cpython#25721
  • python/cpython#25722
  • python/cpython#25733
  • python/cpython#25745
  • python/cpython#25748
  • python/cpython#25749
  • python/cpython#25750
  • python/cpython#25751
  • python/cpython#25753
  • python/cpython#25768
  • python/cpython#25791
  • python/cpython#25854
  • python/cpython#26412
  • python/cpython#26888
  • Files
  • patch.diff
  • disablednew.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['expert-C-API', 'type-bug', '3.9'] title = 'Check that new heap types cannot be created uninitialised: add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag' updated_at = user = 'https://github.com/pablogsal' ``` bugs.python.org fields: ```python activity = actor = 'vstinner' assignee = 'none' closed = False closed_date = None closer = None components = ['C API'] creation = creator = 'pablogsal' dependencies = [] files = ['49989', '49992'] hgrepos = [] issue_num = 43916 keywords = ['patch'] message_count = 72.0 messages = ['391634', '391635', '391903', '391905', '391909', '391910', '391911', '391912', '391913', '391917', '391924', '391925', '391926', '391928', '391929', '391931', '391934', '391936', '391937', '391984', '391992', '391999', '392036', '392040', '392042', '392044', '392046', '392048', '392057', '392169', '392170', '392297', '392414', '392426', '392427', '392433', '392434', '392435', '392437', '392438', '392451', '392456', '392462', '392464', '392465', '392472', '392473', '392526', '392528', '392534', '392537', '392545', '392546', '392553', '392554', '392556', '392558', '392570', '392621', '392632', '392635', '392811', '394571', '394582', '394603', '396200', '396476', '411785', '411786', '411865', '411867', '411874'] nosy_count = 7.0 nosy_names = ['vstinner', 'christian.heimes', 'serhiy.storchaka', 'pablogsal', 'miss-islington', 'erlendaasland', 'shreyanavigyan'] pr_nums = ['25653', '25721', '25722', '25733', '25745', '25748', '25749', '25750', '25751', '25753', '25768', '25791', '25854', '26412', '26888'] priority = None resolution = None stage = 'patch review' status = 'open' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue43916' versions = ['Python 3.9'] ```

    pablogsal commented 3 years ago

    I'm creating this issue and marking it as a deferred blocker to make sure we don't forget to check this (adding tests) before the release.

    Check this message from Serhiy regarding heap typed concerns:

    https://bugs.python.org/msg391598

    pablogsal commented 3 years ago

    Serhiy's comment for reference:

    Static type with tp_new = NULL does not have public constructor, but heap type inherits constructor from base class. As result, it allows to create instances without proper initialization, that can lead to crash. It was fixed for few standard heap types in bpo-23815, then reintroduced, then fixed again in bpo-42694. But it should be checked for every type without constructor.

    serhiy-storchaka commented 3 years ago

    From Objects/typeobject.c:

        /* The condition below could use some explanation.
           It appears that tp_new is not inherited for static types
           whose base class is 'object'; this seems to be a precaution
           so that old extension types don't suddenly become
           callable (object.__new__ wouldn't insure the invariants
           that the extension type's own factory function ensures).
           Heap types, of course, are under our control, so they do
           inherit tp_new; static extension types that specify some
           other built-in type as the default also
           inherit object.__new__. */
        if (base != &PyBaseObject_Type ||
            (type->tp_flags & Py_TPFLAGS_HEAPTYPE)) {
            if (type->tp_new == NULL)
                type->tp_new = base->tp_new;
        }
    serhiy-storchaka commented 3 years ago

    The explaining comment was added in f884b749103bad0724e2e4878cd5fe49a41bd7df.

    The code itself is much older. It was originally added in 6d6c1a35e08b95a83dbe47dbd9e6474daff00354, then weaked in c11e192d416e2970e6a06cf06d4cf788f322c6ea and strengthen in 29687cd2112c540a8a4d31cf3b191cf10db08412.

    All types except static types which are direct descendants of object inherit tp_new from a base class.

    We need to check which static types had tp_new = NULL before they were converted to heap types and set it to NULL manually after type creation.

    erlend-aasland commented 3 years ago

    We need to check which static types had tp_new = NULL before they were converted to heap types

    I did a quick git grep. Results pasted into the list over at https://discuss.python.org/t/list-of-built-in-types-converted-to-heap-types/8403.

    serhiy-storchaka commented 3 years ago

    It is incomplete. For example arrayiterator did have tp_new = NULL (in other words, PyArrayIter_Type.tp_new was not set to non-NULL value).

    In 3.9:

    >>> import array
    >>> type(iter(array.array('I')))()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: cannot create 'arrayiterator' instances

    In 3.10:

    >>> import array
    >>> type(iter(array.array('I')))()
    <array.arrayiterator object at 0x7f02f88db5c0>
    erlend-aasland commented 3 years ago

    It is incomplete.

    Yes, I know. I'm working on completing it manually. Some of the static structs were only partially initialised (most stop at tp_members). The rest of the struct (including tp_new) would then be initialised to zero.

    serhiy-storchaka commented 3 years ago

    Thank you for your work Erlend.

    pablogsal commented 3 years ago

    Serhiy, do you think this should be a release blocker for the beta?

    erlend-aasland commented 3 years ago

    Thank you for your work Erlend.

    Anytime! I've updated the list now. There may still be errors, but I think it's pretty accurate now.

    erlend-aasland commented 3 years ago

    Here's a plaintext version:

    erlend-aasland commented 3 years ago

    Is there any way we can fix this in Objects/typeobject.c, or do we actually have to patch every single type manually?

    pablogsal commented 3 years ago

    Is there any way we can fix this in Objects/typeobject.c, or do we actually have to patch every single type manually?

    That would be probably a bad idea since typeobject.c is supposed to be generic. We would be inverting the responsibility concerns to the base.

    serhiy-storchaka commented 3 years ago

    I afraid that changing the corresponding code in Objects/typeobject.c will break existing user code. For now, it is safer to patch every single type manually. Later we can design a general solution.

    pablogsal commented 3 years ago

    I am marking this as a release blocker, giving that this is probably going to involve a big change.

    erlend-aasland commented 3 years ago

    Is it worth it adding tests for this? I'm thinking a generic version of msg391910 (maybe Lib/test/test_uninitialised_heap_types.py) coupled with a dataset.

    tiran commented 3 years ago

    I have had this workaround in _hashopenssl.c for a while. Can I get rid of the function with "{Py_tp_new, NULL}" or is there a more generic way to accomplish the same goal?

    /* {Py_tp_new, NULL} doesn't block __new__ */
    static PyObject *
    _disabled_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
    {
        PyErr_Format(PyExc_TypeError,
            "cannot create '%.100s' instances", _PyType_Name(type));
        return NULL;
    }
    serhiy-storchaka commented 3 years ago

    test_uninitialised_heap_types.py will need to import unrelited modules (some of which can be platform depending). And more modules will be added as more types be converted. I think it is better to add tests for different modules in corresponding module test files. They are pretty trivial, for example:

        def test_new_tcl_obj(self):
            self.assertRaises(TypeError, _tkinter.Tcl_Obj)
    @requires_curses_func('panel')
    def test_new_curses_panel(self):
        w = curses.newwin(10, 10)
        panel = curses.panel.new_panel(w)
        self.assertRaises(TypeError, type(panel))
    erlend-aasland commented 3 years ago

    You're right, that would become messy. Complementing existing test suites is a better approach.

    erlend-aasland commented 3 years ago

    Christian:

    Can I get rid of the function with "{Py_tp_new, NULL}" [...]

    Unfortunately not. The workaround in 993e88cf08994f7c1e0f9f62fda4ed32634ee2ad does the trick though.

    erlend-aasland commented 3 years ago

    Pablo, I've made a patch for most of these (I've left out Christian's modules). I've only added a couple of tests for now. Do you want all in one PR?

    $ git diff main --stat  
     Lib/test/test_array.py             | 4 ++++
     Lib/test/test_unicodedata.py       | 4 ++++
     Modules/_dbmmodule.c               | 1 +
     Modules/_functoolsmodule.c         | 2 ++
     Modules/_gdbmmodule.c              | 1 +
     Modules/_sre.c                     | 5 +++++
     Modules/_threadmodule.c            | 2 ++
     Modules/_winapi.c                  | 1 +
     Modules/arraymodule.c              | 1 +
     Modules/cjkcodecs/multibytecodec.c | 1 +
     Modules/posixmodule.c              | 2 ++
     Modules/pyexpat.c                  | 1 +
     Modules/unicodedata.c              | 1 +
     Modules/zlibmodule.c               | 2 ++
     14 files changed, 28 insertions(+)

    I don't know why I included the sqlite3 and select types in msg391924; they are not affected by this issue. Somebody should double check that everything's covered.

    pablogsal commented 3 years ago

    Pablo, I've made a patch for most of these (I've left out Christian's modules). I've only added a couple of tests for now. Do you want all in one PR?

    Yes, please. The second most important part here is also adding regression tests so this doesn't happen. These tests should also be added when new types are converted in the future.

    erlend-aasland commented 3 years ago

    These tests should also be added when new types are converted in the future.

    We should have a checklist for static to heap type conversion. I'm pretty sure I've seen something like that in one of Victor's blogs. A new section in the Dev Guide, maybe?

    erlend-aasland commented 3 years ago

    Alternative approach:

    Add _PyType_DisabledNew to Include/cpython, and add {Py_tp_new, _PyType_DisabledNew} slot to affected types. Diff attached.

    See GH discussion:

    tiran commented 3 years ago

    LGTM

    serhiy-storchaka commented 3 years ago

    Alternatively we can make PyType_FromSpec() setting tp_new to NULL if there is explicit {Py_tp_new, NULL} in slots.

    erlend-aasland commented 3 years ago

    Alternatively we can make PyType_FromSpec() setting tp_new to NULL if there is explicit {Py_tp_new, NULL} in slots.

    Yes. It's not as explicit (and self-documenting) as _PyType_DisabledNew, though. But it avoids expanding the C API.

    erlend-aasland commented 3 years ago

    For that to work, you'd have to add a flag in PyType_FromModuleAndSpec, set the flag if (slot->slot == Py_tp_new && slot->pfunc == NULL) in the slots for loop, and then reset tpnew _after PyType_Ready.

    erlend-aasland commented 3 years ago

    The second most important part here is also adding regression tests so this doesn't happen.

    I've started adding tests. I'm not sure if I need to wrap them with @cpython_only.

    Some of the types are hidden deep inside the implementations, and I have a hard time fetching them. For example _functools._lru_list_elem and _thread._localdummy. Is it ok to leave those out?

    serhiy-storchaka commented 3 years ago

    Is it ok to skip tests if it is too hard to write them.

    Tests should be decorated with @cpython_only because other Python implementations can have working constructors for these types. It is an implementation detail that these types are implemented in C and instances are not properly initialized by default.

    erlend-aasland commented 3 years ago

    Thanks, Serhiy.

    erlend-aasland commented 3 years ago

    Serhiy, would you mind reviewing the PR? bpo-43974 will clean up the changes introduced to setup.py.

    vstinner commented 3 years ago

    New changeset 3bb09947ec4837de75532e21dd4bd25db0a1f1b7 by Victor Stinner in branch 'master': bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag (GH-25721) https://github.com/python/cpython/commit/3bb09947ec4837de75532e21dd4bd25db0a1f1b7

    vstinner commented 3 years ago

    New changeset 0cad068ec174bbe33fb80460da56eb413f3b9359 by Victor Stinner in branch 'master': bpo-43916: Remove _disabled_new() function (GH-25745) https://github.com/python/cpython/commit/0cad068ec174bbe33fb80460da56eb413f3b9359

    vstinner commented 3 years ago

    New changeset 4908fae3d57f68694cf006e89fd7761f45003447 by Victor Stinner in branch 'master': bpo-43916: PyStdPrinter_Type uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25749) https://github.com/python/cpython/commit/4908fae3d57f68694cf006e89fd7761f45003447

    vstinner commented 3 years ago

    New changeset 387397f8a4244c983f4568c16a28842e3268fe5d by Erlend Egeberg Aasland in branch 'master': bpo-43916: select.poll uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25750) https://github.com/python/cpython/commit/387397f8a4244c983f4568c16a28842e3268fe5d

    vstinner commented 3 years ago

    New changeset 9746cda705decebc0ba572d95612796afd06dcd4 by Erlend Egeberg Aasland in branch 'master': bpo-43916: Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to selected types (GH-25748) https://github.com/python/cpython/commit/9746cda705decebc0ba572d95612796afd06dcd4

    erlend-aasland commented 3 years ago

    Pablo & Serhiy, can we close this issue now?

    vstinner commented 3 years ago

    Currently, 33 types explicitly set the Py_TPFLAGS_DISALLOW_INSTANTIATION flag:

    ---

    Static types with tp_base=NULL (or tp_base=&PyBaseObject_Type) and tp_new=NULL get Py_TPFLAGS_DISALLOW_INSTANTIATION flag automatically. Example of 70 static types which gets this flag automatically:

    vstinner commented 3 years ago

    I remove the release blocker priority of this issue, since the main issue has been fixed.

    vstinner commented 3 years ago

    According to msg391924, more types should use Py_TPFLAGS_DISALLOW_INSTANTIATION:

    vstinner commented 3 years ago

    I created PR 25753.

    vstinner commented 3 years ago

    It seems like _sqlite3 heap types support regular instantiation and so that no change is needed.

    $ ./python
    Python 3.10.0a7+
    >>> import _sqlite3
    >>> conn=_sqlite3.Connection(":memory:")
    
    >>> type(conn)()
    TypeError: function missing required argument 'database' (pos 1)
    >>> conn2 = type(conn)(":memory:")
    
    >>> list(conn.execute("SELECT 1;"))
    [(1,)]
    >>> list(conn2.execute("SELECT 1;"))
    [(1,)]
    vstinner commented 3 years ago

    New changeset 7dcf0f6db38b7ab6918608542d3a0c6da0a286af by Victor Stinner in branch 'master': bpo-43916: select.devpoll uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25751) https://github.com/python/cpython/commit/7dcf0f6db38b7ab6918608542d3a0c6da0a286af

    erlend-aasland commented 3 years ago

    Yes, the sqlite3 types are fine. I don't know why I included them in my earlier post.

    I cleared the tp_new markings for the sqlite3 types on the Discourse post, since I can edit my posts over there, contrary to here on bpo :)

    vstinner commented 3 years ago

    New changeset 665c7746fca180266b121183c2d5136c547006e0 by Victor Stinner in branch 'master': bpo-43916: _md5.md5 uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25753) https://github.com/python/cpython/commit/665c7746fca180266b121183c2d5136c547006e0

    vstinner commented 3 years ago

    bpo-43916: _md5.md5 uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25753)

    With this change, all tests listed in this issue should be fixed.

    But I didn't check the whole list of https://discuss.python.org/t/list-of-built-in-types-converted-to-heap-types/8403

    vstinner commented 3 years ago

    Status in Python 3.9.

    5 types allow instantiation whereas they must not: BUG!!!

    Example of bug:

    $ python3.9
    Python 3.9.4
    >>> import zlib
    >>> obj=zlib.compressobj()
    >>> obj2=type(obj)()
    >>> obj2.copy()
    Erreur de segmentation (core dumped)

    The remaining types disallow instantiation with different ways.

    Static type with tp_base=NULL and tp_new=NULL:

    Heap type setting tp_new to NULL after type creation:

    Static type setting tp_new to NULL after type creation:

    Define a tp_new or tp_init function which always raise an exception:

    vstinner commented 3 years ago

    Oops, sorry: in Python 3.9, tp_new is set to NULL after these heap types are created. There is no bug for these 3 types.

    vstinner commented 3 years ago

    I checked " Types converted pre Python 3.9" and "Types converted in Python 3.9" of: https://discuss.python.org/t/list-of-built-in-types-converted-to-heap-types/8403

    (I didn't check "Types converted in Python 3.10" yet.)

    These lists are not complete, for example select.kevent was not listed whereas it has a bug.

    _struct.unpack_iterator can be modified to use Py_TPFLAGS_DISALLOW_INSTANTIATION: its tp_new method always raises an exception.

    Bugs!

    Need review:

    Implement tp_new (ok!):

    Other cases (ok!):