Closed pablogsal closed 2 years ago
I'm creating this issue and marking it as a deferred blocker to make sure we don't forget to check this (adding tests) before the release.
Check this message from Serhiy regarding heap type concerns:
Serhiy's comment for reference:
A static type with tp_new = NULL does not have a public constructor, but a heap type inherits the constructor from its base class. As a result, it is possible to create instances without proper initialization, which can lead to a crash. This was fixed for a few standard heap types in bpo-23815, then reintroduced, then fixed again in bpo-42694. But it should be checked for every type without a constructor.
From Objects/typeobject.c:
/* The condition below could use some explanation.
   It appears that tp_new is not inherited for static types
   whose base class is 'object'; this seems to be a precaution
   so that old extension types don't suddenly become
   callable (object.__new__ wouldn't insure the invariants
   that the extension type's own factory function ensures).
   Heap types, of course, are under our control, so they do
   inherit tp_new; static extension types that specify some
   other built-in type as the default also
   inherit object.__new__. */
if (base != &PyBaseObject_Type ||
    (type->tp_flags & Py_TPFLAGS_HEAPTYPE)) {
    if (type->tp_new == NULL)
        type->tp_new = base->tp_new;
}
The explaining comment was added in f884b749103bad0724e2e4878cd5fe49a41bd7df.
The code itself is much older. It was originally added in 6d6c1a35e08b95a83dbe47dbd9e6474daff00354, then weakened in c11e192d416e2970e6a06cf06d4cf788f322c6ea and strengthened in 29687cd2112c540a8a4d31cf3b191cf10db08412.
All types except static types which are direct descendants of object inherit tp_new from a base class.
We need to check which static types had tp_new = NULL before they were converted to heap types and set it to NULL manually after type creation.
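The rule described above can be modelled in a few lines of Python. This is only a sketch of the condition quoted from Objects/typeobject.c, not CPython code; the function name and its boolean parameters are made up for illustration.

```python
def inherits_tp_new(is_heap_type: bool, base_is_object: bool,
                    own_tp_new_set: bool) -> bool:
    """Toy model of the tp_new inheritance condition quoted above.

    Returns True when the type would pick up tp_new from its base class.
    All parameter names are hypothetical.
    """
    if own_tp_new_set:
        # The type defines its own tp_new, so nothing is inherited.
        return False
    # Mirrors: base != &PyBaseObject_Type || (tp_flags & Py_TPFLAGS_HEAPTYPE)
    return (not base_is_object) or is_heap_type


# A static type deriving directly from object keeps tp_new == NULL ...
static_case = inherits_tp_new(is_heap_type=False, base_is_object=True,
                              own_tp_new_set=False)
# ... but the same type converted to a heap type inherits object.__new__.
heap_case = inherits_tp_new(is_heap_type=True, base_is_object=True,
                            own_tp_new_set=False)
```

In this model the static-to-heap conversion is exactly the step that flips the result from False to True, which is why each converted type has to be checked.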
We need to check which static types had tp_new = NULL before they were converted to heap types
I did a quick git grep. Results pasted into the list over at https://discuss.python.org/t/list-of-built-in-types-converted-to-heap-types/8403.
It is incomplete. For example arrayiterator did have tp_new = NULL (in other words, PyArrayIter_Type.tp_new was not set to non-NULL value).
In 3.9:
>>> import array
>>> type(iter(array.array('I')))()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot create 'arrayiterator' instances
In 3.10:
>>> import array
>>> type(iter(array.array('I')))()
<array.arrayiterator object at 0x7f02f88db5c0>
It is incomplete.
Yes, I know. I'm working on completing it manually. Some of the static structs were only partially initialised (most stop at tp_members). The rest of the struct (including tp_new) would then be initialised to zero.
Thank you for your work Erlend.
Serhiy, do you think this should be a release blocker for the beta?
Thank you for your work Erlend.
Anytime! I've updated the list now. There may still be errors, but I think it's pretty accurate now.
Here's a plaintext version:
0b858cdd5d 2021-01-04 Modules/cjkcodecs/multibytecodec.c
c8a87addb1 2021-01-04 Modules/pyexpat.c
75bf107c62 2021-01-02 Modules/arraymodule.c
dd39123970 2020-12-29 Modules/_functoolsmodule.c
6104013838 2020-12-18 Modules/_threadmodule.c
a6109ef68d 2020-11-20 Modules/_sre.c
c8c4200b65 2020-10-26 Modules/unicodedata.c
256e54acdb 2020-10-01 Modules/_sqlite
9031bd4fa4 2020-10-01 Modules/_sqlite
cb6db8b6ae 2020-09-27 Modules/_sqlite
52a2df135c 2020-09-08 Modules/sha256module.c
2aabc3200b 2020-09-06 Modules/md5module.c Modules/sha1module.c Modules/sha512module.c
e087f7cd43 2020-08-13 Modules/_winapi.c
c4862e333a 2020-06-17 Modules/_gdbmmodule.c
bf69a8f99f 2020-06-16 Modules/_dbmmodule.c
33f15a16d4 2019-11-05 Modules/posixmodule.c
df69e75edc 2019-09-25 Modules/_hashopenssl.c
f919054e53 2019-09-14 Modules/selectmodule.c
04f0bbfbed 2019-09-10 Modules/zlibmodule.c
4f384af067 2012-10-14 Modules/_tkinter.c
bc07cb883e 2012-06-14 Modules/_curses_panel.c
Is there any way we can fix this in Objects/typeobject.c, or do we actually have to patch every single type manually?
That would probably be a bad idea since typeobject.c is supposed to be generic. We would be pushing a responsibility that belongs to each type down into the base.
I am afraid that changing the corresponding code in Objects/typeobject.c will break existing user code. For now, it is safer to patch every single type manually. Later we can design a general solution.
I am marking this as a release blocker, given that this is probably going to involve a big change.
Is it worth adding tests for this? I'm thinking of a generic version of msg391910 (maybe Lib/test/test_uninitialised_heap_types.py) coupled with a dataset.
I have had this workaround in _hashopenssl.c for a while. Can I get rid of the function with "{Py_tp_new, NULL}" or is there a more generic way to accomplish the same goal?
/* {Py_tp_new, NULL} doesn't block __new__ */
static PyObject *
_disabled_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
{
    PyErr_Format(PyExc_TypeError,
                 "cannot create '%.100s' instances", _PyType_Name(type));
    return NULL;
}
test_uninitialised_heap_types.py would need to import unrelated modules (some of which can be platform-dependent). And more modules would be added as more types are converted. I think it is better to add tests for different modules in the corresponding module test files. They are pretty trivial, for example:
def test_new_tcl_obj(self):
    self.assertRaises(TypeError, _tkinter.Tcl_Obj)
@requires_curses_func('panel')
def test_new_curses_panel(self):
    w = curses.newwin(10, 10)
    panel = curses.panel.new_panel(w)
    self.assertRaises(TypeError, type(panel))
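Since every per-module test has the same shape, a small shared helper may cut down on repetition. The following is a minimal sketch, not an existing API: assert_not_instantiable and the stand-in class _NoPublicConstructor are hypothetical names, and the stand-in only imitates a C type without a public constructor.

```python
import unittest


def assert_not_instantiable(testcase, tp, *args, **kwargs):
    """Fail the test unless calling tp(*args, **kwargs) raises TypeError.

    Hypothetical helper; the name is not taken from CPython.
    """
    with testcase.assertRaises(TypeError,
                               msg=f"{tp!r} must not be instantiable"):
        tp(*args, **kwargs)


class _NoPublicConstructor:
    """Stand-in for a C type that has no public constructor (assumption)."""

    def __new__(cls, *args, **kwargs):
        raise TypeError(f"cannot create '{cls.__name__}' instances")


class DisallowInstantiationTests(unittest.TestCase):
    def test_stand_in_type(self):
        assert_not_instantiable(self, _NoPublicConstructor)
```

In a real test file the stand-in would be replaced by the type under test, for example type(panel) from the curses test above.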
You're right, that would become messy. Complementing existing test suites is a better approach.
Christian:
Can I get rid of the function with "{Py_tp_new, NULL}" [...]
Unfortunately not. The workaround in 993e88cf08994f7c1e0f9f62fda4ed32634ee2ad does the trick though.
Pablo, I've made a patch for most of these (I've left out Christian's modules). I've only added a couple of tests for now. Do you want all in one PR?
$ git diff main --stat
Lib/test/test_array.py | 4 ++++
Lib/test/test_unicodedata.py | 4 ++++
Modules/_dbmmodule.c | 1 +
Modules/_functoolsmodule.c | 2 ++
Modules/_gdbmmodule.c | 1 +
Modules/_sre.c | 5 +++++
Modules/_threadmodule.c | 2 ++
Modules/_winapi.c | 1 +
Modules/arraymodule.c | 1 +
Modules/cjkcodecs/multibytecodec.c | 1 +
Modules/posixmodule.c | 2 ++
Modules/pyexpat.c | 1 +
Modules/unicodedata.c | 1 +
Modules/zlibmodule.c | 2 ++
14 files changed, 28 insertions(+)
I don't know why I included the sqlite3 and select types in msg391924; they are not affected by this issue. Somebody should double check that everything's covered.
Pablo, I've made a patch for most of these (I've left out Christian's modules). I've only added a couple of tests for now. Do you want all in one PR?
Yes, please. The second most important part here is also adding regression tests so this doesn't happen. These tests should also be added when new types are converted in the future.
These tests should also be added when new types are converted in the future.
We should have a checklist for static to heap type conversion. I'm pretty sure I've seen something like that in one of Victor's blogs. A new section in the Dev Guide, maybe?
Alternative approach:
Add _PyType_DisabledNew to Include/cpython, and add {Py_tp_new, _PyType_DisabledNew} slot to affected types. Diff attached.
See GH discussion:
LGTM
Alternatively we can make PyType_FromSpec() setting tp_new to NULL if there is explicit {Py_tp_new, NULL} in slots.
Yes. It's not as explicit (and self-documenting) as _PyType_DisabledNew, though. But it avoids expanding the C API.
For that to work, you'd have to add a flag in PyType_FromModuleAndSpec, set the flag if (slot->slot == Py_tp_new && slot->pfunc == NULL) in the slots for loop, and then reset tp_new after PyType_Ready.
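To make the ordering problem concrete, here is a toy Python model of that proposal. It is a sketch under assumptions, not the C implementation: resolve_tp_new, the string slot id and the honor_explicit_null switch are all invented for illustration, and the "readying" step is reduced to the tp_new inheritance discussed earlier in this issue.

```python
from typing import Callable, Optional, Sequence, Tuple

Slot = Tuple[str, Optional[Callable]]


def resolve_tp_new(slots: Sequence[Slot],
                   base_tp_new: Optional[Callable],
                   honor_explicit_null: bool) -> Optional[Callable]:
    """Toy model: which tp_new a heap type ends up with (hypothetical)."""
    tp_new = None
    explicit_null = False
    for slot_id, func in slots:
        if slot_id == "Py_tp_new":
            if func is None:
                # Remember that the spec asked for "no constructor".
                explicit_null = True
            tp_new = func
    # Stand-in for the readying step: an unset tp_new is inherited.
    if tp_new is None:
        tp_new = base_tp_new
    # The proposed fix: clear tp_new again, after the inheritance step.
    if honor_explicit_null and explicit_null:
        tp_new = None
    return tp_new


def object_new():
    """Placeholder for the base class constructor."""


spec = [("Py_tp_new", None)]
without_flag = resolve_tp_new(spec, object_new, honor_explicit_null=False)
with_flag = resolve_tp_new(spec, object_new, honor_explicit_null=True)
```

In the model, without_flag is the inherited constructor (the {Py_tp_new, NULL} slot has no effect), while with_flag is None, which is the behaviour the proposal is after.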
The second most important part here is also adding regression tests so this doesn't happen.
I've started adding tests. I'm not sure if I need to wrap them with @cpython_only.
Some of the types are hidden deep inside the implementations, and I have a hard time fetching them. For example _functools._lru_list_elem and _thread._localdummy. Is it ok to leave those out?
It is ok to skip tests if it is too hard to write them.
Tests should be decorated with @cpython_only because other Python implementations can have working constructors for these types. It is an implementation detail that these types are implemented in C and instances are not properly initialized by default.
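For readers unfamiliar with the decorator, its effect can be approximated as below. This is a simplified sketch and not the implementation in test.support; cpython_only_sketch is a hypothetical name.

```python
import sys
import unittest


def cpython_only_sketch(test):
    """Skip `test` unless the interpreter is CPython.

    Simplified stand-in for test.support.cpython_only (assumption: checking
    sys.implementation.name is close enough for illustration).
    """
    return unittest.skipUnless(
        sys.implementation.name == "cpython",
        "implementation detail specific to CPython",
    )(test)


class ExampleTests(unittest.TestCase):
    @cpython_only_sketch
    def test_uninstantiable_type(self):
        # A real test would assert that instantiating the C type fails.
        pass
```

On other implementations the decorated test is reported as skipped rather than failed, which matches the reasoning above.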
Thanks, Serhiy.
Serhiy, would you mind reviewing the PR? bpo-43974 will clean up the changes introduced to setup.py.
New changeset 3bb09947ec4837de75532e21dd4bd25db0a1f1b7 by Victor Stinner in branch 'master': bpo-43916: Add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag (GH-25721) https://github.com/python/cpython/commit/3bb09947ec4837de75532e21dd4bd25db0a1f1b7
New changeset 0cad068ec174bbe33fb80460da56eb413f3b9359 by Victor Stinner in branch 'master': bpo-43916: Remove _disabled_new() function (GH-25745) https://github.com/python/cpython/commit/0cad068ec174bbe33fb80460da56eb413f3b9359
New changeset 4908fae3d57f68694cf006e89fd7761f45003447 by Victor Stinner in branch 'master': bpo-43916: PyStdPrinter_Type uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25749) https://github.com/python/cpython/commit/4908fae3d57f68694cf006e89fd7761f45003447
New changeset 387397f8a4244c983f4568c16a28842e3268fe5d by Erlend Egeberg Aasland in branch 'master': bpo-43916: select.poll uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25750) https://github.com/python/cpython/commit/387397f8a4244c983f4568c16a28842e3268fe5d
New changeset 9746cda705decebc0ba572d95612796afd06dcd4 by Erlend Egeberg Aasland in branch 'master': bpo-43916: Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to selected types (GH-25748) https://github.com/python/cpython/commit/9746cda705decebc0ba572d95612796afd06dcd4
Pablo & Serhiy, can we close this issue now?
Currently, 33 types explicitly set the Py_TPFLAGS_DISALLOW_INSTANTIATION flag:
---
Static types with tp_base=NULL (or tp_base=&PyBaseObject_Type) and tp_new=NULL get the Py_TPFLAGS_DISALLOW_INSTANTIATION flag automatically. Example of 70 static types that get this flag automatically:
I removed the release blocker priority of this issue, since the main issue has been fixed.
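Whether a given type carries the flag can also be checked from Python through the type's __flags__ attribute. The sketch below assumes the flag occupies bit 7 (1UL << 7 in the CPython 3.10+ headers); the constant and function names are hypothetical.

```python
# Assumed value of Py_TPFLAGS_DISALLOW_INSTANTIATION (1UL << 7 in CPython 3.10+).
TPFLAGS_DISALLOW_INSTANTIATION = 1 << 7


def has_disallow_instantiation_flag(tp: type) -> bool:
    """Return True if the assumed DISALLOW_INSTANTIATION bit is set on tp."""
    return bool(tp.__flags__ & TPFLAGS_DISALLOW_INSTANTIATION)


class Plain:
    """An ordinary Python class, which can be instantiated freely."""


plain_has_flag = has_disallow_instantiation_flag(Plain)
```

On a build that includes the changesets listed above, applying the function to a type such as type(iter(array.array('I'))) would be expected to return True, but that depends on the interpreter version.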
According to msg391924, more types should use Py_TPFLAGS_DISALLOW_INSTANTIATION:
_md5.md5
_sha1.sha1
_sha256.sha224
_sha256.sha256
_sha512.sha384
_sha512.sha512
select.devpoll: I created PR 25751
select.kevent: it implements a working tp_new method: "{Py_tp_new, select_kqueue},"
sqlite3.Cache
sqlite3.Connection
sqlite3.Cursor
sqlite3.Node
sqlite3.Row
sqlite3.Statement
I created PR 25753.
It seems that the _sqlite3 heap types support regular instantiation, so no change is needed.
sqlite3.Cache
{Py_tp_new, PyType_GenericNew},
sqlite3.Connection
{Py_tp_new, PyType_GenericNew}, {Py_tp_init, pysqlite_connection_init},
sqlite3.Cursor
{Py_tp_new, PyType_GenericNew}, {Py_tp_init, pysqlite_cursor_init},
sqlite3.Node
{Py_tp_new, PyType_GenericNew},
sqlite3.Row
{Py_tp_new, pysqlite_row_new},
sqlite3.Statement
{Py_tp_new, PyType_GenericNew},
$ ./python
Python 3.10.0a7+
>>> import _sqlite3
>>> conn=_sqlite3.Connection(":memory:")
>>> type(conn)()
TypeError: function missing required argument 'database' (pos 1)
>>> conn2 = type(conn)(":memory:")
>>> list(conn.execute("SELECT 1;"))
[(1,)]
>>> list(conn2.execute("SELECT 1;"))
[(1,)]
New changeset 7dcf0f6db38b7ab6918608542d3a0c6da0a286af by Victor Stinner in branch 'master': bpo-43916: select.devpoll uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25751) https://github.com/python/cpython/commit/7dcf0f6db38b7ab6918608542d3a0c6da0a286af
Yes, the sqlite3 types are fine. I don't know why I included them in my earlier post.
I cleared the tp_new markings for the sqlite3 types on the Discourse post, since I can edit my posts over there, contrary to here on bpo :)
New changeset 665c7746fca180266b121183c2d5136c547006e0 by Victor Stinner in branch 'master': bpo-43916: _md5.md5 uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25753) https://github.com/python/cpython/commit/665c7746fca180266b121183c2d5136c547006e0
bpo-43916: _md5.md5 uses Py_TPFLAGS_DISALLOW_INSTANTIATION (GH-25753)
With this change, all tests listed in this issue should be fixed.
But I didn't check the whole list of https://discuss.python.org/t/list-of-built-in-types-converted-to-heap-types/8403
Status in Python 3.9.
5 types allow instantiation whereas they must not: BUG!!!
Example of bug:
$ python3.9
Python 3.9.4
>>> import zlib
>>> obj=zlib.compressobj()
>>> obj2=type(obj)()
>>> obj2.copy()
Segmentation fault (core dumped)
The remaining types disallow instantiation in different ways.
Static type with tp_base=NULL and tp_new=NULL:
Heap type setting tp_new to NULL after type creation:
Static type setting tp_new to NULL after type creation:
Define a tp_new or tp_init function which always raises an exception:
Oops, sorry: in Python 3.9, tp_new is set to NULL after these heap types are created. There is no bug for these 3 types.
I checked "Types converted pre Python 3.9" and "Types converted in Python 3.9" of: https://discuss.python.org/t/list-of-built-in-types-converted-to-heap-types/8403
(I didn't check "Types converted in Python 3.10" yet.)
These lists are not complete. For example, select.kevent was not listed even though it has a bug.
_struct.unpack_iterator can be modified to use Py_TPFLAGS_DISALLOW_INSTANTIATION: its tp_new method always raises an exception.
Bugs!
Need review:
Implement tp_new (ok!):
Other cases (ok!):
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['expert-C-API', 'type-bug', '3.9']
title = 'Check that new heap types cannot be created uninitialised: add Py_TPFLAGS_DISALLOW_INSTANTIATION type flag'
updated_at =
user = 'https://github.com/pablogsal'
```
bugs.python.org fields:
```python
activity =
actor = 'vstinner'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['C API']
creation =
creator = 'pablogsal'
dependencies = []
files = ['49989', '49992']
hgrepos = []
issue_num = 43916
keywords = ['patch']
message_count = 72.0
messages = ['391634', '391635', '391903', '391905', '391909', '391910', '391911', '391912', '391913', '391917', '391924', '391925', '391926', '391928', '391929', '391931', '391934', '391936', '391937', '391984', '391992', '391999', '392036', '392040', '392042', '392044', '392046', '392048', '392057', '392169', '392170', '392297', '392414', '392426', '392427', '392433', '392434', '392435', '392437', '392438', '392451', '392456', '392462', '392464', '392465', '392472', '392473', '392526', '392528', '392534', '392537', '392545', '392546', '392553', '392554', '392556', '392558', '392570', '392621', '392632', '392635', '392811', '394571', '394582', '394603', '396200', '396476', '411785', '411786', '411865', '411867', '411874']
nosy_count = 7.0
nosy_names = ['vstinner', 'christian.heimes', 'serhiy.storchaka', 'pablogsal', 'miss-islington', 'erlendaasland', 'shreyanavigyan']
pr_nums = ['25653', '25721', '25722', '25733', '25745', '25748', '25749', '25750', '25751', '25753', '25768', '25791', '25854', '26412', '26888']
priority = None
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue43916'
versions = ['Python 3.9']
```