python / cpython

The Python programming language
https://www.python.org
Other
63.17k stars 30.25k forks source link

Segfault from `zoneinfo` with custom DateTime class #125318

Open robsdedude opened 1 week ago

robsdedude commented 1 week ago

Crash report

What happened?

Here's a minimal reproducer

import zoneinfo
from datetime import date

class MyDateTime:
    def __init__(
        self,
        year: int,
        month: int,
        day: int,
        hour: int = 0,
        minute: int = 0,
        second: int = 0,
    ) -> None:
        self.ordinal = date(year, month, day).toordinal()
        self.hour = hour
        self.minute = minute
        self.second = second

    def toordinal(self) -> int:
        return self.ordinal

def whoopsie():
    tz_berlin = zoneinfo.ZoneInfo("Europe/Berlin")
    dt = MyDateTime(1995, 9, 2, 14, 34, 56)
    print(tz_berlin.utcoffset(dt), flush=True)

If I run whoopsie, I get one of the following outcomes:

Current thread 0x00007fd62be47b80 (most recent call first): File "/segfault.py", line 27 in whoopsie File "/segfault.py", line 34 in



 I've tried running the test in a loop and it seems to always spit out the same result within a single run. So I assume some global state, memory layout, or or such is involved. 

### CPython versions tested on:

3.9, 3.13

### Operating systems tested on:

Linux

### Output from running 'python -VV' on the command line:

Python 3.13.0 (main, Oct 10 2024, 13:32:18) [GCC 11.4.0]

<!-- gh-linked-prs -->
### Linked PRs
* gh-125345
* gh-125824
<!-- /gh-linked-prs -->
ZeroIntensity commented 1 week ago

Hmm, I wasn't able to reproduce this on Linux, and Valgrind doesn't report any memory errors either. (That's what I get for prematurely applying version labels, heh.)

Could you run a debugger on your end (with the PYTHONMALLOC=malloc environment variable enabled)?

y5c4l3 commented 1 week ago

Confirmed on Python 3.13.0

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b46b51 in find_ttinfo (state=<optimized out>, self=self@entry=0x7ffff778b920, dt=0x7ffff77ccec0) at /usr/local/src/conda/python-3.13.0/Modules/_zoneinfo.c:2202
warning: 2202   /usr/local/src/conda/python-3.13.0/Modules/_zoneinfo.c: No such file or directory
(gdb) bt
#0  0x00007ffff7b46b51 in find_ttinfo (state=<optimized out>, self=self@entry=0x7ffff778b920, dt=0x7ffff77ccec0) at /usr/local/src/conda/python-3.13.0/Modules/_zoneinfo.c:2202
#1  0x00007ffff7b47009 in zoneinfo_ZoneInfo_utcoffset_impl (dt=<optimized out>, cls=0x555555c10cf0, self=0x7ffff778b920) at /usr/local/src/conda/python-3.13.0/Modules/_zoneinfo.c:554
#2  zoneinfo_ZoneInfo_utcoffset (self=0x7ffff778b920, cls=0x555555c10cf0, args=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.13.0/Modules/clinic/_zoneinfo.c.h:229
#3  0x00005555557741d2 in method_vectorcall_FASTCALL_KEYWORDS_METHOD (func=0x7ffff760a480, args=0x7ffff7b5d0f8, nargsf=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.13.0/Objects/descrobject.c:380
#4  0x000055555570a8fe in _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=0x7ffff760a480, tstate=0x555555aaf640 <_PyRuntime+282976>)
    at /usr/local/src/conda/python-3.13.0/Include/internal/pycore_call.h:168
#5  PyObject_Vectorcall (callable=0x7ffff760a480, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.13.0/Objects/call.c:327
#6  0x00005555555f2967 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at /usr/local/src/conda/python-3.13.0/Python/generated_cases.c.h:813
#7  0x00005555557c3e51 in PyEval_EvalCode (co=co@entry=0x7ffff7764580, globals=globals@entry=0x7ffff772ff80, locals=locals@entry=0x7ffff772ff80) at /usr/local/src/conda/python-3.13.0/Python/ceval.c:596
#8  0x00005555557e8ad0 in run_eval_code_obj (tstate=tstate@entry=0x555555aaf640 <_PyRuntime+282976>, co=co@entry=0x7ffff7764580, globals=globals@entry=0x7ffff772ff80, locals=locals@entry=0x7ffff772ff80)
    at /usr/local/src/conda/python-3.13.0/Python/pythonrun.c:1323
#9  0x00005555557e3b2c in run_mod (mod=mod@entry=0x555555be7148, filename=filename@entry=0x7ffff7746330, globals=globals@entry=0x7ffff772ff80, locals=locals@entry=0x7ffff772ff80, flags=flags@entry=0x7fffffffdb78,
    arena=arena@entry=0x7ffff7b9fbf0, interactive_src=0x0, generate_new_source=0) at /usr/local/src/conda/python-3.13.0/Python/pythonrun.c:1408
#10 0x00005555558028c6 in pyrun_file (fp=fp@entry=0x555555b401e0, filename=filename@entry=0x7ffff7746330, start=start@entry=257, globals=globals@entry=0x7ffff772ff80, locals=locals@entry=0x7ffff772ff80, closeit=closeit@entry=1,
    flags=0x7fffffffdb78) at /usr/local/src/conda/python-3.13.0/Python/pythonrun.c:1241
#11 0x0000555555801358 in _PyRun_SimpleFileObject (fp=fp@entry=0x555555b401e0, filename=filename@entry=0x7ffff7746330, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffdb78) at /usr/local/src/conda/python-3.13.0/Python/pythonrun.c:490
#12 0x00005555558010ec in _PyRun_AnyFileObject (fp=0x555555b401e0, filename=filename@entry=0x7ffff7746330, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffdb78) at /usr/local/src/conda/python-3.13.0/Python/pythonrun.c:77
#13 0x00005555557fabfe in pymain_run_file_obj (skip_source_first_line=0, filename=0x7ffff7746330, program_name=0x7ffff776a010) at /usr/local/src/conda/python-3.13.0/Modules/main.c:409
#14 pymain_run_file (config=0x555555a81d38 <_PyRuntime+96344>) at /usr/local/src/conda/python-3.13.0/Modules/main.c:428
#15 pymain_run_python (exitcode=0x7fffffffdb6c) at /usr/local/src/conda/python-3.13.0/Modules/main.c:696
#16 Py_RunMain () at /usr/local/src/conda/python-3.13.0/Modules/main.c:775
#17 0x00005555557b3d57 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at /usr/local/src/conda/python-3.13.0/Modules/main.c:829
#18 0x00007ffff7cabe08 in ?? () from /usr/lib/libc.so.6
#19 0x00007ffff7cabecc in __libc_start_main () from /usr/lib/libc.so.6
#20 0x00005555557b314e in _start ()
ZeroIntensity commented 1 week ago

Oh wait, I think I tested it on main, not 3.13. Though, 3.12 didn't reproduce it either. It's rare for bugs to be specific to one version, because things typically get backported, but I guess it's possible.

y5c4l3 commented 1 week ago

Oh wait, I think I tested it on main, not 3.13. Though, 3.12 didn't reproduce it either. It's rare for bugs to be specific to one version, because things typically get backported, but I guess it's possible.

Can even confirm the bug up to Python 3.10 due to out-of-bounds read of PyDateTime_DateTime->fold on non-datetime types. That explains why it behaves randomly.

I really doubt whether it's a proper use case of passing custom toordinal types. Despite of being undocumented, it's already here and there.

I could make a draft PR on this.

robsdedude commented 1 week ago

My 2 cents: I very much dislike how many of the temporal type related things (e.g. datetime.timezone insist on only working with objects of datetime types. Why isn't duck typing a thing here as in many other places in Python?

Use case: I'm the maintainer of the neo4j Python driver. Neo4j supports temporal types with nanosecond precision. So until Python ships with this feature (there's an open issue and wip PR somewhere), we ship custom temporal types. Since pytz mostly works with duck typing, but zoneinfo and datetime.timezone don't, we have little choice but to rely on pytz only.

Why not inherit from datetime types? Because the types work a little different and have for instance different signatures for instantiation but can still provide the required information needed for timezone computation (e.g., comparison, arithmetics, toordinal).

I'd much prefer to just get an AttributeError telling me that fold is missing over an inheritance check (or a segfault 😉).

EDIT: looking at the PR I might have misunderstood what you meant by adding a typecheck while mentioning that you couldn't imagine a use-case for calling this with an object not inheriting from datetime.datetime. The PR looks like it would still work with such an object (but ignoring its fold and assuming 0?).

ZeroIntensity commented 1 week ago

Oh wait, I figured out why I couldn't reproduce this before. The given reproducer does not call whoopsie(), so the script compiles and finishes just fine--silly me! Confirmed on all bugfix versions, thank you for the report!

hacscred commented 1 week ago

How is this not a HOWTO to attack every Django app in the world?

y5c4l3 commented 3 days ago

As the contract for tzinfo base class is not very clear and by the docstring in pure Python zoneinfo here, I would assume these methods should only accept datetime types. However, there isn't pretty much typecheck in both C zoneinfo and pure Python one.

https://github.com/python/cpython/blob/0cd21406bf84b3b4927a8117024232774823aee0/Lib/_pydatetime.py#L1290-L1317

They only check the types at fromutc(dt). https://github.com/python/cpython/blob/0cd21406bf84b3b4927a8117024232774823aee0/Modules/_zoneinfo.c#L615-L618 https://github.com/python/cpython/blob/0cd21406bf84b3b4927a8117024232774823aee0/Lib/zoneinfo/_zoneinfo.py#L114-L118

It turns out that the pure Python one could be leveraged safely by duck types:

https://github.com/python/cpython/blob/0cd21406bf84b3b4927a8117024232774823aee0/Lib/zoneinfo/_zoneinfo.py#L158-L177

y5c4l3 commented 3 days ago

@robsdedude The PR is a draft and only means to be some attempt (could be bad in theory and it is). Honestly I didn't find it interesting and correct to set fold = 0 and derive hacky ymd from toordinal(), which is only a part of the datetime "contract".

If possible, we should:

Since the missing typecheck causes out-of-bounds read now, I think the best bet here is to add type checks first.

@ZeroIntensity What would you think? Or shall we cc and discuss with datetime code owners instead?

ZeroIntensity commented 3 days ago

@pganssle As a datetime expert, what do you think?

pganssle commented 3 days ago

I'm not clear on what is causing the segfault here, but I think regardless of the contract of Zoneinfo, this needs to be fixed. It should not be possible to crash the interpreter using pure Python code.

Or am I missing something, and my response is already the assumption? Is the question about whether this should raise an exception vs. doing something else?

ZeroIntensity commented 3 days ago

Is the question about whether this should raise an exception vs. doing something else?

Yeah, from my understanding it's about whether or not to try to ducktype a datetime class or just thrown a exception if it doesn't inherit.

y5c4l3 commented 3 days ago

I'm not clear on what is causing the segfault here

FYI PyDateTime_DATE_GET_FOLD on a non PyDateTime dt https://github.com/python/cpython/blob/5989eb74463c26780632f17f221d6bf4c9372a01/Modules/_zoneinfo.c#L2180-L2199

ZeroIntensity commented 3 days ago

IMO, duck-typing C code is a PITA, and we shouldn't encourage it. But, I don't know what the convention is for datetime--if we're supporting duck-typing already, then it makes sense to do it here as well.

pganssle commented 3 days ago

Ah, sorry, I was on the phone and didn't really follow the context here. Thanks for the extra information.

I think for this particular issue, we need to backport a fix (possibly even to versions receiving security-only-updates), so at least for the backports, we can't do anything even remotely breaking except in cases that were already broken. If enforcing that ZoneInfo accept only datetime subclasses would break use cases that were working before and not causing segfaults, we have to maintain that.

For 3.13+, we can take a more principled stance. I'm kinda ambivalent about it. I am generally in favor of duck typing, but it is hard to detect that something actually matches the protocol — especially since we can't really declare easily that the semantic meaning of a method on a duck-typed "datetime" matches the semantics we are expecting. For example Pendulum is even a subclass of datetime, but they change the semantics of addition and subtraction, so we may be making invalid assumptions about how those work.

That said, at the end of the day, bugs introduced by "false friend" classes are basically bugs for those classes, and that's really more an argument for not creating duck-typed datetime-like classes than it is for not supporting them.

Assuming we do go with a mechanism that supports duck-typing, we will almost certainly want to do something similar to what we're doing with subclass constructors, where we have a "fast path" for datetime subclasses, and then a slow path that goes through the Python interface for anything else.

For the backport only, I would say to just try to detect the situation where a segfault is going to happen and raise an exception.

y5c4l3 commented 2 days ago

125824 Here is the backport patch. I don't think the segfault could be precisely predicted. In favor of soundness / completeness, a type check is necessary, but would break existing usecases. The patch I made now allows only 1 step of out-of-bounds read (i.e. out-of-bounds value fold must be either 0 or 1).

I've written a test that makes the out-of-bounds read more reproducible, but address sanitizer catches the UAF in my test: https://github.com/python/cpython/actions/runs/11452302563/job/31862970634

Would you think it's reasonable to skip the test when address sanitizer is enabled?