thierry-martinez / pyml

OCaml bindings for Python
BSD 2-Clause "Simplified" License

Enhanced tracebacks with python 3.11 #84

Closed: jamesjer closed 2 years ago

jamesjer commented 2 years ago

We are building all Fedora packages with the current beta release of Python 3.11, to identify problems before the 3.11 release. The pyml package fails a test:

Starting tests...
Test 'version' ... Python version 3.11.0b4 (main, Jul 22 2022, 00:00:00) [GCC 12.1.1 20220628 (Red Hat 12.1.1-3)]
passed
Test 'library version' ... Python library version 3.11.0b4 (main, Jul 22 2022, 00:00:00) [GCC 12.1.1 20220628 (Red Hat 12.1.1-3)]
passed
Test 'hello world' ... passed
Test 'class' ... passed
Test 'empty tuple' ... passed
Test 'make tuple' ... passed
Test 'module get/set/remove' ... passed
Test 'capsule' ... passed
Test 'capsule-conversion-error' ... passed
Test 'exception' ... passed
Test 'ocaml exception' ... passed
Test 'ocaml exception with traceback' ... Traceback (most recent call last):
  File "<string>", line 6, in <module>
  File "file2.ml", line 2, in func2
  File "file1.ml", line 1, in func1
Exception: Great

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.11/traceback.py", line 353, in _walk_tb_with_full_positions
    positions = _get_code_position(tb.tb_frame.f_code, tb.tb_lasti)
  File "/usr/lib64/python3.11/traceback.py", line 367, in _get_code_position
    return next(itertools.islice(positions_gen, instruction_index // 2, None))
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 10, in <module>
  File "/usr/lib64/python3.11/traceback.py", line 74, in extract_tb
    return StackSummary._extract_from_extended_frame_gen(
  File "/usr/lib64/python3.11/traceback.py", line 416, in _extract_from_extended_frame_gen
    for f, (lineno, end_lineno, colno, end_colno) in frame_gen:
RuntimeError: generator raised StopIteration
raised an exception: File "pyml_tests.ml", line 181, characters 6-12: Assertion failed
Test 'restore with null' ... passed
Test 'ocaml other exception' ... passed
Test 'run file with filename' ... XXX lineno: 1, opcode: 151

The failure may be due to a change in the traceback format: https://docs.python.org/3.11/whatsnew/3.11.html#enhanced-error-locations-in-tracebacks.
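
The last two errors in the log (a StopIteration followed by "RuntimeError: generator raised StopIteration") can be reproduced in isolation. The sketch below is not pyml or CPython code: `pick_position` and `frame_gen` are hypothetical helpers that only mimic the shape of the failing `next(itertools.islice(...))` call, assuming an instruction index that points past the end of the positions table. Since PEP 479, a StopIteration that escapes inside a generator is converted to a RuntimeError, with the original exception attached as its cause.

```python
import itertools


def pick_position(code_positions, instruction_index):
    # Hypothetical helper: same call shape as the failing line in the log.
    # next() raises StopIteration when the index is past the end of the table.
    return next(itertools.islice(iter(code_positions), instruction_index // 2, None))


def frame_gen(code_positions, instruction_indexes):
    # Hypothetical generator: a StopIteration escaping from here is turned
    # into RuntimeError by the interpreter (PEP 479).
    for index in instruction_indexes:
        yield pick_position(code_positions, index)


outcome = "no error"
cause_name = None
try:
    # One position entry, but the second index (8 // 2 == 4) is out of range.
    list(frame_gen([(1, 1, 0, 5)], [0, 8]))
except RuntimeError as err:
    outcome = str(err)
    cause_name = type(err.__cause__).__name__

print(outcome, "caused by", cause_name)
```

This only illustrates why the log shows a RuntimeError rather than a bare StopIteration; it says nothing about why the index is out of range in the pyml test.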

jamesjer commented 2 years ago

Indeed, changing line 190 of pyml_tests.ml from:

        filenames = [f.filename for f in traceback.extract_tb(err.__traceback__)]

to:

        filenames = [f.filename for f in traceback.StackSummary.extract(traceback.walk_tb(err.__traceback__))]

gets the test to pass.
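
For reference, here is a minimal standalone sketch of the two calls, run against an ordinary Python exception rather than the OCaml-raised one from the test. The functions `func1` and `func2` are only borrowed as names from the log, and the two wrapper functions are made up for illustration. With regular Python frames both approaches should return the same filenames; the difference is that `traceback.walk_tb` yields only `(frame, lineno)` pairs and so does not ask the code object for column positions.

```python
import traceback


def func1():
    raise Exception("Great")


def func2():
    func1()


def filenames_via_extract_tb(err):
    # Original form used in the test.
    return [f.filename for f in traceback.extract_tb(err.__traceback__)]


def filenames_via_walk_tb(err):
    # Replacement form: walk_tb yields (frame, lineno) pairs only.
    return [
        f.filename
        for f in traceback.StackSummary.extract(traceback.walk_tb(err.__traceback__))
    ]


try:
    func2()
except Exception as err:
    from_extract_tb = filenames_via_extract_tb(err)
    from_walk_tb = filenames_via_walk_tb(err)

print(from_extract_tb == from_walk_tb, len(from_walk_tb))
```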

jamesjer commented 2 years ago

Some later tests were failing in weird ways. It turns out that the "ocaml other exception" test is to blame. When the OCaml exception is raised in that test, none of the Python cleanup code is called. We are left with _PyRuntime.gilstate.tstate_current->cframe pointing to a frame on the stack. When the next test runs, it overwrites the frame structure with whatever it pushes onto the stack, leading to very weird failures down the road.

thierry-martinez commented 2 years ago

Thank you, @jamesjer, for your report and your analysis, and sorry for the delay! I think I finally fixed this in f682f97: the Python 3.11 interpreter did not like being interrupted by an OCaml exception.

jamesjer commented 2 years ago

Great, I will give it a try. Thank you!

jamesjer commented 2 years ago

That commit works great. I did some comparisons with the Python 3.11 header files, and I wonder if any of these should be addressed as well:

thierry-martinez commented 2 years ago

Thank you very much for carefully reviewing this! These differences should be fixed in 7fd6f0c. I hope to have tools for checking these kinds of things more systematically in the near future.