Open warrickball opened 1 year ago
With Py3.11.3 on windows it "works-for-me" :-)
(dev311) go|c:\srv> python -V
Python 3.11.3
(dev311) go|c:\srv> cd tmp
(dev311) go|c:\srv\tmp> mkdir pydeps191
(dev311) go|c:\srv\tmp> cd pydeps191
(dev311) go|c:\srv\tmp\pydeps191> git clone https://github.com/warrickball/tomso.git
Cloning into 'tomso'...
remote: Enumerating objects: 1839, done.
remote: Counting objects: 100% (271/271), done.
remote: Compressing objects: 100% (139/139), done.
remote: Total 1839 (delta 176), reused 197 (delta 123), pack-reused 1568
Receiving objects: 100% (1839/1839), 29.79 MiB | 11.23 MiB/s, done.
Resolving deltas: 100% (1201/1201), done.
(dev311) go|c:\srv\tmp\pydeps191> cd tomso
(dev311) go|c:\srv\tmp\pydeps191\tomso> pydeps tomso
c:\srv\lib\code\pydeps\pydeps\configs.py:108: UserWarning: Couldn't find a [pydeps] section in your config files 'c:\\srv\\tmp\\pydeps191\\tomso\\setup.cfg' -- or it was empty
warnings.warn(' '.join("""
(dev311) go|c:\srv\tmp\pydeps191\tomso>
The last lines of the traceback show:
File "/home/wball/.local/lib/python3.11/site-packages/pydeps/mf27.py", line 75, in load_module
co = marshal.load(fp) # load marshalled code object.
^^^^^^^^^^^^^^^^
ValueError: bad marshal data (unknown type code)
which is calling Python's standard marshal.load on a .pyc file... could it be that you have a .pyc file generated by a different Python version lying around?
You can likely find the problem file with the (undocumented) --debug-mf=2 option - be aware that it produces a significant amount of output...
(fwiw, the double-headed arrows indicate circular imports...)
Thanks for the quick reply! Here are the last few lines of the debug output before the traceback I posted before:
...
load_module -> Module(name=tty, file='/usr/lib64/python3.11/tty.py', path=None)
load_module(PKG_DIRECTORY) fqname=pydoc_data, fp=None, pathname=/usr/lib64/python3.11/pydoc_data
load_package 'pydoc_data' '/usr/lib64/python3.11/pydoc_data'
load_module(PY_SOURCE) fqname=pydoc_data, fp=fp, pathname=/usr/lib64/python3.11/pydoc_data/__init__.py
load_module -> Module(name=pydoc_data, file='/usr/lib64/python3.11/pydoc_data/__init__.py', path=['/usr/lib64/python3.11/pydoc_data'])
load_package -> Module(name=pydoc_data, file='/usr/lib64/python3.11/pydoc_data/__init__.py', path=['/usr/lib64/python3.11/pydoc_data'])
load_module -> Module(name=pydoc_data, file='/usr/lib64/python3.11/pydoc_data/__init__.py', path=['/usr/lib64/python3.11/pydoc_data'])
load_module(PY_COMPILED) fqname=pydoc_data.topics, fp=fp, pathname=/usr/lib64/python3.11/pydoc_data/topics.pyc
...
Indeed, pydoc_data.topics fails if I try marshal.load, so I'll see if I can follow up on that:
>>> import marshal
>>> f = open('/usr/lib64/python3.11/pydoc_data/topics.pyc', 'rb')
>>> marshal.load(f)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: bad marshal data (unknown type code)
It's installed as a system library, which I might try reinstalling, although it looks like a core Python library (python3-libs), so I'll wait until I don't have other stuff running (e.g. just before I shut down).
(fwiw, the double-headed arrows indicate circular imports...)
Yes, these are within some functions that convert from the objects in one class to the objects in another.
I'm trying to reverse engineer the .pyc format but have so far noticed that if I skip 4 or 8 bytes, marshal.load fails, whereas it succeeds (though perhaps not meaningfully) if I skip 12 bytes:
>>> import marshal
>>> f = open('/usr/lib64/python3.11/pydoc_data/topics.pyc', 'rb')
>>> f.read(12)
>>> marshal.load(f)
>>> f.close()
compared to
>>> import marshal
>>> f = open('/usr/lib64/python3.11/pydoc_data/topics.pyc', 'rb')
>>> f.read(8)
>>> marshal.load(f)
Traceback (most recent call last):
File "<string>", line 1, in <module>
ValueError: bad marshal data (unknown type code)
It should be the Python "magic number":
Python 3.11.3 (tags/v3.11.3:f3909b8, Apr 4 2023, 23:49:59) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import importlib
>>> importlib.util.MAGIC_NUMBER.hex()
'a70d0d0a'
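For anyone wanting to run the same check against a file on disk, here is a minimal sketch (not part of pydeps; `pyc_magic_matches` is a hypothetical helper name) that compares the first four bytes of a .pyc with the running interpreter's magic number. The demo compiles a throwaway module so it does not depend on any system path.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_magic_matches(pyc_path):
    """Return True if the file's first 4 bytes equal this interpreter's magic number."""
    with open(pyc_path, "rb") as fp:
        return fp.read(4) == importlib.util.MAGIC_NUMBER


# Demo: compile a throwaway module and check the resulting .pyc.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "demo.py")
    with open(src, "w") as fp:
        fp.write("x = 1\n")
    # py_compile.compile returns the path of the bytecode file it wrote.
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "demo.pyc"))
    magic_ok = pyc_magic_matches(pyc)
    print(magic_ok)
```

A mismatch here would point at a stale .pyc from another Python version; a match means the problem is further into the header.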
This path, /usr/lib64/python3.11/pydoc_data/topics.pyc, is curious though: these days Python puts the .pyc files in a __pycache__ directory so that different Python versions can live side by side.
(dev311) go|C:\Program Files\Python311\Lib\pydoc_data\__pycache__> ll
total 1392
-rw-rw-rw- 1 bjorn 0 163 2023-05-18 13:46 __init__.cpython-311.opt-1.pyc
-rw-rw-rw- 1 bjorn 0 163 2023-05-18 13:46 __init__.cpython-311.opt-2.pyc
-rw-rw-rw- 1 bjorn 0 163 2023-05-18 13:46 __init__.cpython-311.pyc
-rw-rw-rw- 1 bjorn 0 469298 2023-05-18 13:46 topics.cpython-311.opt-1.pyc
-rw-rw-rw- 1 bjorn 0 469298 2023-05-18 13:46 topics.cpython-311.opt-2.pyc
-rw-rw-rw- 1 bjorn 0 469298 2023-05-18 13:46 topics.cpython-311.pyc
(dev311) go|C:\Program Files\Python311\Lib\pydoc_data\__pycache__>
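As a quick illustration of where the interpreter expects cached bytecode to live, the standard library can compute the path itself. This sketch only calls `importlib.util.cache_from_source`, which does not require the source file to exist; the exact tag in the result depends on the interpreter running it.

```python
import importlib.util
import sys

# Path of a source file (it does not need to exist for this call).
source = "/usr/lib64/python3.11/pydoc_data/topics.py"

# Where the running interpreter would cache the compiled bytecode.
cached = importlib.util.cache_from_source(source)
print(cached)

# The version-specific tag embedded in the file name, e.g. "cpython-311".
print(sys.implementation.cache_tag)
```

A .pyc sitting next to where the source would be, with no tag in its name, is the legacy "sourceless" layout rather than the cache layout.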
This file is in the OS packages and is definitely a bit odd. There is also a __pycache__ subfolder:
$ ls /usr/lib64/python3.11/pydoc_data/__pycache__/
__init__.cpython-311.opt-1.pyc __init__.cpython-311.opt-2.pyc __init__.cpython-311.pyc
but it only contains the .pyc files for __init__.py. topics.pyc is only distributed as a .pyc: there is no source topics.py file:
$ find /usr/lib64/python3.11/ -name topics.py
$
It's provided in the python3-libs OS package, which I tried reinstalling, but to no avail.
I'll raise this as an issue in the Fedora package. At a glance it looks like the only file provided as a .pyc instead of a source file at this level.
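To check more systematically which modules are shipped as bytecode only, something like the following sketch could be pointed at a library directory. `find_sourceless_pycs` is a hypothetical helper, and the demo builds a small temporary tree that mimics the layout described above rather than touching /usr/lib64.

```python
import os
import tempfile


def find_sourceless_pycs(root):
    """Yield .pyc files outside __pycache__ that have no sibling .py file."""
    for dirpath, dirnames, filenames in os.walk(root):
        # __pycache__ holds ordinary cached bytecode, so do not descend into it.
        dirnames[:] = [d for d in dirnames if d != "__pycache__"]
        names = set(filenames)
        for name in sorted(filenames):
            # "topics.pyc"[:-1] == "topics.py"
            if name.endswith(".pyc") and name[:-1] not in names:
                yield os.path.join(dirpath, name)


# Demo on a temporary tree shaped like the pydoc_data directory in this thread.
with tempfile.TemporaryDirectory() as tmp:
    pkg = os.path.join(tmp, "pydoc_data")
    os.makedirs(os.path.join(pkg, "__pycache__"))
    for rel in ("__init__.py", "topics.pyc",
                os.path.join("__pycache__", "__init__.cpython-311.pyc")):
        open(os.path.join(pkg, rel), "wb").close()
    found = [os.path.relpath(p, tmp) for p in find_sourceless_pycs(tmp)]
    print(found)
```

On a real system the call would be something like `list(find_sourceless_pycs("/usr/lib64/python3.11"))`.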
Yes, that sounds like a packaging issue; the topics.py file should definitely be present:
(dev311) go|C:\Program Files\Python311\Lib\pydoc_data> ll
total 764
-rw-rw-rw- 1 bjorn 0 0 2023-04-05 00:04 __init__.py
drw-rw-rw- 2 bjorn 0 4096 2023-05-18 13:46 __pycache__
-rw-rw-rw- 1 bjorn 0 1437 2023-04-05 00:04 _pydoc.css
-rw-rw-rw- 1 bjorn 0 770927 2023-04-05 00:04 topics.py
(dev311) go|C:\Program Files\Python311\Lib\pydoc_data>
topics.py starts with:
# -*- coding: utf-8 -*-
# Autogenerated by Sphinx on Tue Apr 4 23:22:02 2023
...
so maybe a sphinx step was omitted?
I've emailed the maintainers of the Fedora package.
Hello. I am one of Fedora's Python maintainers.
topics and encodings are shipped as .pyc-only on purpose. The files are generated and we decided to only ship bytecode to save disk space. See https://src.fedoraproject.org/rpms/python3.11/c/740668aab7abe02f47d7a69e800c61b8b5e52f51
We have never encountered problems with that. It is a supported way to ship Python modules. What is the code here trying to do?
@hroncok The problem seems to be that the .pyc file is not in the correct format (and also not in the expected __pycache__ folder). The code here is trying to read the .pyc file and look for import opcodes, but it is failing because the magic number in the .pyc file is incorrect for the installed Python version.
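For context on what "look for import opcodes" means, here is a rough sketch using the standard `dis` module. It is not the pydeps implementation (which is based on modulefinder); `imported_names` is a hypothetical helper that just collects the arguments of IMPORT_NAME instructions, including those in nested functions.

```python
import dis


def imported_names(code):
    """Collect module names referenced by IMPORT_NAME, recursing into nested code objects."""
    names = set()
    for instr in dis.get_instructions(code):
        if instr.opname == "IMPORT_NAME":
            names.add(instr.argval)
    # Function and class bodies are stored as code objects in co_consts.
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            names |= imported_names(const)
    return names


source = "import os\nfrom json import dumps\n\ndef f():\n    import marshal\n"
found = imported_names(compile(source, "<demo>", "exec"))
print(sorted(found))
```

The same function works on a code object obtained from a .pyc, which is why the loader has to get past the header correctly first.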
Yes, it's not in the pycache folder, but why would you think the magic number is incorrect? How do I quickly check the number from Python or shell to see if that's the case?
I see the comments above. Will debug the headers.
However note that I just got back from EuroPython and I am taking some time off computers.
@hroncok No worries, I'm on summer vacation myself :-)
>>> import pathlib, struct, marshal
>>> pyc1 = pathlib.Path('/usr/lib64/python3.11/encodings/cp1250.pyc')
>>> pyc2 = pathlib.Path('/usr/lib64/python3.11/encodings/__pycache__/cp1125.cpython-311.pyc')
>>> bytes1 = pyc1.read_bytes()
>>> bytes2 = pyc2.read_bytes()
>>> bytes1[:4]
b'\xa7\r\r\n'
>>> bytes2[:4]
b'\xa7\r\r\n'
>>> struct.unpack("<H2B", bytes1[:4])
(3495, 13, 10)
3495 is Python 3.11b4+, see importlib/_bootstrap_external.py
PEP 552 says:
The pyc header currently consists of 3 32-bit words. We will expand it to 4.
That's 16 bytes for Python 3.7+:
>>> marshal.loads(bytes1[16:])
<code object <module> at 0x7fb1a73f9a70, file "/usr/lib64/python3.11/encodings/cp1250.py", line 1>
>>> marshal.loads(bytes2[16:])
<code object <module> at 0x558a998382f0, file "/usr/lib64/python3.11/encodings/cp1125.py", line 1>
This is consistent with files in __pycache__ and with the specification. It also explains why when skipping 4 or 8 bytes, marshal.load fails, whereas it succeeds if we skip 16 bytes (as we should).
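Putting the pieces above together, a minimal sketch of reading a Python 3.7+ .pyc according to PEP 552 might look like this. `read_pyc` is a hypothetical helper, and the demo builds the bytes in memory (timestamp-style header with zero flags) instead of reading a system file.

```python
import importlib.util
import marshal
import struct


def read_pyc(data):
    """Split the bytes of a Python 3.7+ .pyc into flags, header details and code object."""
    magic = data[:4]
    if magic != importlib.util.MAGIC_NUMBER:
        raise ValueError(f"magic number {magic!r} does not match this interpreter")
    (flags,) = struct.unpack("<I", data[4:8])
    if flags & 0x1:
        # Hash-based pyc: the next 8 bytes are a hash of the source file.
        detail = {"source_hash": data[8:16], "check_source": bool(flags & 0x2)}
    else:
        # Timestamp-based pyc: source mtime and source size, 4 bytes each.
        mtime, size = struct.unpack("<II", data[8:16])
        detail = {"mtime": mtime, "source_size": size}
    # The marshalled code object starts after the 16-byte header.
    return flags, detail, marshal.loads(data[16:])


# Demo: build a timestamp-style pyc in memory and read it back.
code = compile("x = 1\n", "<demo>", "exec")
data = importlib.util.MAGIC_NUMBER + struct.pack("<III", 0, 0, 6) + marshal.dumps(code)
flags, detail, loaded = read_pyc(data)
namespace = {}
exec(loaded, namespace)
print(flags, detail, namespace["x"])
```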
What seems to be the problem here?
I believe this comment is outdated:
The number of bytes at point 2 depends on the Python version.
The number is hardcoded here:
https://github.com/thebjorn/pydeps/blob/3c1c40b7198f58544c1d2fd10084de880176e544/pydeps/mf27.py#L74
If you care only for Pythons that are not yet end of life, changing this number to 12 should do the trick.
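For reference, the total header sizes across versions could be captured in a small helper like the one below. This is only a sketch: `pyc_header_size` is a hypothetical name, and the sizes include the 4-byte magic number, so code that has already consumed the magic would skip 4 bytes less (12 on Python 3.7+).

```python
import sys


def pyc_header_size(version=sys.version_info[:2]):
    """Total .pyc header size in bytes, including the 4-byte magic number."""
    if version >= (3, 7):
        return 16  # magic + flags + (mtime and size, or source hash), per PEP 552
    if version >= (3, 3):
        return 12  # magic + mtime + source size
    return 8       # magic + mtime


print(pyc_header_size())
print(pyc_header_size((3, 6)), pyc_header_size((2, 7)))
```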
Looks like your analysis is correct (I'm still on vacation, so I haven't investigated why there isn't a problem on windows...).
The native modulefinder calls importlib._bootstrap_external._classify_pyc(data, fqname, {}), but that seems to be a very private API.
https://github.com/thebjorn/pydeps/commit/23e0a17f4c1821c33c285e635c6837513c16b1a5 is a (WIP) version that accounts for the different sizes. How to test it isn't immediately obvious to my vacation brain, but I'm sure I'll figure it out soon ;-)
I'm running into this same problem using CentOS Stream 9 with python3.11 (which is some build of Python 3.11.4).
I don't have this issue for personal projects that run in GitHub CI though (on ubuntu-latest + macos-latest + windows-latest on Python 3.8 through 3.11).
Seems to corroborate that this is some RHEL thing.
(edit: I'm not caught up with the details on the thread, seems like there's a "why", just noting a +1 to problem bisection)
Yes, CentOS has the same pyc files.
v1.12.17 should have fixed the .pyc header parsing code.
I just decided to try pydeps on a project of mine but it fails with the error (full error message at the end):
This is pydeps 1.12.12 installed via pip on Fedora 38 with Python 3.11.4 installed via system packages.
I tried a few things suggested in #181 to no avail. All the files are in the tomso folder and there is an __init__.py file.
Full error message
```
/home/wball/.local/lib/python3.11/site-packages/pydeps/configs.py:108: UserWarning: Couldn't find a [pydeps] section in your config files '/home/wball/try/pydeps/tomso/setup.cfg' -- or it was empty
  warnings.warn(' '.join("""
Traceback (most recent call last):
  File "/home/wball/.local/bin/pydeps", line 8, in