cython / cython

The most widely used Python to C compiler
https://cython.org
Apache License 2.0

[BUG] Large pyqt file causes compile to stall [Mac OS arm] #6202

Closed mchaniotakis closed 2 weeks ago

mchaniotakis commented 2 weeks ago

Describe the bug

My PyQt6 app is built from multiple .py files that rely on the usual Python libraries (PyQt6, numpy, scipy, opencv, etc.). All of them compile fine except the MainWindow of the PyQt6 app, which is a large file of almost 10k lines (generated with the PyQt6 ui-to-py tool, pyuic6) containing a single class in a single .py file. The build hangs for at least 2 hours on this file without any further output, even though all of the other files, which might have up to 6k lines, finish relatively quickly.

My setup is:

- MacBook M1 Pro 16 inch
- Homebrew-installed Python 3.10
- Cython 3.0.10 (also tried version 0.29.37)
- PyQt6 6.6.1
- PyQt6-Qt6 6.6.1
- PyQt6-sip 13.6.0
- OS: Monterey 12.5

Code to reproduce the behaviour:

    from setuptools import find_packages, setup
    from Cython.Build import cythonize

    setup(
        name="myApp",
        version="X.X",
        packages=find_packages(),
        script_args=[
            "build_ext",
            "--inplace",
        ],
        ext_modules=cythonize(
            extensions,  # list of .pyx files to cythonize made with Extension()
            build_dir="dest/folder",
            compiler_directives={"language_level": 3},
            annotate=True,  # annotate is a cythonize() option, not a setup() option
        ),
        zip_safe=False,
        verbose=True,
    )

Expected behaviour

For the MainWindow to compile in a reasonable amount of time.

OS

MacOS

Python version

3.10.2

Cython version

3.0.10

Additional context

Latest cmd output:

    INFO: building 'myApp.module.submodule.MainWindowUi' extension
    INFO: clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk -I/opt/homebrew/bin/myApp/include -I/opt/homebrew/opt/python@3.10/Frameworks/Python.framework/Versions/3.10/include/python3.10 -c /Users/...../myApp/module/submodule/MainWindowUi.c -o build/temp.macosx-12-arm64-cpython-310/Users/..../compile_dir/myApp/module/submodule/MainWindowUi.o

scoder commented 2 weeks ago

How large is the generated C file submodule/MainWindowUi.c? What happens if you execute the above clang command line yourself?
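Both questions can be answered with a few lines of Python. The following is only a minimal sketch: the helper names `file_size_mb` and `time_command` are made up for illustration, and the command passed to `time_command` would be the clang invocation printed in the build log.

```python
import os
import subprocess
import time


def file_size_mb(path):
    """Return the size of `path` in megabytes (10**6 bytes)."""
    return os.path.getsize(path) / 1e6


def time_command(cmd):
    """Run `cmd` (a list of arguments) and return (exit_code, seconds)."""
    start = time.perf_counter()
    result = subprocess.run(cmd)
    return result.returncode, time.perf_counter() - start


# Hypothetical usage, with the real arguments copied from the build log:
#   print(file_size_mb("MainWindowUi.c"))
#   print(time_command(["clang", "-O3", "-c", "MainWindowUi.c", "-o", "MainWindowUi.o"]))
```

Running the compiler outside of the build system this way helps to separate time spent in the C compiler from time spent in Cython or setuptools.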

scoder commented 2 weeks ago

BTW, have you validated that the code generated by PyQt actually benefits from compilation here?

mchaniotakis commented 2 weeks ago

Thanks for the quick reply.

1. The MainWindow.c generated file is 10.2MB.
2. If I run the command for the MainWindow.c file myself I get the same behavior (1.5 hours)
3. I have not measured performance for this file, but I chose to cythonize all modules indiscriminately for performance (if any) and code obfuscation, which might be useful in the future for certain scenarios, even though I am developing an open-source app.

I forgot to mention that this is a macOS-specific problem, as it compiles fine on my Windows 11 x64 machine.

scoder commented 2 weeks ago

> The MainWindow.c generated file is 10.2MB.

That's a fairly large file. No wonder the C compiler takes a while to compile it. It probably also needs a lot of memory along the way.

> 2. If I run the command for the MainWindow.c file myself I get the same behavior (1.5 hours)
>
> I forgot to mention that this is a macOS-specific problem, as it compiles fine on my Windows 11 x64 machine.

Using a different CPU type and a different C compiler with (probably) different compile flags. Especially the optimisation level and the enabled features can make a huge difference in compile time. The more module-wide a compiler applies its optimisations, the more it's prone to suffer from a large source file size.
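One way to act on the point about optimisation levels is to lower the level for the one oversized module only. This is a rough sketch, not something proposed in this thread: `extra_args_for` and the size threshold are hypothetical, and it assumes a GCC- or clang-style compiler where the last `-O` flag on the command line takes effect.

```python
import os

# Assumption: .py sources larger than this get a lower optimisation level.
# The value is a made-up starting point, not a measured one.
LARGE_SOURCE_BYTES = 300_000


def extra_args_for(source_path, threshold=LARGE_SOURCE_BYTES):
    """Return extra compiler flags for one source file.

    Large files get "-O1", intended to override an earlier "-O3"
    on compilers where the last -O flag wins (GCC and clang).
    """
    if os.path.getsize(source_path) > threshold:
        return ["-O1"]
    return []


# Hypothetical usage when building the extension list:
#   Extension(name, [path], extra_compile_args=extra_args_for(path))
```

Whether `-O1` (or `-O0`) is enough to bring the compile time down for a particular file would need to be measured.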

> 3. I have not measured performance for this file, but I chose to cythonize all modules indiscriminately for performance (if any) and code obfuscation, which might be useful in the future for certain scenarios, even though I am developing an open-source app.

Code obfuscation has never been an important goal for us, and I doubt that PyQt frontend code is worth compiling to make it visibly faster.

I'll close this ticket as a duplicate of https://github.com/cython/cython/issues/4425 where we've discussed (and improved) the performance and source file size for large input files. Please try the Cython master branch (3.1) which should make at least some difference here.

mchaniotakis commented 2 weeks ago

Thanks for the feedback! I would like to just note here (in case it is ever relevant in the future) that my Windows PC is slower than my Mac, with the same amount of RAM, and it takes around 5 minutes to compile on that machine versus more than 2-3 hours on my Mac (it might even be longer).

scoder commented 2 weeks ago

Are you using a native Aarch64 installation of clang or an emulated x86 one?

da-woods commented 2 weeks ago

That's a fairly similar size (both the .py file and the .c file) to ExprNodes which we compile as part of a full build of Cython itself, and which takes minutes rather than hours.

I haven't seen what the generated code looks like for these PyQt files. I'd guess it might be "one giant function" though, which tends to make C compilers unhappy.

I tend to agree that there's not too much we can realistically do from Cython here.
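The "one giant function" guess above can be checked on the Python side without looking at the generated C. A minimal sketch using the standard `ast` module (the helper name `function_lengths` is made up here, and `end_lineno` requires Python 3.8 or later):

```python
import ast


def function_lengths(source):
    """Return (name, line_count) for every function in `source`, longest first."""
    tree = ast.parse(source)
    lengths = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno is the "def" line, end_lineno the last line of the body.
            lengths.append((node.name, node.end_lineno - node.lineno + 1))
    return sorted(lengths, key=lambda item: item[1], reverse=True)


# Hypothetical usage:
#   with open("MainWindowUi.py") as f:
#       print(function_lengths(f.read())[:5])
```

If a single function accounts for nearly all lines of the file, the C compiler will most likely see one very large function as well.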

mchaniotakis commented 1 week ago

> Are you using a native Aarch64 installation of clang or an emulated x86 one?

file shows:

    Mach-O 64-bit bundle arm64

clang --version shows:

    Apple clang version 14.0.0 (clang-1400.0.29.202)
    Target: arm64-apple-darwin21.6.0
    Thread model: posix
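For anyone repeating this check in a script, the target can be read from the `clang --version` output. A minimal sketch with made-up helper names; note that the target triple only says what clang compiles for by default, so it does not by itself prove that the clang binary is running natively.

```python
import subprocess


def parse_clang_target(version_output):
    """Return the value of the "Target:" line of `clang --version`, or None."""
    for line in version_output.splitlines():
        if line.strip().startswith("Target:"):
            return line.split(":", 1)[1].strip()
    return None


def clang_target(clang="clang"):
    """Run `clang --version` and return the target triple it reports."""
    result = subprocess.run(
        [clang, "--version"], capture_output=True, text=True, check=True
    )
    return parse_clang_target(result.stdout)
```

With the output quoted above, `parse_clang_target` returns `arm64-apple-darwin21.6.0`.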

> That's a fairly similar size (both the .py file and the .c file) to ExprNodes which we compile as part of a full build of Cython itself, and which takes minutes rather than hours.
>
> I haven't seen what the generated code looks like for these PyQt files. I'd guess it might be "one giant function" though, which tends to make C compilers unhappy.
>
> I tend to agree that there's not too much we can realistically do from Cython here.

You are right, the whole file is the class with the setup ui method, so it's probably just going really slowly. I will leave it compiling for a long time and see if it eventually finishes:

    from PyQt6 import QtCore, QtGui, QtWidgets
    class Ui_MainWindow(object):
        def setupUi(self, MainWindow):
            MainWindow.setObjectName("MainWindow") # line 14
            ....
            QtCore.QMetaObject.connectSlotsByName(MainWindow) # line 8254

In any case, I will run the compilation on the M1 machine overnight, rerun it on the Windows machine, and post how long it took on each machine (along with their CPUs) here in case it's needed in the future. For now I will exclude this MainWindow.py file from compilation and use it as a plain .py file, which will do for now. Thanks a lot.
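Leaving one file out of the compiled set, as described above, could look roughly like this. It is a sketch under assumptions: `split_sources` is a made-up helper, and the paths in the usage comment are placeholders rather than the real project layout.

```python
from pathlib import Path


def split_sources(paths, keep_as_python):
    """Split `paths` into (to_compile, left_as_py) by file name."""
    to_compile, left_as_py = [], []
    for path in paths:
        if Path(path).name in keep_as_python:
            left_as_py.append(path)
        else:
            to_compile.append(path)
    return to_compile, left_as_py


# Hypothetical usage: only `to_compile` is turned into Extension() objects
# and passed to cythonize(); `left_as_py` ships as ordinary .py files.
#   to_compile, left_as_py = split_sources(all_py_files, {"MainWindowUi.py"})
```

`cythonize()` also documents an `exclude` argument for the case where glob patterns are passed in; the manual filter is used here because the setup in this issue builds its own list of `Extension()` objects.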

scoder commented 1 week ago

> the whole file is the class with the setup ui method

That means that almost all of this code runs exactly once, at import time. You'll see zero performance effect for this code at runtime. It really makes no sense to compile it for performance reasons. And Python code is orders of magnitude more efficient in terms of code size than the same functionality spelled out in a binary module.

I'd expect that there are also UI callback functions in that module but even there, it's unlikely that you'll have a measurable benefit from compiling them. Those tend to be very simple and small, usually delegating quickly to somewhere else where actual work is being done.