colesbury / nogil

Multithreaded Python without the GIL
Other
2.91k stars 107 forks source link

Installing PyQt6 on macOS #100

Open jimkring opened 1 year ago

jimkring commented 1 year ago

Hi. I'm trying to install PyQt6 on nogil. I have no idea if it should work or not.

It gets stuck at Preparing metadata (pyproject.toml) ...

$ python -m pip install PyQt6
Collecting PyQt6
  Using cached PyQt6-6.4.0.tar.gz (1.0 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... -

I installed qt6 with homebrew and it appears to accessible in the PATH -- I can run qmake just fine.

Thanks for any ideas. Looking forward to exploring nogil!

Update

Using the --verbose flag showed that it was hanging up while prompting for agreement to the license terms.

I followed the notes here and used the following command to automatically accept the license during the pip install:

python -m pip install --config-settings --confirm-license= --verbose pyqt6

(and do note the "=" after "--confirm-license=" is required)]

jimkring commented 1 year ago

PS: @colesbury - thanks for all your work on nogil and I'm very excited to see PEP 703!

colesbury commented 1 year ago

Hi @jimkring, I'll look into build a PyQT6 wheel for macOS.

jimkring commented 1 year ago

I noticed that I get similar issues when I try to install numpy:

$ python -m pip install NumPy
Collecting NumPy
  Using cached numpy-1.24.1.tar.gz (10.9 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: NumPy
  Building wheel for NumPy (pyproject.toml) ... -

Update: I think that numpy was just taking a while. Also, I was able to install/build it from source using this command: pip install git+https://github.com/colesbury/numpy.git@v1.24.0-nogil

jimkring commented 1 year ago

Update: Using the --verbose flag showed that it was hanging up while prompting for agreement to the license terms. I followed the notes here and used the following command to automatically accept the license during the pip install:

python -m pip install --config-settings --confirm-license= --verbose pyqt6

(note the "=" after "--confirm-license=" is required)]

colesbury commented 1 year ago

Hi @jimkring, I think something strange is going on. You should be able to just run "pip install numpy" and get the pre-built binary wheel for NumPy 1.24.0 on macOS. You should not need to install it from source.

Can you provide details about your installation:

jimkring commented 1 year ago

Hi @colesbury.

I'm on an Intel mac.

I installed nogil via pyenv install nogil-3.9.10.

I then created a virtual environment via /Users/jim/.pyenv/versions/nogil-3.9.10/bin/python -m venv .venv-no-gil.

Then, after activating the venv, I think I probably updated pip (out of habit):

python -m pip install --upgrade pip

Here's the output of pip list

(.venv-no-gil) MacBook-Pro:pyside-setup jim$ pip list
Package           Version
----------------- ----------------------
async-generator   1.10
attrs             22.2.0
exceptiongroup    1.1.0
idna              3.4
iniconfig         2.0.0
numpy             0.3.0+30167.g63723dfcb
outcome           1.2.0
packaging         23.0
pip               22.3.1
pluggy            1.0.0
pydantic          1.10.4
PyQt6             6.4.0
PyQt6-sip         13.4.0
pytest            7.2.1
pytest-asyncio    0.20.3
pytest-trio       0.8.0
setuptools        65.5.1
sniffio           1.3.0
sortedcontainers  2.4.0
tomli             2.0.1
trio              0.22.0
typing_extensions 4.4.0
wheel             0.38.4 
jimkring commented 1 year ago

I should also mention:

I've been able to successfull build+install pyqt6. That's good news.

However, I then realized that PyQt6 does not include the Signal and Slot modules that I need (from PySide6). So, I tried building PySide6 from source and I've run into new error, when it tries to build libshiboken.

Here's how I did that (and let me know if you'd like me to open a different issue for the pyside/libshiboken build issue).

Pull a copy of the pyside sources git clone https://code.qt.io/cgit/pyside/pyside-setup.git

Switch to the version/branch we want to build git checkout 6.4.2

Start the build python setup.py install --qtpaths=/usr/local/opt/qt/bin/qtpaths6 --parallel=8 --build-tests

[57/503] Building CXX object libshiboken/CMakeFiles/libshiboken.dir/helper.cpp.o
FAILED: libshiboken/CMakeFiles/libshiboken.dir/helper.cpp.o 
/Library/Developer/CommandLineTools/usr/bin/c++ -DBUILD_LIBSHIBOKEN -DHAVE_NUMPY -DNDEBUG -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -DPy_LIMITED_API=0x03060000 -I/Users/jimkring/JKI/GitHub/FlowRailsIDE/pyside-setup/sources/shiboken6/libshiboken -I/Users/jimkring/JKI/GitHub/FlowRailsIDE/pyside-setup/build/.venv-no-gil/build/shiboken6/libshiboken -I/Users/jimkring/JKI/GitHub/FlowRailsIDE/.venv-no-gil/lib/python3.9/site-packages/numpy/core/include -I/Users/jimkring/.pyenv/versions/nogil-3.9.10/include/python3.9 -Wall -Wextra -Wno-strict-aliasing -fvisibility=hidden -D QT_NO_CAST_FROM_ASCII -D QT_NO_CAST_TO_ASCII -O3 -DNDEBUG -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -mmacosx-version-min=12.6 -fPIC   -fPIC -std=gnu++17 -MD -MT libshiboken/CMakeFiles/libshiboken.dir/helper.cpp.o -MF libshiboken/CMakeFiles/libshiboken.dir/helper.cpp.o.d -o libshiboken/CMakeFiles/libshiboken.dir/helper.cpp.o -c /Users/jimkring/JKI/GitHub/FlowRailsIDE/pyside-setup/sources/shiboken6/libshiboken/helper.cpp
/Users/jimkring/JKI/GitHub/FlowRailsIDE/pyside-setup/sources/shiboken6/libshiboken/helper.cpp:211:30: error: no member named 'ob_refcnt' in '_object'
    str << ", refs=" << obj->ob_refcnt << ", ";
                        ~~~  ^
1 error generated.

We can see this error error: no member named 'ob_refcnt' in '_object' -- it's worth noting that I see a similar error with other tools that try to work with the cpython sources like this error when I try to build an exe using nuitka within python nogil.

Thanks for looking into this. I'm excited to start testing out nogil (and encouraging others to do so).

colesbury commented 1 year ago

I've been able to successfull build+install pyqt6. That's good news.

Great! I got it to build locally too after some challenges with homebrew. I'll see if it's not too much work to build binary wheels once we figure out the other issues.

So, I tried building PySide6 from source and I've run into new error, when it tries to build libshiboken.

This looks like it needs to use Py_REFCNT(obj) instead of obj->ob_refcnt. The reference count field changed (it's now two fields in "nogil Python), so you can't access them directly. The Python docs now say to use Py_REFCNT instead of ->ob_refcnt so maybe we can get this change upstreamed.

It also sounds like there is a similar issue in nuitka. Is nuitka a dependency of pyside6?

let me know if you'd like me to open a different issue for the pyside/libshiboken build issue

Let's wait until we figure out any issues on the nogil side.

jimkring commented 1 year ago

Hi @colesbury some answers and updates...

It also sounds like there is a similar issue in nuitka. Is nuitka a dependency of pyside6?

No, nuitka is not related to pyside6 -- nuitka is a python compiler for building stand-alone executables.

Regarding the pyside/libshiboken build issue, I did a search in the code for use of obj->ob_refcnt and tried replacing with Py_REFCNT(obj). This is a bit tricky since there's a lot of dynamic C++ code generation happening, and many fixes end up looking like the following:

before:

    for (int index : std::as_const(invalidateArgs)) {
        s << "bool invalidateArg" << index << " = PyTuple_GET_ITEM(" << PYTHON_ARGS
            << ", " << index - 1 << ")->ob_refcnt == 1;\n";
    }

after:

    for (int index : std::as_const(invalidateArgs)) {
        s << "bool invalidateArg" << index << " = Py_REFCNT(PyTuple_GET_ITEM(" << PYTHON_ARGS
            << ", " << index - 1 << ")) == 1;\n";
    }

Also, I looked at what it would take to upstream such fixes to PySide6 and the process described on the Qt Contribution Guidelines is a little daunting to me.

I have a pyside6 build in progress and it takes a long time... we'll see how it goes.

jimkring commented 1 year ago

I've gotten farther -- I can now build/install pyside6. This runs to completion:

python setup.py install --qtpaths=/usr/local/opt/qt/bin/qtpaths6 --parallel=8 --build-type=pyside6 --reuse-build --skip-modules=WebEngineCore,WebEngineWidgets --no-qt-tools

However, I get a segfault, even with a simple hello world example :(

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
ctismer commented 1 year ago

@jimkring I will have a deeper look into nogil on PySide.

jimkring commented 1 year ago

Thanks @ctismer!

jimkring commented 1 year ago

@ctismer -- I just noticed you're on the Qt team. I'm sure you can figure this all out much faster than I could ever, and... I figured I'd share with you the changes I made to the pyside6 code to get it to build. Here is a PR with the changes in a fork I made of pyside-setup -> https://github.com/jimkring/pyside-nogil/pull/1 (I'm working in the nogil branch)

ctismer commented 1 year ago

Yes folks, I've been hacking away at PySide for 7 years now and even managed to get a PyPy build working. So I thought it would be a good thing to jump on the NoGil wagon as the next big thing for PySide. PyPy really pushed me to my limit and beyond in that regard, although the response to it was pretty unsatisfactory. I think the effort-to-benefit ratio should be much better with NoGil.

Thanks for your patch @jimkring , I'll get right on it :-)

ctismer commented 1 year ago

However, I then realized that PyQt6 does not include the Signal and Slot modules that I need (from PySide6)

FYI: The equivalent to Signal and Slot in PyQt are pyqtSignal and pyqtSlot.

Python 3.10.9 (main, Dec 15 2022, 18:18:30) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 

But now warming up the VS-Code debugger, quite helpful to look at a debug build. BTW: Instead of NoGIL, I would call this the Gillespie branch...

jimkring commented 1 year ago

@ctismer it’s nice to learn more of the history of pyside. Thanks for that info (and tips). I am certain NoGil is going to be a huge benefit to all, so thanks in advance for supporting that. It’s a fun place to explore, for sure. Cheers.

ctismer commented 1 year ago

@colesbury I see quite different opcodes - even the convention of 2 byte per opcode was not used. This will become a quite special hack :) Do you intend to clean this all up, or will the final Py 3.12 (?) nogil be written from scratch? Another, real question: Is the sys.flags.nogil the only way to know who I am, or is there maybe a compile time info?

colesbury commented 1 year ago

@ctismer Does PyQt depend on the byte code format? The byte code changes will probably not be ported to 3.12.

In C code, you can use the Py_NOGIL define to check for the nogil build.

ctismer commented 1 year ago

@colesbury Ah, fine! I don't think PyQt depends on byte code. I use a special trick to support the old enum syntax of Qt, therefore the hack. But it's just a little introspection, and the feature can vanish, soon.

Thanks for your help!

jimkring commented 1 year ago

Hi @colesbury and @ctismer,

Here's a nice update -> I've been able to get my application to run on NoGIL with PyQt6!

(Christian, your the tip about where to find pyqtSignal and pyqtSlot was very helpful, so thanks again. I'm still VERY interested in getting PyQt6 running in NoGIL, so I'm still very eager to help test PySide6 on NoGIL and don't want my my "good news" to slow us down :)

For what it's worth, Sam, my application seems to be much snappier running in NoGIL. It uses trio for concurrency -- I used the asyncio examples on doc.qt.io/qtforpython as a starting point, which has worked well (Christian, perhaps you helped write those? Thanks!).

I'm now going to look into how best to move from asyncio over to threads for concurrency+parallelism and see how that goes.

ctismer commented 1 year ago

Your app is running? But there must be lots of bugs in the test suite, at least I get quite a lot in PySide. I guess when you really use parallelism, the problems will begin.

ctismer commented 1 year ago

@colesbury We have some unit tests which check the refcounts. Some of them fail, they are off by one. The tests use no threads. Is that to be expected?

Also, about 1/5th of unit tests segfault at the end. How can that be without threading?

I still have to test with setting the environment flag, tomorrow.

jimkring commented 1 year ago

@ctismer my unit tests probably aren't as extensive as yours and perhaps I have bugs in the form of race conditions that are yet to be observed :)

I'm a parallel programmer by trade, and I'm excited to help Python become more parallel.

colesbury commented 1 year ago

We have some unit tests which check the refcounts. Some of them fail, they are off by one. The tests use no threads. Is that to be expected?

Yes. Some objects are "immortal", including small integers. sys.getrefcount lies and says the refcount is 999 (chosen because it stand out more than 2^32 or something). Depending on the arithmetic performed in the test, you can end up with off-by-one errors. (I've seen similar issues in some NumPy refcounting tests.)

For what it's worth, "immortal objects" is probably going to land in Python 3.12 completely independently of the nogil changes (and with a different implementation.)

Also, about 1/5th of unit tests segfault at the end. How can that be without threading?

I don't know without more information. If you can get a stack trace from gdb, that would help.

ctismer commented 1 year ago

@colesbury Hi Sam, just tried it: Setting PYTHONGIL=1 does not change it, I still have weird behavior on shutdown. Maybe it is something that we are doing wrong since a long time - no idea. We don't have a problem on CPython and not on PyPy, but who knows :) I will send you dump output later (after our weekly meeting). Cheers -- Chris

ctismer commented 1 year ago

python3.9-2023-02-09-113333.txt This was sources//pyside6/tests/signals/lambda_test.py lambda_test.py.txt

colesbury commented 1 year ago

Hi @ctismer, I think the problem is that signalmanager.cpp is calling into Python without trying to acquire the GIL:

https://github.com/qtproject/pyside-pyside-setup/blob/2379fbd9f10255bba9ad3caaa8ccea17fc9cdfbf/sources/pyside6/libpyside/signalmanager.cpp#L659-L662

Although, the whole point of the project is to remove the GIL, it's still important to call those same functions to attach/detach the thread from the VM. The functions are still important, they just don't prevent multiple threads from running concurrently. (Here's the section of the PEP that explains it a bit more.)

I don't think calling the PyDict functions without first acquiring the GIL is intended to be supported upstream either.

ctismer commented 1 year ago

@colesbury Whow, thank you! How could you find that with that dump? For some reason, it was quite unreadable for me. Repeated this now and got a better readable one, which in fact tells me what you said. Did you try this, yourself? python3.9-2023-02-09-143507.ips.txt Checking signal manager protection now. Makes sense.

Yeah, it works for this example. Running all tests now. I need to find a finer way to protect this signal mess, because we get into the "5ms-carousel" by using the GIL often.

Yes, only 7 out of 616 tests fail - that is great!

colesbury commented 1 year ago

@ctismer Great! A lot of the same issues crop up repeatedly both during development of the "nogil" fork and in extensions. This one, where _Py_current_tstate is NULL, I've seen mostly during development of the fork itself. (The _Py_current_tstate and PyErr_Occurred in the dump were helpful clues.)

I'll start maintaining a list of common issues and how they appear. Maybe that will help with future extensions.

EDIT: Started documenting common issues on the wiki: https://github.com/colesbury/nogil/wiki/Common-issues-with-extensions

ctismer commented 1 year ago

BTW.: Do you know the GIL-carousal effect? The reason to use no GIL with signals? We struggled with that very much in 2020. Releasing the GIL often makes your app slower. But I think/hope that problem gets away by the use of NoGIL. https://bugreports.qt.io/browse/PYSIDE-803

colesbury commented 1 year ago

@ctismer I hadn't heard of the GIL carousel effect, but the linked bug report explains it well. I've encountered performance issues due to releasing the GIL, mostly due to the overhead of PyEval_Save/RestoreThread being large compared to the actual work performed.

Both those issues should be greatly reduced with the nogil fork: the overhead of PyEval_SaveThread is a lot smaller than in upstream CPython and it doesn't have the 5ms handoff (unless you run with the PYTHONGIL=1 compatibility mode).

ctismer commented 1 year ago

@colesbury Your hint was great. The remaining bugs were introduced by the change to the PyCFunctionObject, which grew a new field with an unfortunate offset. After fixing that, there are only 2 bugs left:

The python_thread test is flaky. It seems that something needs to be locked, now. The cyclic test is unclear, yet. It fiddles with garbage collection.

I might stop this now, since the basic functionality is there. Making things really thread-safe is maybe too time-consuming with a prototype.

colesbury commented 1 year ago

@ctismer Do you have a fork with changes? Can you point me at it? I'd like to take a look at the threading issues.

ctismer commented 1 year ago

@colesbury Yes, I will create a branch on github, tomorrow.

ctismer commented 1 year ago

@colesbury Hi Sam, you find a clone of our dev repo at https://github.com/ctismer/pyside6-dev. . There is a branch dev-nogil-patches.

This repo needs the Qt 6.4.2 branch. I am building with the following setting:

export PATH=/Users/tismer/Qt/6.4.2/macos/bin:$PATH
/Users/tismer/.pyenv/versions/nogil-3.9.10-1-debug/bin/python3 setup.py install --debug --limited-api=no \
    --reuse-build --build-tests --module-subset Core,Gui,Widgets --skip-docs && \
    python3 testrunner.py test > log3.log 2>&1

This should work with all but the two mentioned tests:

Oh, and I saw the nogil-3.12 branch. Should I already switch to it?

colesbury commented 1 year ago

@ctismer - thanks! I'm able to reproduce the issues. The nogil-3.12 branch isn't ready yet. Here's what I've found so far:

sources/shiboken6/tests/samplebinding/python_thread_test.py

The bucket list was previously protected from concurrent access by the GIL. Without the GIL, it needs protection by a mutex (e.g., https://github.com/colesbury/pyside6-dev/commit/d71fb16537c57273199df042e5d4565eb635bbc1)

sources/shiboken6/tests/samplebinding/cyclic_test.py

This isn't related to the GIL, at least not directly (there aren't any threads in this test). It has to do with the order of destruction of cyclic trash in the garbage collector. The immediate cause is that the child C++ ObjectType gets deleted twice. First it's deleted directly, then the parent deletes it when destroying m_children. I'm still investigating this. I'm not sure why this happens -- there seems to be a bunch of code in basewrapper.cpp to prevent this situation.

For context, the order of destruction of cyclic trash is sometimes different than in upstream. In upstream CPython, it's mostly deterministic (usually in older objects are destroyed first). In nogil Python, objects are found scanning the GC heap, and a lot of things can affect the order. (I think this is why the test only crashes sometimes.)

Mostly for my own future reference, here's the steps I took to build pyside6 (on macOS 13.1, ARM64):

  1. Install nogil Python
  2. brew install qt (installed qt 6.4.2)
  3. brew install llvm@13
  4. brew link llvm@13
  5. pip install --upgrade cmake ninja packaging setuptools
colesbury commented 1 year ago

@ctismer - here's a possible patch to address the cyclic_test.py issue:

https://github.com/colesbury/pyside6-dev/commit/090912ae2124ff5475d31b011e18dc67860e64a2

For context, you can trigger a similar crash in upstream CPython if you change the order in which CyclicChildObject and CyclicObject are constructed so that CyclicChildObject is finalized first. Here's an example of what I mean: https://github.com/colesbury/pyside6-dev/commit/8a8e88dfe5df9d7241e101296e4464b46b4e47f6.

ctismer commented 1 year ago

@colesbury Great, I will continue tomorrw. Are you aiming at inclusion into 3.12, or is it more realistic to expect multiple interpreters and GILs in 3.12?

colesbury commented 1 year ago

@ctismer - I forget exactly when the 3.12 feature freeze is, but at this point I don't expect any major GIL-related changes to make it in to 3.12.

stonebig commented 1 year ago

@ctismer - I forget exactly when the 3.12 feature freeze is, but at this point I don't expect any major GIL-related changes to make it in to 3.12.

Freeze is May 8th

ctismer commented 1 year ago

@colesbury Hi Sam, I have created a pull request and merged the changes back into my Gerrit branch. I'm currently testing it with normal Python, too.

The fix for the cyclic test is great, thank you! But: why is the fix necessary? Because something in nogil is deleted in different order. It would be nicer to avoid this. Not even PyPy has problems with this, so it respects the order. Is it possible to avoid this? Or is it principled that there is a non-deterministic order of destructors?

ctismer commented 1 year ago

@colesbury Ok, I checked the patches also with normal Python, all great. Because of the different handling of pull-requests in gerrit, I had to recreate the branch dev-nogil-patches. This is now the version that I will try to put into the main repo, tomorrow.

In a sense, we can claim now

modulo the many missing NoGIL tests πŸ˜†

colesbury commented 1 year ago

@ctismer It'll be great to get these changes merged into the main repo.

In general, I'd like to preserve the behavior of CPython, but in this case, I don't how to, at least not without substantial performance penalties. It might be possible to do something narrow that would avoid the issue in cyclic test, but I'm not sure if that's worthwhile. What I mean is that there is the order in which tp_finalize, tp_clear and tp_dealloc are each called and objects may be finalized/cleared/deallocated in different orders. The cyclic test is only sensitive to the tp_dealloc order.

PyPy has a pretty different GC implementation than CPython, and at least in some simple tests, can finalize objects in a different order than CPython. I'm not sure why the cyclic test doesn't have issues with PyPy. Maybe the tp_dealloc order is the same in this case as in CPython. I tried investigating this, but wasn't able to build pyside6 for PyPy successfully. (My --limited-api=yes build had compile errors due to duplicate definitions and the --limited-api=no build segfaulted during builds.)

ctismer commented 1 year ago

@colesbury Ok, nice that I may merge this into the main repo (hope the'll take it).

Ok, I take it that it was a bug in our code that just worked, but needed to be made more stable against changes in the calling sequence.

On PyPy: Well, I forgot to mention that PyPy can only be built with --limited-api=no. With the current status of PEP 384ff, you cannot build PyPy because the incompatibilities go much deeper. There is a better API in the works that would work for PyPy, possibly. (Forgot which one πŸ˜„ ) HPy https://hpyproject.org/ but I don't know the status.

That it segfaulted during build is probably an overlap since you built two versions of Python 3.9 (maybe). You should try a clean build, then things should work. You can run the tests also with a build, only. The overlap comes from a site-packages entry that is not unique.

jimkring commented 1 year ago

Hi @colesbury and @ctismer. I'm still getting a Segmentation fault: 11 when running PySide6 (from your branch here).

it appears to be happening in the constructor of an object which is a subclass of QGraphicsRectItem -- the crash happens right before the __init__ constructor returns when I call this function self.indicator_text = QGraphicsTextItem(parent=self) -- self in this case is an instance of QGraphicsRectItem.

Note: I'm testing with nogil 3.9.10-1 and qt 6.4.2. Also: I'm able to run with PyQt6 (installed with python -m pip install pyqt6) just fine.

I was able to pinpoint the crash location using a combination of python -X faulthandler myapp.py to output the trace and then putting print statements to isolate the crash. I'm not extremely well versed in using debuggers in Python (pycharm complained about opcodes when I tried to run with a debugger in nogil) so I'm not sure how best to help.

ctismer commented 1 year ago

Hi @jimkring can you please give me some code? Simply creating an QGraphicsTextItem object works for me:

$ /Users/tismer/.pyenv/versions/nogil-3.9.10-1-debug/bin/python3 
Python 3.9.10 (main, Feb  6 2023, 15:50:18)
[nogil, Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from PySide6.QtWidgets import *
>>> self = QGraphicsRectItem()
>>> crash = QGraphicsTextItem(parent=self)
>>> 

Quite apart from that, please do not get your hopes up too high: I would like to point out that this NoGIL PySide simply works, without a net and double bottom. There is no other protection against threading than the one Sam has built into Python. PySide needs to go a long way through thorough testing and replace the GIL protection of PySide objects accordingly. This has not yet been addressed. (I would like to work on this though... πŸ˜† ).

However, I will try to craft a NoGIL version of the PySide-Mandelbrot example shortly.

ctismer commented 1 year ago

Hi @jimkring @colesbury As promised, I have adapted a PySide Mandelbrot program for use with NoGil. The implementation is a converted Qt example and not in the best style for Python. I tried to keep the changes to a minimum for comparable results.

My version takes the display QImage and breaks the computation into 1-pixel stripes. Then a ThreadPoolExecutor operates on that. The result is not as great as I expected, probably because the thread pool causes quite a lot of overhead.

I tried different thread pool sizes. Funnily, maxworkers=4 works best (fastest). The first generations are accelerated by a factor of >3. I was not able to do further analysis because of threading and opcode issues.

So, this result is partly nice, partially disappointing, because of the worse performance with increased core number. If there is interest, I can upload the scripts, of course. But in order to see maximum performance, I will try another implementation where I split the display matrix into not much more tiles tiles than cores.

ctismer commented 1 year ago

Hi again, I have modified the example so that I also can choose a stripe size, in order to make the computed chunks larger. This works fine, but seems to have little effect. The unclear fact stays still true:

@colesbury Can you imagine what that behavior means? Do I have a glaring bug, or is the function concurrent.futures.as_completed a problem?

I am uploading the source files now. mandelbrot.py.txt mandelbrot_nogil.py.txt

The machine used was a Mac Studio Max with 10 cores and 64 GB main memory, btw.

colesbury commented 1 year ago

@ctismer I'm seeing speedups with 8 threads for larger number of iterations:

EDIT: Oops I swapped the results from 4 and 8 threads.

colesbury commented 1 year ago

Nevermind - I swapped the results from 4 and 8 threads. Really strange that higher number of iterations scale worse with more threads...