python / cpython

Mimalloc differences from upstream #113141

Open colesbury opened 8 months ago

colesbury commented 8 months ago

This issue tracks divergence in our copy of mimalloc from upstream https://github.com/microsoft/mimalloc. The purpose is to help with upstreaming fixes and also so that our modifications are not lost as we pull in new mimalloc versions.

base version: v2.1.2

Simple bug fixes

Mimalloc changes to support GC

Mimalloc changes for lock-free reads (using QSBR)

corona10 commented 8 months ago

Hmm, how about generating patch files and storing them somewhere in the repository for the record? :) I would also like feedback on this approach from @vstinner, since people at Red Hat are the best experts in this area.

corona10 commented 8 months ago

And which commit is our mimalloc baseline? I can spend time implementing a tool that generates diff files and compares them at the CI level, if that looks okay :)

corona10 commented 8 months ago

cc @DinoV

vstinner commented 8 months ago

> Hmm, how about generating patchfiles and storing them somewhere in the repository for records? :)

That sounds painful to maintain. I would prefer to have a process to upstream changes first (mimalloc), and then apply them downstream (Python).

Here we wanted to move quickly to add mimalloc and fix all portability issues. But we should be more "upstream first" for subsequent changes.

corona10 commented 8 months ago

Ah okay, if they accept our changes, that looks reasonable.

colesbury commented 8 months ago

I created this issue in anticipation of further mimalloc changes. There will be changes to support GC in the --disable-gil builds and a few for the optimistic dict accesses. We are going to need to continue to move quickly. An upstream-first model is not going to work yet.

I don't think it will be too hard to pull in new mimalloc versions. I've done it a few times in the nogil forks: diff our copy against the base version to get a patch file, apply the patch to the new version, and fix merge conflicts. We can write up the workflow, but I think automation is overkill at this point.
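As an illustration only, the first step of that workflow (diffing the vendored copy against the base version to get a patch file) could be sketched in Python roughly as follows. The function name `tree_patch` and the directory layout are hypothetical and not part of any CPython tooling. The sketch assumes both trees are checked out locally and contain UTF-8 text files.

```python
import difflib
from pathlib import Path


def _read_lines(path):
    # A file missing on one side is treated as empty, so files that were
    # added or deleted still show up in the diff.
    if not path.is_file():
        return []
    return path.read_text(encoding="utf-8").splitlines(keepends=True)


def tree_patch(base_dir, vendored_dir):
    """Return one unified diff covering every file that differs between
    the pristine base tree and the locally modified (vendored) tree."""
    base, vendored = Path(base_dir), Path(vendored_dir)
    # Collect the relative paths of all files present in either tree.
    rel_paths = sorted(
        {p.relative_to(base) for p in base.rglob("*") if p.is_file()}
        | {p.relative_to(vendored) for p in vendored.rglob("*") if p.is_file()}
    )
    chunks = []
    for rel in rel_paths:
        old = _read_lines(base / rel)
        new = _read_lines(vendored / rel)
        if old == new:
            continue  # unchanged files contribute nothing to the patch
        chunks.extend(
            difflib.unified_diff(
                old,
                new,
                fromfile=f"a/{rel.as_posix()}",
                tofile=f"b/{rel.as_posix()}",
            )
        )
    return "".join(chunks)
```

The resulting text could be written to a patch file. Applying it to a newer mimalloc release and resolving conflicts would still be done with a tool such as `patch` or `git apply`, which this sketch does not attempt to replace.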

I had a nice chat with Daan (mimalloc author) on Friday and we got a chance to discuss some (but not all) of the mimalloc changes to support GC. I think we'll be able to get changes merged upstream, but for non-trivial changes I think it's helpful to understand and be able to explain how APIs are used in CPython. Even if time were not an issue, I would not want to be in a situation where we propose something upstream and then it turns out we needed something slightly different.

ericsnowcurrently commented 8 months ago

CC @daanx

daanx commented 3 months ago

I have started upstreaming the CPython-specific modifications into mimalloc to make it easier to downstream future releases of mimalloc. The ideal goal would be to enable Python to pull in mimalloc with no further modifications -- let's see how far we can get.

For now, I am documenting here which commits fix particular issues together with the version of mimalloc that will have these commits. (last update: 2024-06-03)

Planned for v2.1.8

Mimalloc changes to support GC

Mimalloc changes for lock-free reads (using QSBR)

Upstreamed in v2.1.7

Simple bug fixes

Mimalloc changes to support GC

cdce8p commented 2 months ago

For reference, I just opened #121487 to address a deprecation warning for ATOMIC_VAR_INIT, which was already fixed upstream in v1.8.4.