Open colesbury opened 8 months ago
Hmm, how about generating patch files and storing them somewhere in the repository for the record? :) I also need feedback on this solution from @vstinner, since people at Red Hat are the best experts in this area.
And which commit is our mimalloc baseline? I can spend time implementing a tool that generates diff files and compares them at the CI level, if that looks okay :)
cc @DinoV
> Hmm, how about generating patchfiles and storing them somewhere in the repository for records? :)
That sounds painful to maintain. I would prefer to have a process to upstream changes first (mimalloc), and then apply them downstream (Python).
Here we wanted to move quickly to add mimalloc and fix all portability issues. But we should be more "upstream first" for subsequent changes.
Ah okay, if they accept our changes, that looks reasonable.
I created this issue in anticipation of further mimalloc changes. There will be changes to support GC in the `--disable-gil` builds and a few for the optimistic dict accesses. We are going to need to continue to move quickly. An upstream-first model is not going to work yet.
I don't think it will be too hard to pull in new mimalloc versions. I've done it a few times in the nogil forks: diff vs. base to get patch file, patch new version, fix merge conflicts. We can write up the workflow, but I think automation is overkill at this point.
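The first step of that workflow (diff vs. base to get a patch file) could be sketched roughly as follows. This is only an illustration, not the process actually used for the nogil forks: the helper names `read_lines` and `tree_diff` and the directory names in the comment at the end are hypothetical, and it assumes both trees contain text files only.

```python
import difflib
from pathlib import Path


def read_lines(path: Path) -> list[str]:
    """Return the file's lines, or an empty list if the file does not exist."""
    if not path.is_file():
        return []
    return path.read_text(encoding="utf-8", errors="replace").splitlines(keepends=True)


def tree_diff(base: Path, vendored: Path) -> str:
    """Build one unified diff covering every file that differs between two trees."""
    # Collect the relative paths of all files present in either tree.
    rel_paths = sorted(
        {p.relative_to(base) for p in base.rglob("*") if p.is_file()}
        | {p.relative_to(vendored) for p in vendored.rglob("*") if p.is_file()}
    )
    chunks = []
    for rel in rel_paths:
        old = read_lines(base / rel)
        new = read_lines(vendored / rel)
        if old != new:
            # A file missing on one side is treated as empty on that side.
            chunks.extend(
                difflib.unified_diff(
                    old,
                    new,
                    fromfile=f"a/{rel.as_posix()}",
                    tofile=f"b/{rel.as_posix()}",
                )
            )
    return "".join(chunks)


# Hypothetical usage: compare a pristine checkout of the base version
# against the modified copy and save the result for later re-application.
# Path("local-changes.patch").write_text(
#     tree_diff(Path("mimalloc-base"), Path("mimalloc-vendored")), encoding="utf-8")
```

The saved diff would then be applied to a checkout of the newer upstream version with a standard tool such as `patch`, with any rejected hunks resolved by hand, which corresponds to the "patch new version, fix merge conflicts" steps.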
I had a nice chat with Daan (mimalloc author) on Friday and we got a chance to discuss some (but not all) of the mimalloc changes to support GC. I think we'll be able to get changes merged upstream, but for non-trivial changes I think it's helpful to understand and be able to explain how APIs are used in CPython. Even if time were not an issue, I would not want to be in a situation where we propose something upstream and then it turns out we needed something slightly different.
CC @daanx
I have started upstreaming mimalloc-specific modifications into mimalloc to make it easier to downstream future releases of mimalloc. The ideal goal would be to enable Python to pull in mimalloc with no further modifications -- let's see how far we can get.
For now, I am documenting here which commits fix particular issues together with the version of mimalloc that will have these commits. (last update: 2024-06-03)
gh-112532: Isolate abandoned segments by interpreter #113717 (v2.1.8, various upstream commits, some of which are: https://github.com/microsoft/mimalloc/commit/d9aa19a7636d457f0b7b50e599649b86e8ade666 (sub-processes), https://github.com/microsoft/mimalloc/commit/8f874555d5d42c4e1006bfc78f6cadfb167b1e30 (visit arena abandoned per sub process), https://github.com/microsoft/mimalloc/commit/855e3b2549e0f2aa0277e43c4eeb8b1cbe1ea497 (all abandoned block visiting), and a concurrency "leak" fix https://github.com/microsoft/mimalloc/commit/96b69d7ef63760629114b97aa47385d898ec4b7e). This now works with the new bitmap-based abandoned segments, which scale better.
gh-112532: Improve mimalloc page visiting #114133 (v2.1.8, upstream commit https://github.com/microsoft/mimalloc/commit/f7fe5bf20ea8a88f8a55f58549e21dfeadc5dc1f) (note: would like to check correctness of `fast_divide` and the possible range of `bsize`)
`unused function` warnings during mimalloc build on FREEBSD #111907 (upstream commit https://github.com/microsoft/mimalloc/commit/87c4012f13d06bc92869edc0a79c2eff7343aa6d, v2.1.7)
gh-112532: Use separate mimalloc heaps for GC objects #113263 (upstream commit https://github.com/microsoft/mimalloc/commit/710d6138c7c1e31ef3d6871dd42390e85ddc5c5a, v2.1.7) note: renamed `_mi_heap_init_ex` to `_mi_heap_init` (and the original heap init to `mi_thread_heap_init`).
gh-112532: Tag mimalloc heaps and pages #113742 (upstream commit https://github.com/microsoft/mimalloc/commit/0c4041fa53e9d046558d7f5a78b156f87efc5801, v2.1.7) note: added an error condition if a page with a certain custom tag cannot be reclaimed in a heap with the same tag; is EINVAL strong enough? Maybe make it EFAULT to abort in such a case? It should not happen if used properly, though. (edit: I made it EFAULT to catch any bugs early)
[gh-112529: Use _PyThread_Id() in mimalloc (free-threaded build) #115488](https://github.com/python/cpython/pull/115488) (upstream commit https://github.com/microsoft/mimalloc/commit/66052f135fa3454d3474b217ab191d6395d873da, v2.1.7)
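One way to approach the correctness check mentioned in the note on #114133 is a brute-force comparison against ordinary integer division. The sketch below models a common multiply-and-shift division scheme for 32-bit operands in Python. It is an assumption that the mimalloc helper follows this scheme and that both the dividend and the block size fit in 32 bits; the names `fast_divisor`, `fast_divide`, and `find_mismatch` are illustrative and not copied from mimalloc.

```python
import random

U32_MAX = 0xFFFFFFFF


def fast_divisor(d):
    """Precompute (magic, shift) for dividing 32-bit values by d (assumed scheme)."""
    assert 0 < d <= U32_MAX
    shift = (d - 1).bit_length()  # ceil(log2(d)); 0 when d == 1
    magic = ((1 << 32) * ((1 << shift) - d)) // d + 1
    return magic, shift


def fast_divide(n, magic, shift):
    """Compute n // d using one multiplication, one addition, and two shifts."""
    assert 0 <= n <= U32_MAX
    return (((n * magic) >> 32) + n) >> shift


def find_mismatch(divisors, samples=2000, seed=0):
    """Return the first (d, n) where fast_divide disagrees with n // d, else None."""
    rng = random.Random(seed)
    for d in divisors:
        magic, shift = fast_divisor(d)
        # Edge cases around the divisor and the top of the 32-bit range,
        # followed by random dividends.
        candidates = [0, 1, d - 1, d, d + 1, U32_MAX - 1, U32_MAX]
        candidates += [rng.randrange(U32_MAX + 1) for _ in range(samples)]
        for n in candidates:
            if n <= U32_MAX and fast_divide(n, magic, shift) != n // d:
                return (d, n)
    return None
```

Running `find_mismatch` over the block sizes that can actually occur would show whether this scheme is exact for them; a similar loop in C against the real helper would be needed to confirm the behaviour of the actual code.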
For reference, just opened #121487 to address a deprecation warning with `ATOMIC_VAR_INIT`, which was already fixed upstream in `v1.8.4`.
This issue tracks divergence in our copy of mimalloc from upstream https://github.com/microsoft/mimalloc. The purpose is to help with upstreaming fixes and also so that our modifications are not lost as we pull in new mimalloc versions.
base version: v2.1.2
Simple bug fixes
Mimalloc changes to support GC
Mimalloc changes for lock-free reads (using QSBR)