Open trallard opened 1 year ago
Aiming to meet this deliverable within the next 2-4 weeks. Several projects have support (NumPy, PyWavelets, Pandas, scikit-learn), others are in the pipeline (scikit-image, Zarr, Awkward, hopefully also Matplotlib at least). A few others started but on hold due to higher priority items.
Meeting the deliverable won't be the end of it, but we should switch to deploying working interactive docs for a few more projects first, to accelerate the feedback cycle.
We're getting there! Thanks for adding the detailed issue tracker @agriyakhetarpal
Pyodide's alpha releases for 0.27 are now up, @rgommers – should we now look at https://github.com/zarr-developers/zarr-python/pull/1903 again or wait a bit until we have the stable release a short while after?
> should we now look at zarr-developers/zarr-python#1903 again or wait a bit until we have the stable release a short while after?
The action there is to make async tests for Zarr v3 work, which doesn't depend directly on that PR but (if I understand correctly) is infra work within Pyodide. If there's nothing higher on your prio list, trying to understand that in more detail and moving it forward would be useful I think.
Initially, this was slightly difficult back when I started with the Pyodide ecosystem, but we've got statsmodels's support backported via https://github.com/statsmodels/statsmodels/pull/9365 so that it could get fast-tracked for inclusion in a new v0.14.4 release with no other changes today :) Both last month and this month involved and will involve a bit of travel and conferences respectively, so we should be able to close the "official" target of five projects down in early November (including Zarr, from the above discussion).
Here is a bit of extra context for any other potential readers besides Ralf and me:
scikit-learn, scikit-image, and statsmodels; they are always beneficial to do to get close to upstream SciPy. It could get out-of-tree CI support as its upstream FORTRAN 77 rewrites proceed and when the patches are reduced in both size (number of SLOC) and intrusiveness (we could consider both metrics related in this context). Hence, while the table mentions that out-of-tree CI builds are planned, in-tree updates coupled with occasional PRs with patches that get to go into SciPy upstream are a reasonable way to add support.
h5py's out-of-tree builds because of in-browser usability reasons.
Two questions on the above:
id_dist was Cythonised (patched in Pyodide now) and how LBFGSB was rewritten (not yet patched). One way to evaluate which rewrites to backport would be to see which and how many tests a particular rewrite allows us to un-skip, since rewrites would be included in the next SciPy release anyway.
libhdf5.a in the cross-build environment to make it available for out-of-tree linkage, similar to how NumPy includes libnpymath.a and the relevant header files in xbuildenv/site-packages-extras/? And when we unvendor packages' (and libraries') recipes, their updates will become faster because they will get decoupled from Pyodide.
Decoupling recipes in the medium term would also make the first question less pressing: the rewrites get included in subsequent SciPy releases, which are not in sync with the Pyodide releases, since the timelines have always been and will continue to be different. So a PR that is going to benefit, say, SciPy v1.16 users would be nice to backport to SciPy v1.15 in Pyodide if Pyodide has an upcoming release (i.e., before SciPy v1.16 is released). That said, there are other reasons besides the difference in release timelines for why porting these rewrites is useful, I believe, which are covered in the question.
SymPy added as a potential target as discussed on 11/10/2024.
> we should be able to close the "official" target of five projects down in early November (including Zarr, from the above discussion).
Great to see that!
> Is it worth spending time occasionally backporting SciPy's upstream rewrites in Pyodide downstream and un-skipping WASM tests as a result? I feel the answer should be "yes",
I'd say probably not, since this is mostly extra work (and not just an hour or less) that is anyway going to land in Pyodide. I'd prefer to see time spent on more structural improvements.
> 2. we can look into including the Emscripten-compiled libhdf5.a in the cross-build environment to make it available for out-of-tree linkage, similar to how NumPy includes libnpymath.a and the relevant header files in xbuildenv/site-packages-extras/?
I really want to get rid of libnpymath.a - shared libraries that cross package boundaries are a really bad idea in Python wheels, and we've had a lot of trouble with it over the years. So my first inclination here is to say that this probably isn't a step in the right direction.
It seems like we've met the deliverables here. There are a few more PRs that look close (e.g. the scikit-image wheels PR looks like the code is written, it's just waiting on Pyodide 0.27) and more improvements are always nice, but for the record let's declare victory here :) Issue can stay open for tracking purposes.
> more improvements are always nice, but for the record let's declare victory here :)
Yay! Here's to victory! Yes, I'd keep the issue open, too, since there are a few niceties that I'd like to clean up, such as pyodide/pyodide-actions#12, which is a nice-to-have but not urgent at all.
📝 Summary
Expand the CI support for cross-compiling to Pyodide/WebAssembly to at least five projects.
🚀 Tasks / Deliverables
TBD
📅 Estimated completion
24 months milestone
📋 Additional information
Status
awkward and awkward-cpp
scikit-learn
scikit-image
statsmodels
python-flint (dependency of SymPy) WASM builds left – discussion underway in https://github.com/flintlib/python-flint/issues/234
h5py and libhdf5