pypa / cibuildwheel

🎡 Build Python wheels for all the platforms with minimal configuration.
https://cibuildwheel.pypa.io
Other
1.88k stars 239 forks source link

Pyodide builds: better support the use of Node.js-based script runners in `CIBW_TEST_COMMAND` #1878

Open agriyakhetarpal opened 5 months ago

agriyakhetarpal commented 5 months ago

Description

Hi there! Recently, #1456 was merged and made available in version 2.19.0 through which experimental WebAssembly/Pyodide builds for packages can be performed using --platform pyodide or through setting the CIBW_PLATFORM environment variable to pyodide.

It would be great to possibly configure how the wheels are tested – for some packages such as scikit-image, statsmodels, and scikit-learn (cc: @lesteve), there are some issues with symbol visibility which leads to some bugs pertaining to unresolved symbols, especially those that come from a BLAS library such as OpenBLAS (see https://github.com/pyodide/pyodide/issues/3865), where it is better to use a Node.js-based testing script (preceded by the npm install pyodide@<pyodide version> command) to run pytest-based test suites inside Node.js, where these issues do not occur.

Steps to reproduce

It is difficult to provide a minimal reproducer for this behaviour, however, the "Additional context" section below has a working CI example that can be helpful. The test environment that gets created installs Node.js and other machinery to render a Pyodide testing environment successfully, however, copying the built wheel to a suitable directory using the {wheel} placeholder does not work as smoothly – it is tricky to configure where the .node_modules/ folder gets created and I faced problems trying to install a wheel even when pyodide was installed by npm under {project}.

Proposed solution

Here's how I can envision this working right now, it might need a few refinements:

cibuildwheel opens up an environment variable CIBW_TEST_COMMAND_PYODIDE_RUNNER: that can be set to either xbuildenv or node, where the node option will establishes a test_package.js file that will contain some boilerplate code based on https://github.com/scikit-learn/scikit-learn/blob/main/build_tools/azure/pytest-pyodide.js or https://github.com/scikit-image/scikit-image/blob/main/tools/emscripten/pytest_scikit_image_pyodide.js.

Packages listed in CIBW_TEST_REQUIRES can then be iterated over and installed via this snippet (I'm using pytest and matplotlib, for example):

await pyodide.runPythonAsync("import micropip; micropip.install('pytest')");
await pyodide.runPythonAsync("import micropip; micropip.install('matplotlib')");

inside the script, and pytest can receive arguments based on Node.js' process.argv, while the rest of the script can load Pyodide from the CDN, and construct a NODEFS file system to copy the repaired/built wheel to (where micropip installs the wheels and the test requirements). Therefore, this can be abstracted away under CIBW_TEST_COMMAND: pytest --pyargs <mypackage> or similar, where CIBW_TEST_COMMAND_PYODIDE_RUNNER can govern how to run the test command.

Additional context

This feature request comes from https://github.com/scikit-image/scikit-image/pull/7440, where I am trying to use cibuildwheel for scikit-image, which builds without problems, but fails because there is no reliable way to use Node.js because of the separate virtual environment that gets created by cibuildwheel to run the tests (which exists for good reason), but also that cibuildwheel installs scikit-image at the time of testing, which I cannot take advantage of because Pyodide will exit with a fatal error and not proceed with the test suite further.

I understand that this could be too niche of a request to ask for since this is not an issue for the majority of packages out there even when considering compiled ones, however, this is a problem that is likely to stay for a while and has existed since some of the past few Pyodide versions. There could be more, but I know of at least three packages that face this issue, which are: scikit-image, scikit-learn, and statsmodels, as mentioned above. Considering that cibuildwheel is the de facto standard to build wheels but also as a precursor to push them to nightly indices such as Anaconda.org (xref: https://github.com/pyodide/pyodide/issues/3049), this would be valuable to implement. If it helps, I would be happy to try this out and put together a PR if I can receive a few pointers across the codebase – if this is deemed by the maintainers to be in scope for cibuildwheel, that is.

Edit: also cc @hoodmane and @ryanking13 for visibility.

Build log

Please see some of the recent runs under https://github.com/agriyakhetarpal/scikit-image/actions/workflows/nightly_wheel_build.yml

CI config

https://github.com/scikit-image/scikit-image/pull/7440

hoodmane commented 5 months ago

Let's see if we can fix the symbol visibility bugs instead.

agriyakhetarpal commented 2 months ago

Update: symbol visibility bugs are now resolved with Pyodide's in-tree updates to SciPy for both statsmodels (https://github.com/statsmodels/statsmodels/pull/9343) and scikit-image (https://github.com/scikit-image/scikit-image/pull/7525, pending review/merge).

This is now a little less relevant, but probably still relevant for the scikit-learn test suite. It would be nice to keep this open if someone wants this later in the future, and I'll be happy to help implement it myself.

lesteve commented 2 months ago

@agriyakhetarpal I haven't had time to double-check but I am guessing that scikit-learn test suite should run fine inside pyodide venv on Pyodide 0.27.0a2. A scikit-learn PR doing this would be more than welcome :wink:!

agriyakhetarpal commented 2 months ago

Yes, @lesteve! Already on it since yesterday – I can confirm that all tests for scikit-learn pass now 😁 I'll have a PR ready by today, hopefully.