Closed tschm closed 9 months ago
@stinodego FYI
There's also a Microsoft container for rust. Not sure into which direction you want to head with this. Please adjust as you please...
Install rust without any interaction in startup.sh
error: could not compile polars-core (lib) after
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
cd py-polars
make build
make test
Dev Container Feature makes it easy to set up Rust. Why not do it?
@eitsupi, I set up Rust as described on the Polars web page. What's your suggestion? Use a Rust container and install Python in it? Or use a Rust container that already comes with Python?
@tschm For example, the following:
{
"image": "mcr.microsoft.com/devcontainers/rust:1-bullseye",
"features": {
"ghcr.io/devcontainers/features/python:1": {
"installTools": true,
"version": "os-provided"
}
}
}
Ok, I have learnt a lot about Rust and the underlying compilers. Here's the script I managed to construct using the documentation
rustc --version
cargo --version
cd py-polars
# both make build and make build-release have the same problem
make build-release
This runs in a container as suggested by @eitsupi above. Unfortunately, the Maturin step keeps failing (I have tried numerous variations). The error message I get is
Compiling polars-row v0.30.0 (/workspaces/polars/polars/polars-row)
Building [======================> ] 331/351: libgit2-sys(build), polar...
Building [======================> ] 331/351: libgit2-sys(build), polar...
Compiling polars-core v0.30.0 (/workspaces/polars/polars/polars-core)
Building [======================> ] 332/351: libgit2-sys(build), polar...
error: could not compile `polars-core` (lib)
Caused by:
process didn't exit successfully: `/usr/local/rustup/toolchains/nightly-2023-06-23-x86_64-unknown-linux-gnu/bin/rustc --crate-name polars_core --edition=2021 /workspaces/polars/polars/polars-core/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --diagnostic-width=80 --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C linker-plugin-lto -C codegen-units=1 --cfg 'feature="abs"' --cfg 'feature="asof_join"' --cfg 'feature="chrono"' --cfg 'feature="chrono-tz"' --cfg 'feature="chunked_ids"' --cfg 'feature="comfy-table"' --cfg 'feature="concat_str"' --cfg 'feature="cross_join"' --cfg 'feature="cum_agg"' --cfg 'feature="dataframe_arithmetic"' --cfg 'feature="default"' --cfg 'feature="diagonal_concat"' --cfg 'feature="diff"' --cfg 'feature="docs"' --cfg 'feature="dot_product"' --cfg 'feature="dtype-array"' --cfg 'feature="dtype-categorical"' --cfg 'feature="dtype-date"' --cfg 'feature="dtype-datetime"' --cfg 'feature="dtype-decimal"' --cfg 'feature="dtype-duration"...
warning: build failed, waiting for other jobs to finish...
@eitsupi I am a bit lost and I am running out of time. I have created a Dockerfile trying to mimic what I have done on my local machine. Note that I can run make build on my local machine. I am for now using a Python docker image and install rustc, cargo, the linker gcc and cmake (via build-essential). I still struggle with the linker?
I add the output:
Building [=======================> ] 340/342: py-polars(build)
Building[=======================> ] 341/342: py-polars error: linking with `cc` failed: exit status: 1
|
= note: LC_ALL="C" PATH="/home/vscode/.rustup/toolchains/nightly-2023-06-23-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/bin:/workspaces/polars/py-polars/.venv/bin:/usr/local/python/current/bin:/usr/local/py-utils/bin:/usr/local/share/nvm/current/bin:/usr/local/bin:/usr/local/python/current/bin:/usr/local/py-utils/bin:/usr/local/share/nvm/current/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/vscode/.cargo/bin:/home/vscode/.local/bin" VSLANG="1033" "cc" "-Wl,--version-script=/tmp/rustcJiCchC/list" "-Wl,--no-undefined-version" "-m64" "/tmp/rustcJiCchC/symbols.o" "/workspaces/polars/py-polars/target/debug/deps/polars.11jjymxtx3h3y8la.rcgu.o" "/workspaces/polars/py-polars/target/debug/deps/polars.12rch68rz0xi1eb2.rcgu.o" "/workspaces/polars/py-polars/target/debug/deps/polars.13ag9j0tyjjkezh9.rcgu.o" "/workspaces/polars/py-polars/target/debug/deps/polars.13bbi9l8qwom7mxu.rcgu.o" "/workspaces/polars/py-polars/target/debug/deps/polars.13h10f0s5y9s5wuz.rcgu.o...
= note: collect2: fatal error: ld terminated with signal 15 [Terminated]
compilation terminated.
Building [=======================> ] 341/342: py-polars
The gcc compiler seems to be quite old
vscode ➜ / $ gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
@eitsupi I think the linker is not working as my account is filled up. Need to clean a few things and come back to this tomorrow...
Thanks for looking into this. I'm not really experienced with devcontainers so I can't be of much help here.
@stinodego I think a helpful move could be to merge this pull request into a devcontainer branch. You can then setup prebuilds of the image (to make it fast for your users). You keep this for a few days and then merge devcontainer branch into main... To understand what's going on: In .devcontainer Dockerfile I define a standard image that should look very familiar if you have compiled polars from scratch before. The magic happens in the startup.sh program that runs the make build...
I suspect this is simply due to a lack of resources on the machine. Building polars requires a large amount of machine resources, so a Codespace on its lowest-spec machine is clearly underpowered. I recommend that you try with local Docker.
Do you think changing the installation methods from devcontainer.json to a shell script or Dockerfile would contribute anything to resolving things? (I don't think so.)
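If limited memory is the cause, as suspected above, one thing that may be worth trying is capping cargo's parallelism before running make build. The following is only a sketch: the helper name pick_jobs and the budget of roughly 4 GB of RAM per job are assumptions for illustration, not something measured in this thread. CARGO_BUILD_JOBS is cargo's standard environment variable for the number of parallel jobs. It would not help if the final link step alone needs more memory than the machine has.

```shell
#!/bin/sh
# Pick a cargo job count from total memory (in kB) and the number of cores.
# Assumption (not from this thread): budget roughly 4 GB of RAM per job.
pick_jobs() {
    mem_kb=$1
    cores=$2
    by_mem=$(( mem_kb / (4 * 1024 * 1024) ))
    # always allow at least one job
    if [ "$by_mem" -lt 1 ]; then by_mem=1; fi
    # use the smaller of the memory-based and core-based limits
    if [ "$by_mem" -lt "$cores" ]; then echo "$by_mem"; else echo "$cores"; fi
}

# Linux only: read total memory from /proc/meminfo and the core count from nproc.
if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    CARGO_BUILD_JOBS=$(pick_jobs "$mem_kb" "$(nproc)")
    export CARGO_BUILD_JOBS
    echo "Building with CARGO_BUILD_JOBS=$CARGO_BUILD_JOBS"
fi
```

Sourcing this before cd py-polars && make build would pass the limit on to the cargo processes started by the build.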
@eitsupi I have built the underlying Docker image, which comes in at approximately 2.8 GB. I am currently cleaning up my use of prebuilds across my profile as I am right at the maximum. Please note that devcontainer.json is pointing to a Dockerfile.
The image does not contain any polars specific stuff. It's python with rust. Hence it would be faster to host this image in a registry and download from there every time a user is building a devcontainer or using the feature approach you have suggested above.
In a first iteration polars could get away without hosting any prebuilds. However, it would then take 10 minutes to compile Polars on request and most of the convenience would be lost. Those minutes are mainly lost in the compilation of maturin. It's not the installation of rustc, cargo, etc.
Hence I would suggest starting with devcontainers on a dedicated branch and seeing whether the pola-rs account is powerful enough to host prebuilds.
@tschm My point is that there is probably little advantage to installing Rust and Python without using Dev Container Features. Codespaces uses the Dev Container CLI to build the image, so you can still build the image from the devcontainer.json I presented.
Ok, it works with a proper machine. I have managed to fire up a devcontainer on a 64 GB ram US-West 2 server. @eitsupi I am using a Dockerfile to mimic my local setup and to simplify the local debugging. Going forward I think one could use the feature idea.
Another piece of advice: if this is only for py-polars rather than rust-polars, place these in .devcontainer/py-polars/ instead of .devcontainer.
If we want to work only with Rust, we do not need to build py-polars at the container starting point.
@eitsupi I am convinced there is no need for the startup.sh hack. The devcontainer cli probably exposes something nicer. However, I failed with it...
I am convinced there is no need for the startup.sh hack.
I suppose "postCreateCommand": "cd py-polars && make build" is sufficient?
https://containers.dev/implementors/json_reference/#lifecycle-scripts
(As far as I know, postCreateCommand is traditionally used for this use case.)
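For illustration, the postCreateCommand suggestion can be combined with the image and feature shown earlier in this thread. The sketch below writes such a devcontainer.json from a shell script. The DEVCONTAINER_DIR variable is a hypothetical convenience introduced here, and the combination itself has not been tested in this thread.

```shell
#!/bin/sh
# Write a devcontainer.json that combines the image and Python feature shown
# earlier in this thread with the suggested postCreateCommand.
# DEVCONTAINER_DIR is a hypothetical override; the default is .devcontainer.
out_dir="${DEVCONTAINER_DIR:-.devcontainer}"
mkdir -p "$out_dir"
cat > "$out_dir/devcontainer.json" <<'EOF'
{
  "image": "mcr.microsoft.com/devcontainers/rust:1-bullseye",
  "features": {
    "ghcr.io/devcontainers/features/python:1": {
      "installTools": true,
      "version": "os-provided"
    }
  },
  "postCreateCommand": "cd py-polars && make build"
}
EOF
echo "wrote $out_dir/devcontainer.json"
```

With this in place the build would run once after the container is created, so no separate startup script would be needed for that step.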
Ok, the devcontainers for rust already contain gcc and cmake.
The devcontainers are screaming for power though.
To try the devcontainers go to
Ok, that was a long D-Tour :-) Thank you @eitsupi. I got far too distracted by the linking issues which were indeed just a result of the weak performance of my laptop and the chosen GitHub server. I learnt a great deal in the process and the result is terse & elegant. Users can now develop code directly in a container exposing all the standard tools needed to do so --- including an IDE. I expose only one devcontainer complete with rust and python in there. However, I don't perform a make build in the construction process of the container. A user can do that if needed (and even return later to his/her container)...
Only two files have been changed. The devcontainer.json file describing the container and the README with a link to the GUI to specify the server used for the container. I recommend the 16 core machine with 64 GB ram.
Please ignore the large number of commits and squash them if you merge...
Glad to hear you were able to accomplish this!
I think this will work (I haven't tried it here), but have some minor comments.
Thank you @eitsupi. I have fixed the Python version to 3.10 and do not install the extra tools, e.g. pylint. They are defined in the requirements files... There's a problem when I do a "cargo test" from the command line in the devcontainer. It doesn't seem to be a container problem though... Please fire up a container. @ritchie46 Do you feel comfortable with the concept? Shall I explain more what this is good for?
I have fixed the Python version to 3.10
I am not sure if this is a good idea, since ghcr.io/devcontainers/features/python:1 may try to build Python from source. (Please check the source https://github.com/devcontainers/features/tree/main/src/python)
If there is no reason to stick with a particular version of Python, I recommend using os-provided.
Ok, good idea. It's the standard choice. I could drop it...
That's a bit surprising...
@tschm ➜ /workspaces/polars (main) $ cargo test
warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
package: /workspaces/polars/polars-cli/Cargo.toml
workspace: /workspaces/polars/Cargo.toml
Compiling polars-ops v0.30.0 (/workspaces/polars/polars/polars-ops)
Compiling polars-core v0.30.0 (/workspaces/polars/polars/polars-core)
Compiling polars-time v0.30.0 (/workspaces/polars/polars/polars-time)
Compiling polars-io v0.30.0 (/workspaces/polars/polars/polars-io)
error[E0433]: failed to resolve: use of undeclared crate or module `serde_json`
--> polars/polars-core/src/serde/mod.rs:13:20
|
13 | let json = serde_json::to_string(&ca).unwrap();
| ^^^^^^^^^^ use of undeclared crate or module `serde_json`
The polars dev workflow isn't guaranteed to work with anything outside of the provided make commands.
I wonder whether dprint, cmake, fmt etc. should be all preinstalled in the devcontainer. This should probably be addressed in a 2nd issue...
@eitsupi I am now installing the tools required by polars using startup.sh. Please have a look. I assume there would be no need to hardcode the date for the toolchain as I currently do. Also make pre-commit (in polars) and make test (in py-polars) work. However, make test (in polars) does not work. The error message is not exactly helpful (for me):
Compiling polars-json v0.30.0 (/workspaces/polars/polars/polars-json)
warning: unused `Result` that must be used
--> polars/polars-core/src/chunked_array/ops/explode.rs:425:9
|
425 | builder.append_series(&Series::new("", &[1, 2, 3, 3]));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
= note: `#[warn(unused_must_use)]` on by default
The devcontainer will be a rather helpful tool once such problems have been erased...
The startup.sh file
#!/bin/bash
# stop at the first failing command
set -euo pipefail
cargo --version
rustc --version
toolchain="nightly-2023-06-23"
target="x86_64-unknown-linux-gnu"
# follow https://github.com/pola-rs/polars/blob/main/CONTRIBUTING.md#setting-up-your-environment
rustup toolchain install "${toolchain}-${target}" --component miri
# add rustfmt (https://github.com/rust-lang/rustfmt) and clippy
rustup component add rustfmt --toolchain "${toolchain}-${target}"
rustup component add clippy --toolchain "${toolchain}-${target}"
# Install dprint, see https://dprint.dev/install/
# this will be slower since it builds from the source
cargo install --locked dprint
# install cmake
sudo apt-get update && sudo apt-get install -y cmake
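One possible way to avoid hardcoding the toolchain date is to read it from a toolchain file in the repository. The sketch below assumes a rust-toolchain.toml in the working directory with a line of the form channel = "nightly-YYYY-MM-DD". The file location, the TOOLCHAIN_FILE variable and the helper name read_channel are assumptions for illustration and were not verified against the polars repository.

```shell
#!/bin/sh
# Print the value of the `channel` key from a rust-toolchain.toml-style file.
# Assumption: the file has a line of the form  channel = "nightly-YYYY-MM-DD".
read_channel() {
    sed -n 's/^[[:space:]]*channel[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# TOOLCHAIN_FILE is a hypothetical override for the file location.
toolchain_file="${TOOLCHAIN_FILE:-rust-toolchain.toml}"
if [ -r "$toolchain_file" ]; then
    toolchain=$(read_channel "$toolchain_file")
    echo "toolchain from $toolchain_file: $toolchain"
    # The install step from the script above would then become:
    # rustup toolchain install "$toolchain" --component miri
fi
```

Reading the value once keeps the script in step with the repository when the pinned nightly changes.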
@eitsupi once more thank you very much. I kept the installation of dependencies in the startup.sh but I agree with the idea that they should go in the long run into a Makefile (which I dare to touch). I am still irritated by the make test in the container (if applied in the polars folder).
I have managed to run all tests in the polars folders. In a first attempt make test failed as the linker lost its mojo (despite 64 GB ram). A second make test pulled through. All tests passed. The warning message unused Result seems to be accepted. It's not something the devcontainer can or should address. It's just a replication of the environment you work in when you run stuff locally. I think the devcontainer can be most helpful, as running in such a container is simpler than trying to set up the required environment locally. At the same time it's trivial to work on an extremely powerful machine for very little money...
Apologies, I haven't been able to take a look at this yet. Will try to soon!
@tschm I finally got around to giving this a look. Thanks so much for the effort, but I cannot merge this in its current form.
I tried running this on the latest main branch (with updated Rust version in the startup.sh), but it fails to create the container, then manages to run it in recovery mode (see logs below). However, cargo is not available so there's not much you can do.
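A quick check at the end of container setup could make a missing tool such as cargo visible right away. This is a minimal sketch: the helper name missing_tools and the list of tools checked are assumptions for illustration, not part of the PR.

```shell
#!/bin/sh
# Print each named tool that is not found on PATH; print nothing if all exist.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The tool list is an assumption; adjust it to what the workflow needs.
missing=$(missing_tools cargo rustc python3 make)
if [ -n "$missing" ]; then
    # word splitting is intended here so the names print on one line
    echo "missing tools:" $missing >&2
fi
```

Running this as the last step of the setup would report the problem in the creation log instead of leaving it to be discovered later.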
Running this in VSCode locally from a Windows PC, I get it running on the second try, but performance is absolutely abysmal. Basically, unworkable.
I think to make the devcontainer actually useful, a few things need to happen:
make test in the py-polars directory out of the box and it works.
I need a PR where I can read the newly added documentation, follow the easy steps, and then get a working environment where I can easily change some code, test it, and whip up a new PR. Without having to go through all the steps I would have to go through when setting up my environment locally.
I don't have much experience with devcontainers, so I cannot really give you any good pointers for setting this up. But hopefully it's clear what I'm looking for.
I will close this for now, though I definitely welcome an updated PR that fulfills the requirements above! Just re-open the PR when it's done.
Log details from GitHub codespaces:
Closes https://github.com/pola-rs/polars/issues/9690