Open kou opened 2 months ago
Why is an out-of-source build needed here? Is that only relevant for generated artifacts that we want to move out of the Docker image later, like documentation artifacts? In that case, another solution could be to only ensure that those artifacts are generated outside of the source.
Oh, sorry. I had a typo in the description:
-We should use out-of-source build to create files in source tree on host.
+We should use out-of-source build to avoid creating files in source tree on host.
It's for avoiding creating files in source tree on host. If files are created in a Docker container, root-owned files are created on host. They can't be removed by a normal user. It may break a build on host.
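To check whether a checkout is affected, something like the following could be used. It is only a sketch: the helper name list_foreign_owned is hypothetical, and it assumes a find that supports the standard -user test.

```shell
# Hypothetical helper: list everything under a directory that is NOT owned
# by the current user (for example, root-owned files left by a container).
list_foreign_owned() {
  find "$1" ! -user "$(id -u)" -print
}

# Example (path is an assumption): list_foreign_owned python/
```

An empty result means a normal user can clean the tree without sudo.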
For Python dev versions, where we extract the version based on the git describe command, it gets rather annoying to do an out-of-source build. We might be able to map the uid:gid of the local user into the Docker container so that files are created as a non-root user on the host, instead of doing out-of-source builds for everything.
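A minimal sketch of the uid:gid mapping idea, assuming plain docker run and its --user option. The helper name docker_user_args and the image name in the usage comment are hypothetical, and whether archery exposes an equivalent option is not confirmed here.

```shell
# Hypothetical helper: print the docker run option that makes the container
# process run with the invoking user's uid and gid instead of root.
docker_user_args() {
  printf -- '--user %s:%s' "$(id -u)" "$(id -g)"
}

# Example (image name is an assumption):
#   docker run --rm $(docker_user_args) -v "$PWD:/arrow" some-image ...
```

Files written to the mounted source tree would then belong to the host user, though tools inside the image that expect to run as root may need adjusting.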
It's for avoiding creating files in source tree on host.
I understood that. But my question is still: why is that needed in practice (except for artifacts like docs)? You mention "They can't be removed by a normal user. It may break a build on host.", but did we have such issues in the past? (it has been done in-source forever)
As Raúl mentions, this is quite annoying for the Python build, which assumes it is either in the git repo or built from an sdist that has the version encoded in its files (so not from a plain copy of the sources).
I can't remember the details, but I had some problems when I used python/ in-source on host. (I used sudo rm ... or something for the case. But it may be wrong. I can't remember...)
(I mix archery docker run ... (for debugging CI failures) and python3 setup.py .../python3 -m pip ... on host, but others may not mix them.)
We can map uid:gid but is there any portable way for it? I hope that it's enabled by default.
git describe related problem, right? Can we use GIT_DIR for it?
diff --git a/ci/scripts/python_build.sh b/ci/scripts/python_build.sh
index 9455baf353..80fd417644 100755
--- a/ci/scripts/python_build.sh
+++ b/ci/scripts/python_build.sh
@@ -25,6 +25,8 @@ build_dir=${2}
source_dir=${arrow_dir}/python
python_build_dir=${build_dir}/python
+export GIT_DIR=${arrow_dir}
+
: ${BUILD_DOCS_PYTHON:=OFF}
if [ -x "$(command -v git)" ]; then
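One caveat on the diff above: git normally expects GIT_DIR to name the repository directory itself (usually .git inside the checkout) rather than the top of the working tree. A minimal sketch under that assumption follows; the helper name describe_source_tree is hypothetical and not part of python_build.sh.

```shell
# Hypothetical helper: run `git describe` for a checkout from any working
# directory, such as an out-of-source build directory.
describe_source_tree() {
  local source_dir="$1"
  # Point GIT_DIR at the repository metadata, not at the working tree root.
  GIT_DIR="${source_dir}/.git" git describe --tags --always
}

# Example (path is an assumption): describe_source_tree /arrow
```

Running git -C with the source directory and then describe is an alternative that lets git locate the repository itself.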
If we can remove --no-build-isolation from https://github.com/apache/arrow/blob/8169d6e719453acd0e7ca1b6f784d800cca4f113/ci/scripts/python_build.sh#L88-L92, we can remove https://github.com/apache/arrow/blob/8169d6e719453acd0e7ca1b6f784d800cca4f113/ci/scripts/python_build.sh#L81-L86. Can we remove --no-build-isolation by #41041?
Yeah, I don't use our docker builds very often locally, so can't say much about that.
If we can remove --no-build-isolation
I would think that the build isolation should not matter for whether files are generated in the source or not (this is about whether a temporary python venv is created, or whether your current python session is used, while building), although exactly what pip/setuptools do depending on certain flags passed can be quite difficult to guess.
But I think it should be possible to tell pip to use a build directory that lives outside of the source (without copying the full source itself); maybe that would help?
I think by default pip will create a build directory in python/build (https://github.com/pypa/pip/issues/10695)
Looking a bit further into it, pip was actually defaulting to an "out-of-source" build in the past, and only switched to in-tree builds by default in the last two years (https://pip.pypa.io/en/stable/topics/local-project-installs/#build-artifacts). So indeed, now it does an in-tree build and doesn't allow specifying a build directory; that's the responsibility of the build backend (setuptools) AFAIU. And from reading some issues related to this (e.g. https://github.com/pypa/build/issues/446, https://github.com/pypa/setuptools/issues/1816), it seems this is not easily configurable.
So in short, if we want to have the same out-of-source build as we had with older pip, it seems that you indeed need to do that manually yourself.
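Doing it manually could look roughly like the sketch below, which copies the sources into a separate build directory before building there. The helper name stage_out_of_source is hypothetical, and this is not necessarily what ci/scripts/python_build.sh does.

```shell
# Hypothetical helper: copy a source tree into a separate build directory so
# that files generated by the build land outside the original tree.
stage_out_of_source() {
  local source_dir="$1"
  local build_dir="$2"
  mkdir -p "${build_dir}"
  # "source_dir/." copies the directory contents, including dotfiles.
  cp -R "${source_dir}/." "${build_dir}/"
}

# Example (paths are assumptions):
#   stage_out_of_source /arrow/python /build/python
#   (cd /build/python && python3 -m pip install .)
```

A plain copy has no git metadata, so a version derived from git describe would still need separate handling.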
Thanks for looking into it. I see.
Describe the enhancement requested
If we use in-source build, we have files owned by root in source tree on host. Because we use root in Docker containers.
We should use out-of-source build to avoid creating files in source tree on host.
At least python/, js/ and java/ use in-source build.
Component(s)
Continuous Integration