Closed parthchandra closed 2 months ago
@andygrove FYI. This builds an uber jar but does not include the script to deploy it. That script can go in a separate PR. Note: this includes support for macOS binaries, but that part does not work yet because the build breaks when compiling BLAKE3. The macOS build is skipped if the Xcode library is not provided.
The macOS build hits this issue: https://github.com/BLAKE3-team/BLAKE3/issues/180. I will try the suggested solutions.
@andygrove I changed the script to build a different image for each architecture instead of a single multi-arch image. It makes things much simpler at the cost of having multiple images (and a small increase in build time). It also removes the need to have the local container store and will also work with a custom docker backend as long as the backend supports docker build.
I'm hoping this addresses some of the authentication issues you are seeing.
Also, I have noticed I get some network errors doing the build inside a container when on a VPN. Maybe we could try when not on a VPN?
Anyway, could you take this for a spin?
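For illustration, the per-architecture approach described above could look roughly like the sketch below. The image name `comet-binary-builder` and the `RUN` switch are hypothetical and not taken from the PR, and `--platform` assumes a Docker backend with BuildKit-style platform support. The loop only prints the commands unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Sketch: one image per architecture instead of a single multi-arch image.
# IMAGE_NAME is a hypothetical name; the actual script may use a different one.
IMAGE_NAME="comet-binary-builder"
ARCHS="amd64 arm64"

# Print the docker build command for one architecture.
build_cmd() {
  printf 'docker build --platform linux/%s -t %s:%s .\n' "$1" "$IMAGE_NAME" "$1"
}

# Dry run by default: print each command. Set RUN=1 to execute them.
for arch in $ARCHS; do
  cmd=$(build_cmd "$arch")
  if [ "${RUN:-0}" = "1" ]; then
    $cmd
  else
    echo "$cmd"
  fi
done
```

Because each image is tagged with its architecture, no multi-arch manifest or local container store is needed; the trade-off is one image per architecture.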
@andygrove @viirya For the binary builder I chose to use Ubuntu 20.04 as the base image because that is the image we currently use for our published docker images.
Ubuntu 20.04 has glibc 2.31, which means that many Red Hat based releases will be incompatible because they have an older glibc version. CentOS 7, for instance, has glibc 2.17 (see: https://gist.github.com/wagenet/35adca1a032cec2999d47b6c40aa45b1).
Should we consider using an older version of Ubuntu? (BTW, I tried to build against an older version of glibc, but the build kept failing for one reason or another, so I abandoned that effort.)
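To make the compatibility concern concrete, here is a minimal sketch of the version comparison involved. The function name `glibc_compatible` is made up for illustration. It applies the conservative rule that the host glibc must be at least as new as the build glibc; in practice a library only requires the newest glibc symbol version it actually references, which can be lower than the build version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when a host with glibc version $2 is at least
# as new as the glibc version $1 that the native library was built against.
# Versions are expected in MAJOR.MINOR form, e.g. 2.31.
glibc_compatible() {
  local build_major=${1%%.*} build_minor=${1#*.}
  local host_major=${2%%.*} host_minor=${2#*.}
  if [ "$host_major" -ne "$build_major" ]; then
    [ "$host_major" -gt "$build_major" ]
  else
    [ "$host_minor" -ge "$build_minor" ]
  fi
}

# Example: a library built on Ubuntu 20.04 (glibc 2.31) checked against CentOS 7 (glibc 2.17).
if glibc_compatible 2.31 2.17; then
  echo "compatible"
else
  echo "host glibc too old"
fi
```

On a glibc-based host, `getconf GNU_LIBC_VERSION` reports the installed version, which could be passed as the second argument.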
Hmm, I think for OSS Comet we don't have a restriction on the supported glibc for platform compatibility. Glibc 2.31 seems to have been released in 2020. I think it is old enough for the compatibility of our binary release. For example, CentOS 7 is already EOL (https://blog.centos.org/2023/04/end-dates-are-coming-for-centos-stream-8-and-centos-linux-7/)
Ubuntu 20.04 looks like a reasonable choice.
I personally wouldn't want to spend too much effort on resolving build issues on older versions of Ubuntu.
I ran the scripts locally and they seem to have worked.
I ran this command:
./dev/release/build-release-comet.sh -r https://github.com/parthchandra/datafusion-comet.git -b binary-build
The resulting jar file contains the following native libs:
% jar tvf spark/target/comet-spark-spark3.4_2.12-0.3.0-SNAPSHOT.jar | grep libcomet
149504624 Wed Jan 22 15:10:16 MST 2020 org/apache/comet/darwin/aarch64/libcomet.dylib
52964152 Wed Jan 22 15:10:16 MST 2020 org/apache/comet/linux/aarch64/libcomet.so
56773320 Wed Jan 22 15:10:16 MST 2020 org/apache/comet/linux/amd64/libcomet.so
The artifact
149504624 Wed Jan 22 15:10:16 MST 2020 org/apache/comet/darwin/aarch64/libcomet.dylib
seems to be left over from a manual run. The script does not build macOS binaries at the moment.
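A check along these lines could confirm that the jar contains the Linux native libraries the script is expected to produce. This is only a sketch: the function name `missing_native_libs` is hypothetical, and the expected paths are copied from the listing above.

```shell
#!/usr/bin/env bash
# Native library paths the release jar is expected to contain
# (copied from the jar listing in this thread; macOS is excluded).
expected_libs="org/apache/comet/linux/aarch64/libcomet.so
org/apache/comet/linux/amd64/libcomet.so"

# Hypothetical helper: reads a jar listing (one path per line, as printed by
# `jar tf`) on stdin and prints every expected path that is absent.
missing_native_libs() {
  local listing lib
  listing=$(cat)
  for lib in $expected_libs; do
    printf '%s\n' "$listing" | grep -qxF "$lib" || printf '%s\n' "$lib"
  done
}
```

For example, `jar tf spark/target/comet-spark-spark3.4_2.12-0.3.0-SNAPSHOT.jar | missing_native_libs` prints nothing when both libraries are present.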
@andygrove thank you for testing! This is ready for review.
@viirya Any further comments?
Looks good to me, with a few minor questions.
Which issue does this PR close?
Closes #721
Rationale for this change
Allows us to publish artifacts to Maven
What changes are included in this PR?
Scripts and a Dockerfile to do the binary build in a Docker container and include the resulting native libraries in an uber jar
How are these changes tested?
Locally.