Closed mowoe closed 1 year ago
Hi, just chiming in briefly. I wasn't fully able to debug this problem, but I'll share what I know.
I don't believe test-bazel.sh#167 is the issue - the targets are ok. There seems to be an issue with the package structure. Can you upload the wheel you produced somewhere so I can inspect it?
Hi @rstz,
thanks for your reply. Here are the wheels I built (GitHub only supports zips): tensorflow_decision_forests-1.3.0-cp310-cp310-linux_aarch64.whl.zip tensorflow_decision_forests-1.3.0-cp310-cp310-manylinux_2_28_aarch64.whl.zip
Thanks! It looks like build_pip_package.sh is either not compiling or not properly copying over the ydf. The relevant lines are L119-L127 of build_pip_package.sh. Could you please check if the necessary files are there before copying (in particular the Python files like data_spec_pb2.py in bazel-bin/external/ydf/yggdrasil_decision_forests/dataset/)?
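A minimal sketch of the check suggested above, assuming it is run from the repository root after the bazel build has finished. The helper name check_generated_file is hypothetical and is not part of build_pip_package.sh; only the directory and file name come from the comment above.

```bash
#!/usr/bin/env bash
# Sketch: confirm that a generated file exists before the packaging step copies it.
# check_generated_file is a hypothetical helper, not part of build_pip_package.sh.

check_generated_file() {
  local dir="$1"   # directory expected to hold the generated file
  local name="$2"  # file name to look for, e.g. data_spec_pb2.py
  if [ -f "${dir}/${name}" ]; then
    echo "found: ${dir}/${name}"
    return 0
  fi
  echo "missing: ${dir}/${name}" >&2
  return 1
}

# Example call, run from the repository root after the bazel build:
# check_generated_file bazel-bin/external/ydf/yggdrasil_decision_forests/dataset data_spec_pb2.py
```

If the file is reported missing, the problem is probably in the compile step. If it is found but absent from the wheel, the copy step is the more likely culprit.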
Thank you so much for the hint @rstz! The actual problem turned out to be that my minimal Debian image did not include rsync (as expected), and the build script did not fail but rather just didn't execute the commands. Now that I added rsync, I was able to build the arm wheel successfully: tensorflow_decision_forests-1.3.0-cp310-cp310-manylinux_2_28_aarch64.whl.zip Sorry for wasting your time!
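For anyone hitting the same silent failure, a small guard at the top of a build script can surface a missing tool early. This is a sketch only: require_tools is a hypothetical helper and is not taken from the TF-DF build scripts.

```bash
#!/usr/bin/env bash
# Sketch: fail fast when a required command-line tool is not installed.
# require_tools is a hypothetical helper, not taken from the TF-DF build scripts.

require_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    # command -v succeeds only if the tool can be resolved in the current shell.
    if ! command -v "${tool}" >/dev/null 2>&1; then
      echo "error: required tool '${tool}' not found in PATH" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Example: abort before any copying starts if rsync is absent.
# require_tools rsync || exit 1
```

On a Debian-based image, rsync can usually be added with `apt-get install -y rsync` before running the build.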
yay!
If you're building this wheel for a specific project that you can share (either publicly or via email to me), feel free to do so, we're happy to know what people are working on with TF-DF
@rstz actually I built the arm wheel for a specific project, but it might be a bit underwhelming:
I use tf-df for my numer.ai model. Currently I have to run an IPython notebook every other day, which is a bit tedious, but numerai supports calling a webhook when submissions are due. Currently only AWS SageMaker FaaS stuff is documented, which I did not want to use for a number of reasons. Instead I wanted to use fission, which is an open-source k8s FaaS framework. As I am doing all of this for fun and no profit, I didn't feel like spending any money on managed k8s like GKE or EKS. Oracle Cloud Infrastructure supports a three-node managed k8s cluster in its forever-free tier, which I have used for other projects. This has one major drawback though: the compute instances are ARM instances. This is why I needed an arm wheel.
TL;DR I automated a ~2min task by spending hours trying to build the arm wheel for a made-up problem
> i automated a ~2min task by spending hours trying to build the arm wheel for a made-up problem

I love it
Thank you for reporting back, sharing the wheel you built and good luck in the competition!
Hi! I am trying to build an arm wheel, which is a lot more challenging than I originally thought. Currently I'm building the wheel while building a Docker image (see below for the Dockerfile). This Dockerfile builds the wheel just fine, but the build seems to be missing some things:
I suspect the build rules defined in test-bazel.sh#167 are not correct, but I am not experienced enough with bazel to find out the correct build rules.
Due to multiple issues, some pretty ugly patching is necessary:
patched_gcc (AVX not available in arm docker):
test_bazel.patch (remove all CUDA targets from TensorFlow):
build_pip_package.patch:
Sidenote: Setting FULL_COMPILATION=1 causes the build to fail because of some unrelated TensorFlow issues and shouldn't be necessary to build the library. As far as I can see it is the same issue described here. In any case, the error is in upstream TensorFlow.