Open h-vetinari opened 1 year ago
Noticed the following in the logs (on main, passing builds, https://github.com/conda-forge/ray-packages-feedstock/runs/10863439621). Pretty interesting stuff!
2023-01-24T21:20:07.3483964Z INFO: Analyzed 2 targets (153 packages loaded, 21326 targets configured).
2023-01-24T21:20:07.3519661Z INFO: Found 2 targets...
2023-01-24T21:20:07.4415336Z [0 / 9] [Prepa] BazelWorkspaceStatusAction stable-status.txt
2023-01-24T21:20:16.3821713Z [14 / 1,961] Compiling src/google/protobuf/compiler/cpp/cpp_field.cc; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:20:27.6915280Z [21 / 1,961] Compiling src/google/protobuf/compiler/cpp/cpp_message.cc; 6s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:20:37.1167440Z INFO: From Compiling src/google/protobuf/message_lite.cc:
2023-01-24T21:20:37.1191266Z In file included from /home/conda/feedstock_root/build_artifacts/ray-packages_1674594838659/_build_env/bin/../x86_64-conda-linux-gnu/sysroot/usr/include/string.h:638,
2023-01-24T21:20:37.1192104Z from external/com_google_protobuf/src/google/protobuf/stubs/port.h:39,
2023-01-24T21:20:37.1192619Z from external/com_google_protobuf/src/google/protobuf/stubs/common.h:48,
2023-01-24T21:20:37.1193112Z from external/com_google_protobuf/src/google/protobuf/message_lite.h:45,
2023-01-24T21:20:37.1193612Z from external/com_google_protobuf/src/google/protobuf/message_lite.cc:36:
2023-01-24T21:20:37.1194141Z In function 'void* memcpy(void*, const void*, size_t)',
2023-01-24T21:20:37.1194953Z inlined from 'uint8_t* google::protobuf::io::EpsCopyOutputStream::WriteRaw(const void*, int, uint8_t*)' at external/com_google_protobuf/src/google/protobuf/io/coded_stream.h:706:16,
2023-01-24T21:20:37.1196208Z inlined from 'virtual uint8_t* google::protobuf::internal::ImplicitWeakMessage::_InternalSerialize(uint8_t*, google::protobuf::io::EpsCopyOutputStream*) const' at external/com_google_protobuf/src/google/protobuf/implicit_weak_message.h:84:28,
2023-01-24T21:20:37.1197520Z inlined from 'bool google::protobuf::MessageLite::SerializePartialToZeroCopyStream(google::protobuf::io::ZeroCopyOutputStream*) const' at external/com_google_protobuf/src/google/protobuf/message_lite.cc:412:30:
2023-01-24T21:20:37.1199102Z /home/conda/feedstock_root/build_artifacts/ray-packages_1674594838659/_build_env/bin/../x86_64-conda-linux-gnu/sysroot/usr/include/bits/string3.h:51:33: warning: 'void* __builtin___memcpy_chk(void*, const void*, long unsigned int, long unsigned int)' specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
2023-01-24T21:20:37.1200145Z 51 | return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
2023-01-24T21:20:37.1200619Z | ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2023-01-24T21:20:40.0947346Z [37 / 1,961] Compiling src/google/protobuf/compiler/csharp/csharp_reflection_class.cc; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:20:53.8654923Z [55 / 1,961] Compiling src/google/protobuf/compiler/java/java_map_field.cc; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:21:09.8994672Z [74 / 1,961] Compiling src/google/protobuf/compiler/plugin.pb.cc; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:21:27.9441159Z [95 / 1,961] Compiling src/google/protobuf/io/printer.cc; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:21:48.7495496Z [111 / 1,961] Compiling src/google/protobuf/compiler/objectivec/objectivec_primitive_field.cc; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:22:13.6192243Z [138 / 1,961] Compiling src/google/protobuf/extension_set.cc; 2s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:22:41.9635614Z [186 / 2,150] Compiling src/compiler/node_generator.cc [for host]; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:23:14.0442354Z [231 / 2,150] Compiling src/google/protobuf/generated_message_tctable_lite.cc [for host]; 1s processwrapper-sandbox ... (2 actions running)
2023-01-24T21:23:50.5721139Z [268 / 2,150] Compiling src/google/protobuf/any.pb.cc [for host]; 0s processwrapper-sandbox ... (2 actions running)
2023-01-24T21:24:32.5789497Z [310 / 2,150] Compiling src/google/protobuf/compiler/java/java_primitive_field.cc [for host]; 1s processwrapper-sandbox ... (2 actions running)
2023-01-24T21:24:58.1737035Z INFO: From Compiling src/google/protobuf/message_lite.cc [for host]:
2023-01-24T21:24:58.1780300Z In file included from /home/conda/feedstock_root/build_artifacts/ray-packages_1674594838659/_build_env/bin/../x86_64-conda-linux-gnu/sysroot/usr/include/string.h:638,
2023-01-24T21:24:58.1781514Z from external/com_google_protobuf/src/google/protobuf/stubs/port.h:39,
2023-01-24T21:24:58.1782102Z from external/com_google_protobuf/src/google/protobuf/stubs/common.h:48,
2023-01-24T21:24:58.1791404Z from external/com_google_protobuf/src/google/protobuf/message_lite.h:45,
2023-01-24T21:24:58.1792308Z from external/com_google_protobuf/src/google/protobuf/message_lite.cc:36:
2023-01-24T21:24:58.1793098Z In function 'void* memcpy(void*, const void*, size_t)',
2023-01-24T21:24:58.1794152Z inlined from 'uint8_t* google::protobuf::io::EpsCopyOutputStream::WriteRaw(const void*, int, uint8_t*)' at external/com_google_protobuf/src/google/protobuf/io/coded_stream.h:706:16,
2023-01-24T21:24:58.1795623Z inlined from 'virtual uint8_t* google::protobuf::internal::ImplicitWeakMessage::_InternalSerialize(uint8_t*, google::protobuf::io::EpsCopyOutputStream*) const' at external/com_google_protobuf/src/google/protobuf/implicit_weak_message.h:84:28,
2023-01-24T21:24:58.1797153Z inlined from 'bool google::protobuf::MessageLite::SerializePartialToZeroCopyStream(google::protobuf::io::ZeroCopyOutputStream*) const' at external/com_google_protobuf/src/google/protobuf/message_lite.cc:412:30:
2023-01-24T21:24:58.1798928Z /home/conda/feedstock_root/build_artifacts/ray-packages_1674594838659/_build_env/bin/../x86_64-conda-linux-gnu/sysroot/usr/include/bits/string3.h:51:33: warning: 'void* __builtin___memcpy_chk(void*, const void*, long unsigned int, long unsigned int)' specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
2023-01-24T21:24:58.1800554Z 51 | return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
2023-01-24T21:24:58.1801107Z | ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2023-01-24T21:25:20.9009284Z [416 / 2,281] Compiling src/compiler/csharp_generator.cc [for host]; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:25:22.1662427Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/reflection/v1alpha/reflection.grpc.pb.h:
2023-01-24T21:25:22.1669560Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:51.6341538Z INFO: From Generating Descriptor Set proto_library @com_github_cncf_udpa//xds/type/v3:pkg:
2023-01-24T21:25:51.6347480Z xds/type/v3/typed_struct.proto:10:1: warning: Import validate/validate.proto is unused.
2023-01-24T21:25:52.6425750Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/channelz/channelz.grpc.pb.h:
2023-01-24T21:25:52.6442802Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:52.7080488Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/testing/xds/v3/percent.grpc.pb.h:
2023-01-24T21:25:52.7082213Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:52.7434268Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/testing/xds/v3/base.grpc.pb.h:
2023-01-24T21:25:52.7444334Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:52.7844053Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/testing/xds/v3/config_dump.grpc.pb.h:
2023-01-24T21:25:52.7858832Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:52.8146701Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/testing/xds/v3/csds.grpc.pb.h:
2023-01-24T21:25:52.8195413Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:26:17.3034890Z [719 / 2,496] Compiling src/idl_gen_rust.cpp [for host]; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:27:24.9172630Z [1,843 / 3,534] Compiling python/ray/_raylet.cpp; 52s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:28:39.2553777Z [1,954 / 3,534] Compiling src/google/protobuf/wire_format.cc; 3s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:30:05.1828782Z [2,045 / 3,534] Compiling src/ray/common/bundle_spec.cc; 8s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:31:43.7234689Z [2,136 / 3,534] Compiling src/cpp/server/server_cc.cc; 4s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:33:36.9333443Z [2,285 / 3,534] Compiling src/ray/raylet/node_manager.cc; 13s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:35:49.1076702Z [2,396 / 3,534] Compiling src/ray/core_worker/core_worker_process.cc; 9s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:38:19.0661531Z [2,533 / 3,534] Compiling src/ray/raylet/scheduling/policy/bundle_scheduling_policy.cc; 12s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:41:11.3956382Z [2,723 / 3,534] Compiling src/ray/raylet/agent_manager.cc; 16s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:44:29.9164559Z [2,954 / 3,534] Compiling src/core/ext/filters/client_channel/lb_policy/grpclb/grpclb.cc; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:48:19.1225621Z [3,208 / 3,534] Compiling src/ray/core_worker/transport/direct_task_transport.cc; 15s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:52:42.6329482Z [3,508 / 3,534] Compiling src/core/lib/iomgr/tcp_posix.cc; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:52:54.5383055Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/health/v1/health.grpc.pb.h:
2023-01-24T21:52:54.5384340Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:57:45.8430389Z [3,778 / 3,792] Compiling src/ray/gcs/gcs_server/gcs_actor_manager.cc; 5s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T22:03:36.1203303Z [4,049 / 4,054] [Prepa] Linking cpp/libray_api.lo
2023-01-24T22:03:39.2502551Z INFO: Elapsed time: 2670.604s, Critical Path: 223.47s
Ahh, hang on, that is the warning that is failing the aarch64 builds in #92. So it was there all the time and the difference is a -Werror or so?
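For what it's worth, the size range in the warning is exactly what a negative 32-bit int turns into when it is converted to a 64-bit size_t. That fits the `int` size parameter of `WriteRaw(const void*, int, uint8_t*)` shown in the log. A quick check in Python, assuming a 32-bit int and a 64-bit size_t (the log does not state either width):

```python
# Assumptions (not stated in the log): int is 32 bits, size_t is 64 bits.
INT_BITS = 32
SIZE_T_BITS = 64


def int_to_size_t(value):
    """Mimic C's conversion of a signed int to an unsigned 64-bit size_t."""
    return value % (1 << SIZE_T_BITS)


INT_MIN = -(1 << (INT_BITS - 1))

lowest = int_to_size_t(INT_MIN)  # most negative int, reinterpreted
highest = int_to_size_t(-1)      # least negative int, reinterpreted
max_object_size = (1 << (SIZE_T_BITS - 1)) - 1  # largest object size on a 64-bit target

print(lowest)           # 18446744071562067968
print(highest)          # 18446744073709551615
print(max_object_size)  # 9223372036854775807
```

So the compiler appears to be warning about the case where the size is negative. That is consistent with the warning showing up on passing builds too.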
Do we have any bazel experts around who could either remove the build altogether or figure out how to ignore that error?
You mean in conda-forge or upstream? We will try to adapt this build to make it work (with bazel). We have a specific toolchain that we likely have to use: https://github.com/conda-forge/bazel-toolchain-feedstock (jaxlib is an example of using this toolchain successfully, and the tensorflow build relies on a similarly modified toolchain).
Do you have specific needs, @mattip? I am planning to attempt fixing this in the coming weeks, but I can also try to make some effort sooner.
Our bazel expert is @xhochy who may not be free these days (we miss you if you see this!)
It seems tensorflow has a whole scheme to allow using system libraries. Is this build deps the parallel in ray? How would that look for a local grpc?
We use this sort of thing in jaxlib: https://github.com/conda-forge/jaxlib-feedstock/blob/77c8ef863a48afae4654c4adc5962232f807cf8e/recipe/build.sh#L70
We also tend to edit bazelrc files like this: https://github.com/conda-forge/jaxlib-feedstock/blob/77c8ef863a48afae4654c4adc5962232f807cf8e/recipe/build.sh#L14-L26
Another good example is the tensorflow build: https://github.com/conda-forge/tensorflow-feedstock/blob/main/recipe/build.sh
The main issue for me is whether or not we will have to do a lot of deep patching to get this to work. I am not that familiar with the build setup of ray yet
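As a rough illustration of the bazelrc-editing approach applied to the warning from the log, here is a minimal sketch that generates the extra lines. `--copt` and `--host_copt` are existing bazel options and `-Wno-error=stringop-overflow` is an existing GCC flag. The helper name `werror_override_lines` is made up for this example, and whether these flags actually reach the protobuf compile actions in ray's build is untested.

```python
def werror_override_lines(warning="stringop-overflow"):
    """Return .bazelrc lines that keep `warning` from being promoted to an error.

    Hypothetical helper: it only builds the text, it does not touch any file.
    """
    flag = f"-Wno-error={warning}"
    return [
        f"build --copt={flag}",       # target configuration
        f"build --host_copt={flag}",  # host configuration ("[for host]" actions in the log)
    ]


if __name__ == "__main__":
    # Print the lines so a build script could append them to its bazelrc.
    print("\n".join(werror_override_lines()))
```

The output could be appended to the bazelrc from the recipe's build script. The warning would still be printed, it just would not fail the build.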
We use this sort of thing in jaxlib
That passes TF_SYSTEM_LIBS down to tensorflow, which has a whole scheme to allow using system libraries. This mechanism does not exist so far in ray.
Is anyone still working on this? Having such an old version pinned here is starting to cause some problems for us.
Not that I'm aware of. Please feel free to have a go and tag people in this issue so that we can keep track and help if we can.
Ray 2.4.0 pins to <1.49 like upstream ray on darwin. Would changing to exactly the upstream pinning (<1.51.3 on non-darwin) help your use-case?
Yes, this would help a lot actually.
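To make the platform-dependent bound concrete, here is a small sketch. It assumes the pinned package is grpcio (as the later comment about grpcio suggests). The names `UPPER_BOUNDS` and `grpcio_allowed` are made up for this example, and the version parsing is deliberately naive (plain dotted integers only, no pre-releases).

```python
# Exclusive upper bounds quoted in this thread; the dict layout is illustrative.
UPPER_BOUNDS = {"darwin": "1.49", "default": "1.51.3"}


def parse_version(text):
    """Turn '1.51.3' into (1, 51, 3); handles plain dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def grpcio_allowed(version, platform):
    """Return True if `version` is below the upper bound for `platform`."""
    bound = UPPER_BOUNDS.get(platform, UPPER_BOUNDS["default"])
    return parse_version(version) < parse_version(bound)


print(grpcio_allowed("1.48.2", "darwin"))  # True
print(grpcio_allowed("1.49.1", "darwin"))  # False
print(grpcio_allowed("1.51.1", "linux"))   # True
print(grpcio_allowed("1.51.3", "linux"))   # False
```

This is an upper-bound guard, not an exact pin: any version below the bound is accepted.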
Good news: dealing with external deps in bazel might finallyyyyyyy be getting easier: https://github.com/conda-forge/tensorflow-feedstock/issues/332
It requires bazel 6, which does not seem to work. See ray-project/ray#31504
Yeah, but compatibility with modern bazel is mostly just a question of time. The important update here IMO is the new capabilities that'll finally allow us to improve the (un)vendoring situation here.
@mattip is there anything preventing an update of ray to 2.9.0? That should bring the grpcio version to 1.59.
EDIT: we should guard at <1.59 not pin it.
I found this article about using native libraries in bazel builds. It seems we could add some patches to replace the grpc and protobuf builds with the conda-provided ones?
I think we could try to use these from conda: libabseil (instead of @com_google_absl/*), 'grpc' (instead of @com_github_grpc_grpc/*), 'protobuf' (instead of @com_google_protobuf), all from upstream BUILD.bazel
The biggest problem with this is how hard it is (for me at least) to tell bazel to use "foreign" libraries.
Originally posted by @h-vetinari in https://github.com/conda-forge/ray-packages-feedstock/issues/87#issuecomment-1376923344