When calling ray.put on a large object (size >= 2GB) in client mode, python process segfaults in protobuf 4.x library. Although it works fine with protobuf 3.20.
I have a fix for the segfault in https://github.com/protocolbuffers/upb/pull/1338, but even with that patch ray.put raise DecodeError from protobuf library so it doesn't make it work on large objects.
What happened + What you expected to happen
When calling
ray.put
on a large object (size >= 2GB) in client mode, python process segfaults inprotobuf
4.x library. Although it works fine withprotobuf
3.20.I have a fix for the segfault in https://github.com/protocolbuffers/upb/pull/1338, but even with that patch
ray.put
raiseDecodeError
fromprotobuf
library so it doesn't make it work on large objects.To me it seems that ray should implement chunked put in https://github.com/ray-project/ray/blob/609b8e6151c190d1b5f18b2bfb0d2495b63e994e/python/ray/util/client/worker.py#L498-L514
Versions / Dependencies
Reproduction script
Code minimized from https://github.com/ray-project/ray/blob/609b8e6151c190d1b5f18b2bfb0d2495b63e994e/python/ray/util/client/worker.py#L480-L514 so it doesn't need to include a call to
ray.put
.Issue Severity
Medium: It is a significant difficulty but I can work around it.