zlobober opened 8 months ago
I concur that the read and write need to be streaming gRPC APIs; the gcs_grpc driver is likely a reasonable model for the kvstore driver code.
Note that the default gRPC server GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH is 4 MB; the chunk size should be configurable.
This issue relates to the grpc_kvstore driver, which is not yet available for use with vanilla tensorstore, but was discussed in https://github.com/google/tensorstore/pull/134 and is going to become public.
ML teams at my company use a privately patched version of tensorstore with the grpc_kvstore driver, whose backend is implemented by YTsaurus. The current gRPC protocol has the following problem: both Read and Write requests are one-shot, which limits the length of a value to 2 GiB, the fundamental upper limit on Protobuf message size. Reading or writing large blobs within a single request is also a bad idea in general, because it is not fault-tolerant.
Our proposal is to make Read and Write server-side and client-side streaming methods, respectively, limiting the size of a single message to something reasonable like 32 MiB; a sketch of the proposed shape is below. It seems this change can be made in a backward-incompatible manner, since the grpc_kvstore interface is not yet public and stable.
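To make the intent concrete, here is a minimal sketch of the proposed service shape. The service, message, and field names are illustrative assumptions rather than the actual kvstore.proto definitions, and details such as generations, byte ranges, and error reporting are omitted:

```proto
syntax = "proto3";

// Hypothetical shape for the streaming proposal (names are illustrative).
service KvStore {
  // Server-side streaming: the value is returned as a sequence of chunks.
  rpc Read(ReadRequest) returns (stream ReadResponse);

  // Client-side streaming: the value is uploaded as a sequence of chunks.
  rpc Write(stream WriteRequest) returns (WriteResponse);
}

message ReadRequest {
  bytes key = 1;
}

message ReadResponse {
  // At most the configured chunk size (e.g. 32 MiB) of value data per message.
  bytes value_chunk = 1;
}

message WriteRequest {
  // Set on the first message of the stream only.
  bytes key = 1;
  // One chunk of the value per message, each at most the configured chunk size.
  bytes value_chunk = 2;
}

message WriteResponse {}
```

With this shape the per-message limit bounds only the chunk size, not the total value length. Note that with 32 MiB chunks the server's GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH would also need to be raised above its 4 MB default, which is another reason to keep the chunk size configurable.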
If you are OK with this proposal, we would be glad to submit a PR implementing this idea.