Why are these changes needed?
We need the Batcher to fall back to Gob chunk encoding when chunks returned from the Encoder do not have an encoding format specified.
The reason is this case:
So the Encoder will produce Gob chunks, and when the Batcher tries to find out the format, it'll get ChunkEncodingFormat_UNKNOWN. This shouldn't be treated as an error; instead, the chunks should be treated as Gob.
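The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual Batcher code: the `ChunkEncodingFormat` type, its constants, and `resolveFormat` are hypothetical names chosen for this example; only the UNKNOWN-to-Gob behavior comes from the description.

```go
package main

import "fmt"

// ChunkEncodingFormat is a hypothetical stand-in for the enum carried in the
// Encoder response. The zero value models "format not specified".
type ChunkEncodingFormat int

const (
	ChunkEncodingFormatUnknown ChunkEncodingFormat = iota
	ChunkEncodingFormatGnark
	ChunkEncodingFormatGob
)

// resolveFormat maps the format reported by the Encoder to the format the
// Batcher should assume. An unspecified format is treated as Gob rather than
// as an error, since an Encoder that does not report a format produces Gob.
func resolveFormat(reported ChunkEncodingFormat) (ChunkEncodingFormat, error) {
	switch reported {
	case ChunkEncodingFormatUnknown:
		// Fallback: no format in the response means Gob chunks.
		return ChunkEncodingFormatGob, nil
	case ChunkEncodingFormatGob, ChunkEncodingFormatGnark:
		return reported, nil
	default:
		// Any value outside the known set is still rejected.
		return ChunkEncodingFormatUnknown, fmt.Errorf("unsupported chunk encoding format: %d", reported)
	}
}
```

With this shape, a response that leaves the format unset resolves to Gob, while an explicitly reported GOB or GNARK value is passed through unchanged.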
Compatibility/correctness reasoning
Let's denote the versions of Encoder as:
ENCODER_ENABLE_GNARK_CHUNK_ENCODING as false
ENCODER_ENABLE_GNARK_CHUNK_ENCODING as true
And versions of Batcher as:
BATCHER_ENABLE_GNARK_CHUNK_ENCODING as false
BATCHER_ENABLE_GNARK_CHUNK_ENCODING as true
The reasoning and testing of the compatibility of 9 combinations:
ChunkEncodingFormat_UNKNOWN, but correctly falls back to Gob
E1-B0: the Encoder returns Gob, with chunk_encoding_format set as GOB in the response; the Batcher doesn't care about chunk_encoding_format and just gets the chunks and treats them as Gob, which is correct
E1-B1: similar to E1-B0, except that the Batcher will pass the chunk_encoding_format to the Dispatcher, which will send the chunks to the Node as they are
ENCODER_ENABLE_GNARK_CHUNK_ENCODING as true: chunk_encoding_format set to GNARK; the Batcher will then understand it's GNARK, but it'll convert the chunks from GNARK to GOB on the fly before sending them to the Node, at https://github.com/Layr-Labs/eigenda/blob/cacdc21d65e8634ab3ced4e6c2dcb6ef0fdd2515/disperser/batcher/grpc/dispatcher.go#L393
Testing
All of the above combinations tested in preprod.
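As a rough illustration of what such checks look at, the sketch below encodes only the outcomes stated in the compatibility reasoning, for a Batcher that converts GNARK to GOB before dispatch: an unspecified format is treated as Gob, GOB is sent as is, and GNARK is converted. The `Format` type, `nodeAction`, and `planDispatch` are hypothetical names for this example and do not correspond to code in the repository.

```go
package main

// Format is a hypothetical stand-in for the chunk_encoding_format value.
type Format string

const (
	Unknown Format = "UNKNOWN"
	Gob     Format = "GOB"
	Gnark   Format = "GNARK"
)

// nodeAction describes what happens to chunks before they reach the Node.
type nodeAction struct {
	convert bool   // true when chunks are re-encoded before sending
	sent    Format // format the Node ends up receiving
}

// planDispatch models a Batcher that always delivers Gob to the Node:
// GNARK chunks are converted on the fly, everything else (GOB, or an
// unspecified format that falls back to Gob) is sent unchanged.
func planDispatch(reported Format) nodeAction {
	if reported == Gnark {
		return nodeAction{convert: true, sent: Gob}
	}
	return nodeAction{convert: false, sent: Gob}
}
```

In this model the Node receives Gob in every listed case, and the conversion step runs only when the Encoder reports GNARK.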
Checks