Open zuston opened 3 days ago
I suggest using a property other than spark.io.compression.codec, since that one is used for broadcast/shuffle, where data goes over the network. For local spilling we would like a lightweight compression algorithm such as lz4/snappy.
I would prefer a property like blaze.spill.compression.codec. What do you think?
Also, have you benchmarked spilling with zstd? If I understand correctly, it will perform worse than lz4/snappy.
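To make the proposal concrete, here is a rough sketch of how a dedicated spill property could be resolved independently of spark.io.compression.codec. Everything in it is an assumption for illustration: the `SpillCodec` enum, the `resolve_spill_codec` function, and the flat key/value map are hypothetical, and blaze.spill.compression.codec is only the name suggested above, not an existing option. The lz4 default mirrors the codec the issue says is currently used for spilling.

```rust
use std::collections::HashMap;

/// Codecs mentioned in this thread. Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SpillCodec {
    Lz4,
    Snappy,
    Zstd,
}

/// Property name proposed in this thread (an assumption, not an existing option).
const SPILL_CODEC_KEY: &str = "blaze.spill.compression.codec";

/// Resolve the spill codec from a flat key/value configuration.
/// `spark.io.compression.codec` is deliberately not consulted, so the
/// network-oriented setting does not leak into local spilling.
fn resolve_spill_codec(conf: &HashMap<String, String>) -> Result<SpillCodec, String> {
    match conf.get(SPILL_CODEC_KEY) {
        // Key not set: fall back to lz4.
        None => Ok(SpillCodec::Lz4),
        Some(raw) => match raw.trim().to_ascii_lowercase().as_str() {
            "lz4" => Ok(SpillCodec::Lz4),
            "snappy" => Ok(SpillCodec::Snappy),
            "zstd" => Ok(SpillCodec::Zstd),
            other => Err(format!("unsupported spill codec: {}", other)),
        },
    }
}

fn main() {
    let mut conf: HashMap<String, String> = HashMap::new();
    conf.insert("spark.io.compression.codec".to_string(), "zstd".to_string());

    // Only the shuffle/broadcast codec is set, so spilling keeps the lz4 default.
    println!("{:?}", resolve_spill_codec(&conf));

    // Opting in to zstd for spilling through the dedicated key.
    conf.insert(SPILL_CODEC_KEY.to_string(), "zstd".to_string());
    println!("{:?}", resolve_spill_codec(&conf));
}
```

With this shape, a job can keep zstd for shuffle/broadcast and still spill with a lighter codec. Whether zstd is actually slower for spilling would still need a benchmark.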
> I suggest using a property other than spark.io.compression.codec, since that one is used for broadcast/shuffle, where data goes over the network. For local spilling we would like a lightweight compression algorithm such as lz4/snappy. I would prefer a property like blaze.spill.compression.codec. What do you think?
Using a separate option is fine with me.
> Also, have you benchmarked spilling with zstd? If I understand correctly, it will perform worse than lz4/snappy.
I haven't. I'm still reading this part of the code.
And I think we can still reuse the IoCompressionReader/Writer. WDYT? @richox
Is your feature request related to a problem? Please describe.
In the current codebase, the lz4 codec is used for spilling. zstd should be supported as well.
Additional context
I will work on this if the project owner has no objection.