Closed slvrtrn closed 6 months ago
@mshustov
What change gave the performance boost?
The `decode` function did a few unneeded passes over a huge string. It is far less efficient than the stream transformer we already had in the ResultSet.
Unless the JIT optimizes it out (it does not), that's at least one extra full pass, two in the worst case.
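To illustrate where the extra passes come from, here is a minimal sketch that contrasts a whole-string decoder with a single-pass one. Both functions (`decodeMultiPass`, `decodeSinglePass`) are hypothetical names made up for this example. They are not the removed `decode` function or the client's actual transformer, only an approximation of the general pattern.

```typescript
// Hypothetical illustration; not the client's actual code.

// Multi-pass: split() scans the whole string, filter() walks every line,
// and map() walks the remaining lines again to parse them.
function decodeMultiPass<T>(text: string): T[] {
  return text
    .split('\n')
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as T)
}

// Single pass: locate each newline with indexOf() and parse the row
// immediately, without building intermediate arrays of lines.
function decodeSinglePass<T>(text: string): T[] {
  const rows: T[] = []
  let start = 0
  while (start < text.length) {
    let end = text.indexOf('\n', start)
    if (end === -1) end = text.length // last row without a trailing newline
    if (end > start) rows.push(JSON.parse(text.slice(start, end)) as T)
    start = end + 1
  }
  return rows
}

// Two newline-delimited JSON rows, as in a JSONEachRow-style response.
const sample = '{"id":1}\n{"id":2}\n'
```

Both functions return the same rows for `sample`. The difference is only in how many times the input and the intermediate arrays are traversed, which matters more as the response string grows.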
Summary
Improved performance of the `ResultSet.json()` method when decoding the entire set of rows with streamable JSON formats (such as `JSONEachRow` or `JSONCompactEachRow`). Depending on the dataset, execution time drops by 10-15% (e.g. cell_towers) and by up to 40% (large rows with very long strings). NB: The actual streaming performance when consuming `ResultSet.stream()` (which was already fast) hasn't changed. Only the `ResultSet.json()` method used suboptimal stream processing in some instances, and now `ResultSet.json()` just consumes the same stream transformer provided by the `ResultSet.stream()` method.
Before:
After:
Removed the outdated `decode` function.
Updated exported types and doc entries for DataFormat.
Fixed weird flakiness in expect-type assertions
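
As a rough sketch of the idea behind the first summary item, a `json()`-style helper can collect rows by draining one incremental line decoder instead of re-decoding the whole response body. The `JsonLinesDecoder` class and the `collectAll` function are hypothetical names for this example and are not part of the client API. The sketch also assumes every row is terminated by a newline, so an unterminated tail is not emitted.

```typescript
// Hypothetical sketch; names are not part of the client API.
class JsonLinesDecoder<T> {
  // Tail of the previous chunk that has not seen its newline yet.
  private incomplete = ''

  // Decode all complete rows in a chunk and keep the unfinished tail.
  push(chunk: string): T[] {
    const rows: T[] = []
    const text = this.incomplete + chunk
    let start = 0
    let end = text.indexOf('\n', start)
    while (end !== -1) {
      if (end > start) rows.push(JSON.parse(text.slice(start, end)) as T)
      start = end + 1
      end = text.indexOf('\n', start)
    }
    this.incomplete = text.slice(start)
    return rows
  }
}

// Collects every row by feeding the chunks through the same decoder
// that a streaming consumer would use.
function collectAll<T>(chunks: string[]): T[] {
  const decoder = new JsonLinesDecoder<T>()
  const result: T[] = []
  for (const chunk of chunks) {
    for (const row of decoder.push(chunk)) result.push(row)
  }
  return result
}
```

For example, `collectAll(['{"id":1}\n{"i', 'd":2}\n'])` yields two rows even though the second row is split across a chunk boundary, because the decoder carries the incomplete tail over to the next chunk.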