currently blocks are capped at 65K records. i think it might make sense to create larger blocks (perhaps up to half a million or a million records) to save on repeated string hashing when doing GROUP BY operations in a query.
one way of doing this is to create megablocks: a megablock would be a block consisting of multiple fully defined sub-blocks. further thinking is required on how to use a megablock to reduce the number of string hash operations needed.
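one possible shape for this, sketched below under assumptions not stated above: suppose each sub-block stores its string column dictionary-encoded (a list of distinct strings plus per-record integer codes — the `SubBlock`, `MegaBlock`, and `group_by_count` names here are all hypothetical). the megablock could then hash each distinct string once, building per-sub-block translation tables from local codes to shared group ids, so the per-record GROUP BY loop does only integer lookups:

```python
from collections import defaultdict

class SubBlock:
    """Hypothetical sub-block with a dictionary-encoded string column."""
    def __init__(self, values):
        self.dictionary = sorted(set(values))           # distinct strings
        index = {s: i for i, s in enumerate(self.dictionary)}
        self.codes = [index[v] for v in values]         # per-record codes

class MegaBlock:
    """Groups sub-blocks; hashes each distinct string once per megablock."""
    def __init__(self, sub_blocks):
        self.sub_blocks = sub_blocks
        self.group_ids = {}       # string -> shared group id (one hash per distinct string)
        self.translations = []    # per sub-block: local code -> shared group id
        for sb in sub_blocks:
            table = []
            for s in sb.dictionary:
                if s not in self.group_ids:
                    self.group_ids[s] = len(self.group_ids)
                table.append(self.group_ids[s])
            self.translations.append(table)

    def group_by_count(self):
        # Per-record work is integer indexing only; no string hashing here.
        counts = defaultdict(int)
        for sb, table in zip(self.sub_blocks, self.translations):
            for code in sb.codes:
                counts[table[code]] += 1
        names = {gid: s for s, gid in self.group_ids.items()}
        return {names[g]: c for g, c in counts.items()}

mb = MegaBlock([SubBlock(["a", "b", "a"]), SubBlock(["b", "c"])])
print(mb.group_by_count())  # → {'a': 2, 'b': 2, 'c': 1}
```

with 65K-record sub-blocks, a string repeated across sub-blocks is hashed once per sub-block it appears in rather than once per record; the megablock collapses that further to once per megablock. whether the translation tables pay for themselves depends on how many distinct strings each sub-block carries.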