Remove block objects from cache object meta to reduce the meta size.

Before this optimization, using kv_state_cache_benchmark_tes.cc (in https://github.com/v6d-io/v6d/pull/1816) with the following config:

constexpr int TENSORBYTES = 80;
constexpr int CAPACITY = 20000;
constexpr int LAYER = 64;
constexpr int BLOCK_SIZE = 100;

The token list length is 1900.

So the LLM cache object contains 370 * 128 = 47360 members, where 370 is the number of block objects contained in the cache object and 128 is the number of members in each block.

After this optimization, the cache object no longer holds the blocks as its members. The object with the largest number of members is therefore the cache block, which has 128 members.
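
The member-count arithmetic can be checked with a small sketch. It is not taken from the v6d sources: the function names and constants below are hypothetical, and only the figures 370, 128, and 47360 come from this description.

```cpp
#include <cstddef>

// Figures quoted in this description (not read from the v6d code).
constexpr std::size_t kBlockCount = 370;       // block objects in the cache object
constexpr std::size_t kMembersPerBlock = 128;  // members held by one cache block

// Hypothetical helper: before the change, every block's members are
// reachable from the cache object's meta, so the counts multiply.
constexpr std::size_t MembersInCacheObjectBefore(std::size_t blocks,
                                                 std::size_t membersPerBlock) {
  return blocks * membersPerBlock;
}

// Hypothetical helper: after the change, blocks are not members of the
// cache object, so the largest meta belongs to a single block.
constexpr std::size_t LargestMemberCountAfter(std::size_t membersPerBlock) {
  return membersPerBlock;
}

// Compile-time checks against the numbers stated in this description.
static_assert(MembersInCacheObjectBefore(kBlockCount, kMembersPerBlock) == 47360,
              "370 * 128 should equal 47360");
static_assert(LargestMemberCountAfter(kMembersPerBlock) == 128,
              "the largest object after the change is one block");
```

Under these assumptions, the largest meta shrinks from 47360 members to 128, a factor equal to the number of blocks (370).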

Remove block objects from cache object meta to reduce the meta size. #1806
