google / jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Apache License 2.0

Add page attention manager and kvcache manager #166

Closed FanhaiLu1 closed 1 month ago

FanhaiLu1 commented 1 month ago

This PR adds two classes that are fundamental to page attention in JetStream:

PageAttentionManager:

This class allocates and frees page resources, computes page metadata, and supports cache insertion.

PageKVCacheGenerate:

This class updates decode caches in a page-attention format. Unlike the standard LLM KV cache shape ([batch_size, num_heads, seq_len, head_dim]), PageKVCache uses the shape [num_heads, total_num_pages, page_size, head_dim].
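To make the layout concrete, here is a minimal NumPy sketch of how a paged KV cache update could work. All names (`insert_token`, `page_table`, `seq_lens`) are illustrative assumptions for this sketch, not the PR's actual API: each sequence owns a list of pages, and each new decode token is written into the next free slot of its current page.

```python
import numpy as np

# Illustrative dimensions for the paged layout
# [num_heads, total_num_pages, page_size, head_dim].
num_heads, total_num_pages, page_size, head_dim = 2, 8, 4, 16

# Paged decode key cache (a value cache would be shaped the same way).
k_cache = np.zeros((num_heads, total_num_pages, page_size, head_dim))

# Hypothetical per-sequence bookkeeping: sequence i owns the pages in
# page_table[i], and seq_lens[i] counts tokens already written.
page_table = {0: [3, 5], 1: [0]}
seq_lens = {0: 6, 1: 2}

def insert_token(cache, seq_id, new_k):
    """Write one decode step's key vectors ([num_heads, head_dim])
    into the next free slot of the sequence's current page."""
    pos = seq_lens[seq_id]
    page = page_table[seq_id][pos // page_size]   # which physical page
    slot = pos % page_size                        # offset within the page
    cache[:, page, slot, :] = new_k
    seq_lens[seq_id] = pos + 1

new_k = np.ones((num_heads, head_dim))
insert_token(k_cache, 0, new_k)
# Token 6 of sequence 0 lands in its second page (page 5), slot 2.
assert k_cache[:, 5, 2, :].sum() == num_heads * head_dim
```

The point of the indirection through the page table is that a sequence's cache no longer needs to be contiguous in memory, so pages can be allocated and freed independently as sequences grow and finish.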