mlc-ai / relax

Apache License 2.0

Add Attention Sinks (TVM portion) #301

Closed kmn1024 closed 6 months ago

kmn1024 commented 7 months ago

The TVM component of implementing Attention Sinks (https://arxiv.org/abs/2309.17453). See https://github.com/mlc-ai/mlc-llm/issues/1357

This API allows the caller to choose 1. how many slots to use as sinks, and 2. how far to trim the cache.

  1. Callers can pick a low number as in the paper, or a number large enough to keep the entire system prompt.
  2. The typical sliding-window approach would call this function after every append and trim down to max_window_size. For better performance, callers can trim less frequently but more aggressively.
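To illustrate the two knobs described above, here is a minimal Python sketch of sink-aware trimming. The function name `trim_with_sinks` and its parameters are hypothetical, not the actual TVM API; the KV cache is modeled as a plain list of per-token entries.

```python
# Hypothetical sketch of attention-sink trimming (not the actual TVM API).
# The first `num_sink_slots` entries are always kept as sinks; the rest of
# the cache behaves as a sliding window of the most recent tokens.

def trim_with_sinks(cache, num_sink_slots, max_window_size):
    """Trim `cache` so it holds the sink tokens plus at most
    `max_window_size - num_sink_slots` of the most recent tokens."""
    if len(cache) <= max_window_size:
        return cache  # under budget, nothing to trim
    sinks = cache[:num_sink_slots]
    recent = cache[len(cache) - (max_window_size - num_sink_slots):]
    return sinks + recent

# Typical sliding-window usage: trim after every append.
cache = []
for token in range(10):
    cache.append(token)
    cache = trim_with_sinks(cache, num_sink_slots=2, max_window_size=6)

print(cache)  # sinks [0, 1] plus the 4 most recent tokens: [0, 1, 6, 7, 8, 9]
```

Trimming less frequently (e.g. only when the cache exceeds some slack above `max_window_size`) gives the same final contents at lower bookkeeping cost, which is the performance trade-off the description mentions.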