hpcaitech / EnergonAI

Large-scale model inference.
Apache License 2.0
630 stars 90 forks

[opt] add cache and modify api #135

Closed ver217 closed 2 years ago

ver217 commented 2 years ago

Add cache

For example, ListCache(50, 2, fixed_keys=fixed_cache_keys) will cache up to 50 keys (not counting the fixed keys), with each key mapped to up to 2 different values. LRU eviction is applied when the cache is full. Fixed keys always stay in the cache and are never evicted.
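The behavior described above can be sketched as follows. This is an illustrative re-implementation, not the actual EnergonAI `ListCache`; the constructor signature mirrors the example in this PR, but the internals are assumptions.

```python
from collections import OrderedDict

class ListCache:
    """Sketch of an LRU cache where each key maps to a bounded list of
    values, and a set of fixed keys is pinned (never evicted).
    Illustrative only; the real EnergonAI ListCache may differ."""

    def __init__(self, max_keys, values_per_key, fixed_keys=()):
        self.max_keys = max_keys              # capacity, excluding fixed keys
        self.values_per_key = values_per_key  # max values stored per key
        self.fixed = {k: [] for k in fixed_keys}  # pinned, never evicted
        self.lru = OrderedDict()              # evictable keys in LRU order

    def put(self, key, value):
        bucket = self.fixed.get(key)
        if bucket is None:
            if key not in self.lru and len(self.lru) >= self.max_keys:
                self.lru.popitem(last=False)  # evict least recently used key
            bucket = self.lru.setdefault(key, [])
            self.lru.move_to_end(key)         # mark key as recently used
        if len(bucket) >= self.values_per_key:
            bucket.pop(0)                     # drop oldest value for this key
        bucket.append(value)

    def get(self, key):
        if key in self.fixed:
            return self.fixed[key]
        if key in self.lru:
            self.lru.move_to_end(key)         # a hit refreshes recency
            return self.lru[key]
        return None
```

With `ListCache(50, 2, fixed_keys=fixed_cache_keys)`, up to 50 non-fixed keys are tracked, each holding at most 2 values, and the fixed keys sit outside the LRU bookkeeping entirely.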

Modify api

"max_tokens" in request is the same as the number of decode steps now.