-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Community Note
* Please vote on this issue by adding a :thumbsup: [reaction](https://blog.github.com…
-
### 🚀 The feature, motivation and pitch
Is it possible to somehow fix some of the KV cache for common instruction + specific part of the prompt so that it is reused across multiple inferences of the …
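
To make the question concrete, here is a toy sketch of prefix reuse in plain Python. It is not vLLM's API: `ToyPrefixCache`, its `encode` method, and the "state" it stores are hypothetical stand-ins for real per-token key/value tensors. It only shows the idea that a shared instruction prefix is computed once and later prompts pay only for their own suffix.

```python
from typing import Dict, List, Tuple


class ToyPrefixCache:
    """Hypothetical illustration: caches a stand-in 'KV state' per token prefix."""

    def __init__(self) -> None:
        self._states: Dict[Tuple[int, ...], List[int]] = {}
        self.tokens_computed = 0  # counts how many tokens needed fresh work

    def _extend(self, state: List[int], token: int) -> List[int]:
        # Stand-in for computing one token's keys/values.
        self.tokens_computed += 1
        return state + [token * 2]

    def encode(self, tokens: List[int]) -> List[int]:
        # Find the longest prefix of `tokens` that is already cached.
        start, state = 0, []
        for end in range(len(tokens), 0, -1):
            cached = self._states.get(tuple(tokens[:end]))
            if cached is not None:
                start, state = end, list(cached)
                break
        # Compute only the uncached suffix, caching each new prefix on the way.
        for i in range(start, len(tokens)):
            state = self._extend(state, tokens[i])
            self._states[tuple(tokens[: i + 1])] = list(state)
        return state


cache = ToyPrefixCache()
shared_instruction = [1, 2, 3]                 # common instruction tokens
first = cache.encode(shared_instruction + [4])   # computes 4 tokens
second = cache.encode(shared_instruction + [5])  # reuses 3, computes 1
```

After both calls, `cache.tokens_computed` is 5 rather than 8, because the second prompt reuses the state of the three shared tokens.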
-
**Describe the problem**
To prevent overloading the underlying store and to reduce the impact of rangefeeds on the foreground workload, the number of catchup iterators (pebble iterators used for ca…
-
In `performance_optimization/prompt_reuse.py`, the current method of storing the cached prompt does not correctly discard the KV cache for the last token (and instead follows the same caching recipe a…
-
**Is your feature request related to a problem? Please describe.**
The [hrana protocol](https://github.com/tursodatabase/libsql/blob/main/docs/HRANA_3_SPEC.md) specification is a standard protocol …
-
### Your current environment
```text
The output of `python collect_env.py`
```
### How would you like to use vllm
I'm implementing a custom algorithm that requires a custom generate met…
-
first proof of concept, https://github.com/mafintosh/smalltable
-
## Description
The idea is to provide a simple KV store that lets other third-party apps on IC consume yral infra while keeping their own app stateless, dumping data to individual canisters that are maintai…
-
https://developers.cloudflare.com/d1/platform/pricing/
Cloudflare recently launched D1, a cloud-native SQLite solution. The free tier allows 100000 writes per day, which is 100x more than KV.
…
-
- needs to be part of grid driver
- data needs to be compressed then encrypted (first compressed)
- set/get/delete based on a filepath (where data is) and name passed as argument
- tfchain doesn'…
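
The "compress first, then encrypt" ordering above can be motivated with a small experiment. This is a sketch under assumptions: `os.urandom` stands in for ciphertext (well-encrypted bytes look random), and `zlib` stands in for whatever compressor the driver would actually use. No real encryption is performed here.

```python
import os
import zlib

# Repetitive payload, standing in for typical structured data.
plaintext = b"key=value;" * 400

# Random bytes of the same length, standing in for encrypted output.
ciphertext_like = os.urandom(len(plaintext))

# Compressing before encryption: the redundancy is still visible.
compressed_plain = zlib.compress(plaintext)

# Compressing after encryption: high-entropy input leaves nothing to remove.
compressed_cipher = zlib.compress(ciphertext_like)

print(len(plaintext), len(compressed_plain), len(compressed_cipher))
```

The plaintext shrinks to a small fraction of its size, while the random stand-in stays at roughly its original length, which is why the compression step has to come before encryption.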