Open dashanji opened 6 months ago
During the integration with llama.cpp, we found that import_kv_cache_buffer and export_kv_cache_buffer perform poorly, as follows.
The Import time is the sum of query vineyard + import_kv_cache_buffer. The Export time is the sum of update vineyard + export_kv_cache_buffer.
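To see which half of each figure dominates, the two phases can be timed separately instead of as one number. Below is a minimal sketch of that idea for the Import path. The functions `query_vineyard` and `import_kv_cache_buffer` here are hypothetical stand-ins with made-up signatures, not the real vineyard or llama.cpp calls, and `PhaseTimer` is a helper introduced only for illustration.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock seconds per named phase."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed


# Hypothetical stand-ins: replace with the real calls being measured.
def query_vineyard(tokens):
    # Pretend to fetch a serialized KV cache buffer for these tokens.
    return bytes(len(tokens))


def import_kv_cache_buffer(buf):
    # Pretend to load the buffer into the model's KV cache.
    return len(buf)


def timed_import(timer, tokens):
    # Time each phase on its own so the breakdown is visible.
    with timer.phase("query vineyard"):
        buf = query_vineyard(tokens)
    with timer.phase("import_kv_cache_buffer"):
        return import_kv_cache_buffer(buf)


timer = PhaseTimer()
timed_import(timer, list(range(1024)))
for name, seconds in timer.totals.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

The Export path could be split the same way, with one phase for update vineyard and one for export_kv_cache_buffer.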
/cc @sighingnow, this issue/PR has had no activity for a long time. Please help review its status and assign people to work on it.