-
Nice work!
I am wondering whether this attention-sink magic is still needed for LLMs that have already been trained with window attention (e.g. [mistral](https://github.com/mistralai/mistral-src)). …
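For context, the attention-sink idea can be sketched as a KV-cache eviction rule: keep the first few "sink" token positions plus a sliding window of the most recent positions, evicting everything in between. This is a minimal illustration under that assumption, not the streaming-llm implementation; the function name and defaults are hypothetical.

```python
# Hypothetical sketch of attention-sink KV-cache eviction: retain the first
# n_sink positions plus the most recent `window` positions.
def sink_cache_positions(seq_len, n_sink=4, window=1020):
    """Return the KV-cache positions retained after eviction."""
    if seq_len <= n_sink + window:
        return list(range(seq_len))  # nothing to evict yet
    return list(range(n_sink)) + list(range(seq_len - window, seq_len))

# With 8 tokens, 2 sinks, and a window of 4, positions 2 and 3 are evicted:
print(sink_cache_positions(8, n_sink=2, window=4))  # [0, 1, 4, 5, 6, 7]
```

The question above is whether a model trained with sliding-window attention from the start (like Mistral) still needs the sink positions, or whether the plain window suffices.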
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is this question answered in the FAQ? | Is there an existing…
-
I naively tried adding examples to https://github.com/mit-han-lab/streaming-llm/blob/main/data/mt_bench.jsonl, including examples with a length of 4k tokens, without changing anything in the script. I r…
-
Hi
https://colab.research.google.com/drive/1YtXE_JKVntkGK14Yo9thjCjPMVzhA71d?usp=sharing
Here is the Colab notebook, but it doesn't finish running in Colab; it stops after a while, apparently due to memory overload or something…
-
About int8_kv_cache, I ran some tests:
> The test model is mistral-7b.
> My test inference code is based on `run.py`, with timing statistics added around `runner.generate` and warm-up code added.
> Input…
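The warm-up-then-time pattern described above can be sketched like this. `generate` here stands in for `runner.generate` and is an assumption, not the real TensorRT-LLM API; the point is only that untimed warm-up calls come first so one-time initialization costs don't skew the average.

```python
import time

# Hedged sketch of a warm-up + timing harness, assuming `generate(prompt)`
# is a callable that runs one inference.
def timed_generate(generate, prompt, warmup=2, iters=5):
    for _ in range(warmup):      # warm-up runs, excluded from timing
        generate(prompt)
    start = time.perf_counter()
    for _ in range(iters):       # timed runs
        generate(prompt)
    return (time.perf_counter() - start) / iters  # mean seconds per call
```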
-
### Describe the bug
A few questions about installation characteristics:
1. How long does it take to install the program?
2. How much free space is required for installation?
3. On which disks is it installed?
4. In…
-
```
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
```
Any plans to get Metal support for us M2 users without CUDA? Thanks!
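As a workaround until then, PyTorch does expose `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, so code can fall back to the Metal (MPS) backend instead of hard-requiring CUDA. A minimal sketch of the fallback order, with the availability flags passed in as plain booleans:

```python
# Hedged sketch of device selection so CUDA isn't hard-required.
# cuda_available / mps_available would come from torch.cuda.is_available()
# and torch.backends.mps.is_available() respectively.
def pick_device(cuda_available, mps_available):
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"  # Metal Performance Shaders backend on Apple silicon
    return "cpu"

# On an M2 Mac without CUDA: pick_device(False, True) -> "mps"
```

In real code this would be used as `model.to(pick_device(torch.cuda.is_available(), torch.backends.mps.is_available()))`, assuming the model's ops are all supported on MPS.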