streaming-tokenizer Search Results

1000+ results for streaming-tokenizer


irthomasthomas/undecidability #679

Understanding the kernel in Semantic Kernel | Microsoft Lear…

- [ ] [Understanding the kernel in Semantic Kernel | Microsoft Learn](https://learn.microsoft.com/en-us/semantic-kernel/agents/kernel/?tabs=python) # Understanding the kernel in Semantic Kernel | Mi…

irthomasthomas updated 6 months ago
1
OpenBMB/MiniCPM-V #437

[BUG] <title>Using LLM Engine to infer the MiniCPM-V-2_6 mod…

### Is there an existing issue / discussion for this? - [X] I have searched the existing issues / discussions ### Is this problem answered in the FAQ? | Is there an existing…

orderer0001 updated 1 month ago
3
ufal/whisper_streaming #109

Unable to get sentences, only segments

I use whisper-timestamped backend because I could not get the faster-whisper working. I also use whisper_streaming in a Python code and simulate the online mode by processing 1-second long pieces of a…

Denis-Kazakov updated 1 month ago
5
golang/go #40128

proposal: encoding/json: garbage-free reading of tokens

As @bradfitz noted in the reviews of the original API, the `Decoder.ReadToken` API [is a garbage factory](https://go-review.googlesource.com/c/go/+/11651/2/src/encoding/json/stream.go#279). Although, …

rogpeppe updated 11 months ago
18
javirandor/anthropic-tokenizer #5

Project Broken

>The idea is simple. Ask Claude to repeat some text and observe how the generation is streamed through the network. It turns out that Anthropic serves one token at a time! ``` python anthropic_tok…

honey-tree updated 1 month ago
1
joblib/joblib #227

Support caching of generators

Some generators are expensive to compute but can be re-used, perhaps functions which return generators can be cached (say, every time the generator's `__next__` is called or until `StopIteration`) so …

simonzack updated 9 years ago
9
exo-explore/exo #133

Running llama-3.1-70b on multiple Macs with exo+mlx, found an error during quantization [BUG]

Running llama-3.1-70b on multiple Macs with exo+mlx, found an error during quantization. Error location: quantized.py. Code: def call(self, x): s = x.shape x = x.flatten() out = mx.dequantize( self["weight"][x], scales=self["scales"][x], biases=self["…

wjwc updated 1 month ago
3
triton-inference-server/tensorrtllm_backend #88

Segmentation fault in tritonserver streaming inference with …

**Description** I deployed a triton backend of Baichuan TensorRT engine successfully, but got segmentation fault error during streaming inference **Triton Information** I start the triton contain…

yingjie1011 updated 8 months ago
10
tloen/alpaca-lora #75

Inference Hangs

Hello, Thank you for sharing your work. I'm interested in evaluating alpaca-lora on QA tasks. I started with BoolQ dataset. I followed the `generate.py` script and constructed a prompt that work…

HaniItani updated 1 year ago
12
mistralai/mistral-inference #207

[BUG: config.json in mamba-codestral-7B-v0.1 is error

### Python -VV ```shell Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] ``` ### Pip Freeze ```shell accelerate==0.33.0 addict==2.4.0 annotated-types==0.7.0 apex @ file:///data2/apex …

Fly-Pluche updated 1 month ago
2
