-
## 🐛 Bug
Hello, I'm running into an issue where my batch size begins to vary halfway through an epoch.
### To Reproduce
I logged when it deviated from 64. It happens in all epochs, and when trai…
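Not the reporter's code, but a minimal pure-Python illustration of the usual first suspect: with `drop_last=False` (the PyTorch `DataLoader` default), the final batch of an epoch is smaller whenever the dataset size is not divisible by the batch size. Variation in the *middle* of an epoch usually points instead at a custom batch sampler or collate function. The numbers below are hypothetical.

```python
def batch_sizes(num_samples: int, batch_size: int, drop_last: bool = False):
    """Yield the size of each batch, mimicking torch.utils.data.DataLoader."""
    full, rem = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if rem and not drop_last:
        sizes.append(rem)  # the one smaller batch at the end of the epoch
    return sizes

# Hypothetical: 1000 samples with batch_size=64 -> 15 full batches + one of 40.
print(batch_sizes(1000, 64)[-1])                   # 40
print(batch_sizes(1000, 64, drop_last=True)[-1])   # 64
```

If the smaller batches appear mid-epoch rather than only at the end, logging the sampler's output (not just the batch tensors) is the next step.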
-
Using the microsoft/Phi-3-medium-128k-instruct model, I received incorrect responses for multi-byte characters (commonly seen in Japanese or Chinese), as shown below:
```
mlx_lm.generate --model mic…
```
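For context on this class of bug (a generic, hedged illustration, not mlx_lm's actual code path): CJK characters are multi-byte in UTF-8, so decoding a generated byte stream chunk-by-chunk can split a character across chunks and emit replacement characters, whereas an incremental decoder buffers the incomplete tail.

```python
import codecs

# "日本語" is 9 bytes in UTF-8 (3 bytes per character). Splitting the stream
# into arbitrary 4-byte chunks cuts characters in half.
data = "日本語".encode("utf-8")
chunks = [data[i:i + 4] for i in range(0, len(data), 4)]

# Naive per-chunk decode mangles every split character:
naive = "".join(c.decode("utf-8", errors="replace") for c in chunks)

# An incremental decoder carries incomplete sequences over to the next chunk:
dec = codecs.getincrementaldecoder("utf-8")()
correct = "".join(dec.decode(c) for c in chunks) + dec.decode(b"", final=True)

print(naive)    # contains U+FFFD replacement characters
print(correct)  # 日本語
```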
-
**LocalAI version:** 2.16.0
**Environment, CPU architecture, OS, and Version:**
mac studio M2 Ultra
**Describe the bug**
Using the transformers backend for glm4, `trust_remote_code: true` is not c…
-
Hi,
So I was training a new tokenizer from the Llama tokenizer (meta-llama/Llama-2-7b-hf) on a medium-sized corpus (Fineweb-10BT sample: 15 million documents with an average length of 2,300 characters). A…
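For readers following along, the commonly documented Hugging Face recipe for this task is `train_new_from_iterator()`. The sketch below shows only the memory-friendly batching iterator; the corpus is a hypothetical stand-in, and the actual training call is left as a comment because it requires downloading the base tokenizer.

```python
def batch_iterator(corpus, batch_size=1000):
    """Yield lists of raw texts so the trainer never holds the full corpus."""
    for i in range(0, len(corpus), batch_size):
        yield [doc["text"] for doc in corpus[i:i + batch_size]]

# Hypothetical stand-in for a Fineweb-style dataset of dicts with a "text" key.
corpus = [{"text": f"document {i}"} for i in range(2500)]
batches = list(batch_iterator(corpus))
print(len(batches), len(batches[0]))  # 3 1000

# With transformers installed, the iterator plugs into (not run here):
# from transformers import AutoTokenizer
# old = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
# new = old.train_new_from_iterator(batch_iterator(corpus), vocab_size=32000)
```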
-
I am relatively new to running inference on my own. Previously, I used ollama, but recently I decided to try out mlx since I have an M3 with sufficient unified memory and I was curious about how it co…
-
### The Feature
Hi!
The tokenizer you are using for claude-3 is not accurate; the correct numbers are reported in the stream chunks (the first chunk for the prompt token count and the last chunk for the response token count). Proposal…
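A hedged sketch of what the proposal amounts to: accumulate the usage fields the provider reports in-stream instead of re-tokenizing locally. The chunk dictionaries below are hypothetical stand-ins, loosely shaped like Anthropic's `message_start`/`message_delta` streaming events.

```python
# Hypothetical stream: usage arrives in the first and last events.
chunks = [
    {"type": "message_start", "usage": {"input_tokens": 17}},
    {"type": "content_block_delta", "text": "Hello"},
    {"type": "message_delta", "usage": {"output_tokens": 5}},
]

def stream_usage(chunks):
    """Merge every usage field the provider reports during the stream."""
    usage = {}
    for chunk in chunks:
        usage.update(chunk.get("usage", {}))
    return usage

print(stream_usage(chunks))  # {'input_tokens': 17, 'output_tokens': 5}
```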
-
I am attempting to build a chatbot using TrtLlmAPI as the LLM:
```
llm = TrtLlmAPI(
    model_path=trt_engine_path,
    engine_name=trt_engine_name,
    tokenizer_dir=tokenizer_dir_path,
    …
```
-
I want to achieve streaming output from an AutoGPTQ model.
So far I have only managed non-streaming output, like this:
input_ids = tokenizer.encode(inputs,
                             return_tensors="pt",
                             …
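One common way to get token-by-token output (a sketch of the general pattern, not AutoGPTQ-specific code): run generation in a background thread and consume tokens from a queue as they are produced. transformers ships this pattern as `TextIteratorStreamer`; the toy `fake_generate` below stands in for `model.generate(..., streamer=...)` so the example runs without a model.

```python
import queue
import threading

def fake_generate(streamer):
    """Stand-in for the generation thread pushing decoded tokens."""
    for token in ["Hello", " ", "world"]:
        streamer.put(token)
    streamer.put(None)  # sentinel: generation finished

q = queue.Queue()
threading.Thread(target=fake_generate, args=(q,)).start()

pieces = []
while (tok := q.get()) is not None:  # consumer sees tokens as they arrive
    pieces.append(tok)
print("".join(pieces))  # Hello world
```

With a real model, the same consumer loop would iterate over a `TextIteratorStreamer` while `model.generate` runs in the thread.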
-
When `genai-perf` is installed using `pip` from GitHub (as documented), on first run it tries to download several files from Hugging Face, like this:
```
$ docker run --rm -it --name test -u 0 gpu-tr…
```
-
### System Info
Python 3.11.8
### Running Xinference with Docker?
- [ ] docker
- [X] pip install
- [ ] installation from source