-
**Describe the bug**
Following #2547, I tried to run the model gpt-neoxt-chat-base-20b, which I believe is a GPT-NeoX-20B derivative, so I think it should work.
Inference works if the model is loaded the n…
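A minimal sanity check, assuming the checkpoint is `togethercomputer/GPT-NeoXT-Chat-Base-20B` on Hugging Face (the report only gives the short name), is to inspect its config with transformers:
```python
# Hedged sanity check (not from the original report): confirm the checkpoint
# is a GPT-NeoX derivative before converting or serving it.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("togethercomputer/GPT-NeoXT-Chat-Base-20B")
print(config.model_type)     # "gpt_neox" for NeoX-family models
print(config.architectures)  # e.g. ["GPTNeoXForCausalLM"]
```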
-
I have 4 GPUs and 3 models, called small, medium, and large. I want to deploy the small model on GPU 0, the medium model on GPU 1, and the large model on GPUs 2 and 3 with tensor_para_size=2, because the large model is…
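A hypothetical launcher sketch of that placement (the entry point, flags, and model names below are illustrative, not from the issue): pin each serving process to its GPUs via `CUDA_VISIBLE_DEVICES`.
```python
import os
import subprocess

# Illustrative placement plan matching the description above.
deployments = {
    "small":  ("0", 1),    # small model  -> GPU 0
    "medium": ("1", 1),    # medium model -> GPU 1
    "large":  ("2,3", 2),  # large model  -> GPUs 2 and 3, tensor_para_size=2
}

for name, (gpus, tp_size) in deployments.items():
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpus)
    # "serve_model.py" is a placeholder for whatever script starts one model.
    subprocess.Popen(
        ["python", "serve_model.py",
         "--model", name,
         "--tensor-para-size", str(tp_size)],
        env=env,
    )
```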
-
Steps to reproduce:
1. Download https://huggingface.co/EleutherAI/gpt-neox-20b
2. Convert the model and attempt to use it:
```
$ TMPDIR=/var/tmp ./convert-gptneox-hf-to-gguf.py gpt-neox-20b 1 --ou…
```
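For step 1, one option (an assumption; the report does not say how the download was performed) is to fetch the repository with `huggingface_hub`:
```python
# Download the full model repository into a local directory so the converter
# can read it from "gpt-neox-20b".
from huggingface_hub import snapshot_download

snapshot_download("EleutherAI/gpt-neox-20b", local_dir="gpt-neox-20b")
```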
-
**LocalAI version:**
#895
**Environment, CPU architecture, OS, and Version:**
sh-5.2$ uname -a
MSYS_NT-10.0-19045 DESKTOP-S7HQITA 3.4.7-ea781829.x86_64 2023-07-05 12:05 UTC x86_64 Msys
…
-
### System Info
- CPU architecture: x86_64
- CPU/Host memory size: 126G
- GPU properties
  - GPU name: L4
  - GPU memory size: 24GB
- Libraries
  - TensorRT-LLM branch or tag (e.g., main, v0.…
-
### What happened?
When attempting to quantize [Qwen2 7B Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) to IQ2_XS, I get the following assert:
```
GGML_ASSERT: ggml-quants.c:12083: gri…
```
-
**LocalAI version:**
1.22.0
**Environment, CPU architecture, OS, and Version:**
WSL Ubuntu via VSCode
Intel x86 i5-10400
Nvidia GTX 1070
Windows 10 21H1
uname -a output:
Linux DESKTO…
-
I converted Astrid-1B-CPU (https://huggingface.co/PAIXAI/Astrid-1B-CPU) to GGUF and quantized it. Then I tried to run it with `main -m 1B/ggml-model-q4_1.gguf -n 128` and got this error:
error loa…
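One hedged debugging idea, assuming the `gguf` Python package that ships with llama.cpp: dump the metadata keys the converter wrote, since a load error often means the file's metadata is not what `main` expects.
```python
# Inspect the GGUF metadata (e.g. general.architecture) of the quantized file.
from gguf import GGUFReader

reader = GGUFReader("1B/ggml-model-q4_1.gguf")
for name in reader.fields:
    print(name)  # e.g. general.architecture, <arch>.context_length, ...
```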
-
Hi, thanks for the great work!
I want to use your code to build a `PipelineModule` object from Llama 2. Here is my code:
```python
def load_model(neox_args):
    config = transformers.AutoConfig.…
```
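For reference, a generic sketch of the DeepSpeed pipeline pattern (the layer classes, sizes, and stage count below are placeholders, not the gpt-neox or Llama 2 internals); `PipelineModule` takes a flat list of `LayerSpec`s and must run under a distributed launcher such as `deepspeed`:
```python
import torch.nn as nn
from deepspeed.pipe import PipelineModule, LayerSpec

class EmbeddingStage(nn.Module):
    """Placeholder first pipeline stage: token embedding."""
    def __init__(self, vocab_size, hidden):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)

    def forward(self, tokens):
        return self.embed(tokens)

class BlockStage(nn.Module):
    """Placeholder stand-in for one decoder block."""
    def __init__(self, hidden):
        super().__init__()
        self.proj = nn.Linear(hidden, hidden)

    def forward(self, hidden_states):
        return self.proj(hidden_states)

def build_pipeline(vocab_size=32000, hidden=4096, n_layers=32, num_stages=4):
    # LayerSpec defers construction so each stage is built on its own rank.
    specs = [LayerSpec(EmbeddingStage, vocab_size, hidden)]
    specs += [LayerSpec(BlockStage, hidden) for _ in range(n_layers)]
    return PipelineModule(layers=specs, num_stages=num_stages)
```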
-
### System Info
- `transformers` version: 4.40.1
- Platform: Linux-4.18.0-513.24.1.el8_9.x86_64-x86_64-with-glibc2.28
- Python version: 3.10.13
- Huggingface_hub version: 0.22.2
- Safetensors ver…