-
Relevant links:
- inference docs: https://deepspeed.readthedocs.io/en/latest/inference-init.html
- Getting started tutorial: https://www.deepspeed.ai/tutorials/inference-tutorial/
- init_distribute…
-
# Prerequisites
I have searched and tried for a week now.
# Expected Behavior
I am deploying five LLMs on an A100 40GB GPU, each running a 6GB model (which is the Llama-3 8B Instruct mode…
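A quick back-of-the-envelope VRAM check for this setup. The per-model overhead figure below is an illustrative assumption (KV cache plus CUDA context vary by configuration), not a measured value:

```python
# Rough VRAM budget: five ~6 GB models on a 40 GB A100.
# overhead_gb is an assumed per-instance allowance for KV cache
# and CUDA context, not a measured number.
n_models = 5
weights_gb = 6.0
overhead_gb = 1.0
total_gb = n_models * (weights_gb + overhead_gb)
print(f"estimated total: {total_gb:.0f} GB of 40 GB")  # 35 GB of 40 GB
```

Even with a modest per-instance overhead the five models fit on paper, so an out-of-memory failure here usually points at framework-level duplication (e.g. each process initializing its own full context) rather than the raw weight sizes.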
-
### What happened?
I am using Llama.cpp + SYCL to perform inference with Qwen2 MoE. The prediction output seems normal, but the following lines in the debug log indicate that the model is not offloa…
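One way to confirm what the loader actually did is to parse the load-time log. The log line format below is assumed from typical llama.cpp output and may differ between versions; the sample string is illustrative, not taken from this report:

```python
import re

# Hedged sketch: check how many layers llama.cpp reports as offloaded.
# The "llm_load_tensors: offloaded X/Y layers to GPU" format is an
# assumption about typical llama.cpp log output.
log_line = "llm_load_tensors: offloaded 0/25 layers to GPU"

m = re.search(r"offloaded (\d+)/(\d+) layers to GPU", log_line)
if m:
    offloaded, total = map(int, m.groups())
    print(f"{offloaded}/{total} layers on GPU")
    if offloaded == 0:
        # Entirely on CPU: the GPU-layer count was likely never set.
        print("check the -ngl / --n-gpu-layers flag")
```

If the parsed count is zero, the usual first suspect is that `-ngl` was left at its default rather than a SYCL-specific failure.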
-
### Issue Type
Others
### OS
Linux
### onnx2tf version number
1.19.11
### onnx version number
1.15.0
### onnxruntime version number
1.16.3
### onnxsim (onnx_simplifier) version number
0.4.3…
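For reproducing version reports like the one above, a small helper can read installed versions directly instead of checking each package by hand. The package names listed are taken from the report; whether each is importable in a given environment is not assumed:

```python
from importlib.metadata import version, PackageNotFoundError

def pkg_version(name: str) -> str:
    """Return the installed version of a package, or a placeholder."""
    try:
        return version(name)
    except PackageNotFoundError:
        return "not installed"

# Package names from the issue template above.
for pkg in ("onnx2tf", "onnx", "onnxruntime", "onnx-simplifier"):
    print(pkg, pkg_version(pkg))
```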
-
Hi, I just read about DOC and have some questions. I have been developing in Python for a few years but have never really worked with AI yet. But this project fascinates me because it may be the help/assista…
-
### What happened?
Unable to get Ollama to use the GPU for processing. I am following the guide [provided](https://github.com/otwld/ollama-helm).
Pod log attached below; the GPU is not detected but runni…
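For reference, a minimal GPU section for the chart's values file might look like the sketch below. The exact keys are an assumption based on my reading of the ollama-helm chart and should be verified against the chart's own `values.yaml`; the node must also have the NVIDIA device plugin installed for the resource request to be satisfiable:

```yaml
# Hedged sketch of ollama-helm GPU values -- verify key names
# against the chart's values.yaml before use.
ollama:
  gpu:
    enabled: true
    type: nvidia
    number: 1
```

If the pod still falls back to CPU with a configuration like this, checking that the pod actually received an `nvidia.com/gpu` resource allocation is a useful next step.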
-
**Steps to reproduce:**
1. Run a Docker container using `ollama/ollama:rocm` on a machine with a single MI300X
2. Inside the container, run `ollama run llama3.1:70B`
**Actual behaviour:**
```
…
-
_placeholder for brainstorm_. Finished all master's courses (part-time side job).
Have spent a month exploring what a good master's thesis direction around LLMs would be.
Draft master thesis (again placeholder): *…
-
**Describe the bug**
Unable to use/test fp6 quantization in DeepSpeed 0.14 in inference mode on a GPT2 model. There is little documentation on usage right now, so I am not sure if I have the wrong init metho…
-
### What is the issue?
My machine has two CPUs and no GPU, and when I run the model, I find that the CPUs are at most 50% utilized
![PixPin_2024-07-31_16-11-31](https://github.com/user-attachments/ass…
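A ceiling of roughly 50% on a dual-socket machine is often consistent with the runtime sizing its thread pool for a single socket (or being NUMA-pinned to one). The core counts below are assumptions for the arithmetic, not this machine's actual topology:

```python
# Hedged illustration: if the inference runtime spawns one thread per
# physical core on a single socket, a dual-socket box tops out near 50%.
# sockets and cores_per_socket are example values, not measured ones.
sockets = 2
cores_per_socket = 32
threads_used = cores_per_socket  # pool sized for one socket
utilization = threads_used / (sockets * cores_per_socket)
print(f"max aggregate utilization: {utilization:.0%}")  # 50%
```

If this matches the observed behavior, raising the runtime's thread-count setting (or removing any NUMA/CPU-affinity restriction) is the natural experiment to try.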