-
I am using the Hugging Face Seq2SeqTrainer to train a Flan-T5-XL model with DeepSpeed stage 3.
```
trainer = Seq2SeqTrainer(
    # model_init = self.model_init,
    model=se…
```
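The snippet above is cut off, so for context here is a minimal sketch of the same kind of setup. The `google/flan-t5-xl` checkpoint, the toy dataset, and the `ds_zero3.json` config path are assumptions for illustration, not details from the original report:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

name = "google/flan-t5-xl"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

def encode(src, tgt):
    ex = tokenizer(src, truncation=True)
    ex["labels"] = tokenizer(text_target=tgt, truncation=True)["input_ids"]
    return ex

# Tiny stand-in dataset; any list/Dataset of tokenized dicts works here.
train_dataset = [encode("translate English to German: Hello", "Hallo")]

args = Seq2SeqTrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    deepspeed="ds_zero3.json",  # ZeRO stage 3 config file (assumed path)
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

With DeepSpeed, the script has to be started through the `deepspeed` launcher (e.g. `deepspeed train.py`) rather than plain `python`, so that the distributed environment is set up.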
-
### 🐛 Describe the bug
Using MPS for BERT inference appears to produce about a 2x slowdown compared to the CPU. Here is code to reproduce the issue:
```python
# MPS Version
from transformers i…
```
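Since the reproduction code above is truncated, here is a hedged reconstruction of the comparison it describes; the `bert-base-uncased` checkpoint, batch size, and iteration count are assumptions:

```python
import time

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()
inputs = tokenizer(["hello world"] * 8, return_tensors="pt", padding=True)

for device in ("cpu", "mps"):
    m = model.to(device)
    batch = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        m(**batch)  # warm-up run (triggers kernel compilation on MPS)
        start = time.perf_counter()
        for _ in range(20):
            m(**batch)
        if device == "mps":
            torch.mps.synchronize()  # flush queued GPU work before timing
        print(device, f"{time.perf_counter() - start:.3f}s")
```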
-
### 🐛 Describe the bug
Currently, when using FSDP, the model is loaded completely on CPU for each of the N processes, leading to huge CPU RAM usage. When training models like Falcon-40B with FSDP on…
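For reference, a sketch of one common mitigation pattern: materialize the full checkpoint only on rank 0, build a meta-device skeleton on the other ranks, and let FSDP's `sync_module_states` broadcast the weights. The checkpoint name and single-node setup are placeholders, and keyword availability (e.g. `recurse` on `to_empty`) depends on the PyTorch version:

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoConfig, AutoModelForCausalLM

dist.init_process_group("nccl")  # assumes launch via torchrun
rank = dist.get_rank()
torch.cuda.set_device(rank)      # single-node assumption

name = "tiiuae/falcon-40b"  # placeholder checkpoint
if rank == 0:
    # Only rank 0 pays the full CPU-RAM cost of loading the weights.
    model = AutoModelForCausalLM.from_pretrained(name)
else:
    # Other ranks build an empty skeleton on the meta device.
    config = AutoConfig.from_pretrained(name)
    with torch.device("meta"):
        model = AutoModelForCausalLM.from_config(config)

model = FSDP(
    model,
    device_id=rank,
    sync_module_states=True,  # rank 0 broadcasts the real weights
    param_init_fn=(
        None if rank == 0
        else lambda m: m.to_empty(device=torch.cuda.current_device(), recurse=False)
    ),
)
```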
-
### What happened?
Hi,
When I use llama.cpp to deploy a pruned llama3.1-8b model, an unbearable performance degradation appears:
We are using a structured pruning method (LLM-Pruner) to prune llama3.1-8b, w…
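Since the report is cut off, here is a rough sketch of how the degradation could be quantified with the llama-cpp-python bindings; the GGUF file names are placeholders, not paths from the original issue:

```python
import time

from llama_cpp import Llama

# Compare the original and pruned models under identical settings.
for path in ("llama3.1-8b.gguf", "llama3.1-8b-pruned.gguf"):  # assumed paths
    llm = Llama(model_path=path, n_gpu_layers=-1, verbose=False)
    start = time.perf_counter()
    out = llm("The quick brown fox", max_tokens=128)
    elapsed = time.perf_counter() - start
    n_tokens = out["usage"]["completion_tokens"]
    print(path, f"{n_tokens / elapsed:.1f} tokens/s")
```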
-
ollama-for-amd [v0.3.4]
OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:
time=2024-08-09T00:25:59.140+08:00 level=INFO source=images.go:782 msg="total blobs: 5"
time=2024-08-09T00:25:59.141+08:00 level=INFO…
-
I'm currently trying out the ollama app on my iMac (i7/Vega64) and I can't seem to get it to use my GPU.
I have tried running it with `num_gpu 1`, but that generated the warnings below.
`
2023/11/…
-
The following models are taking longer when running on NNPA now compared to the 0.4.1 release.
* gpt2-10.onnx
  * about 30% worse
  * 0.4.1 - Total runMainGraph() time over all 100 infere…
-
I wonder, will you support pipeline parallelism in the future? If the answer is yes, maybe the whole system needs to be redesigned?
-
I set the environment variables as follows in train_dist.sh in the gpt_hf folder:
```
export NUM_NODES=1
export NUM_GPUS_PER_NODE=8
export MASTER_ADDR=localhost
export MASTER_PORT=2222
export NODE_RA…
```
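For reference, a hedged sketch of how a training script might consume these variables when initializing `torch.distributed`; the exact plumbing in the gpt_hf code may differ:

```python
import os

import torch.distributed as dist

world_size = int(os.environ["NUM_NODES"]) * int(os.environ["NUM_GPUS_PER_NODE"])
# MASTER_ADDR and MASTER_PORT are picked up by the env:// rendezvous.
dist.init_process_group(
    backend="nccl",
    init_method="env://",
    world_size=world_size,
    rank=int(os.environ.get("RANK", "0")),
)
```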
-
## 🐛 Bug
StableHLO performance currently seems to be 2 orders of magnitude worse than the normal XLA flow.
## To Reproduce
Please try the following script:
```python
import os
import timei…
```
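The reproduction script above is truncated, so here is a hedged sketch of the kind of timing comparison it likely performs; the `XLA_STABLEHLO_COMPILE` toggle and the matmul workload are assumptions about how the StableHLO path is selected, not details from the original script:

```python
import os
import time

# Assumed switch for routing compilation through StableHLO.
os.environ["XLA_STABLEHLO_COMPILE"] = "1"

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
x = torch.randn(1024, 1024, device=device)

y = x @ x
xm.mark_step()        # compile once outside the timed region
xm.wait_device_ops()

start = time.perf_counter()
for _ in range(100):
    y = x @ x
    xm.mark_step()    # cut the graph so each step actually executes
xm.wait_device_ops()  # block until all device work finishes
print(f"{time.perf_counter() - start:.3f}s for 100 steps")
```

Running the same script with the StableHLO switch commented out should give the baseline XLA numbers for comparison.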