-
### Branch/Tag/Commit
main
### Docker Image Version
nvcr.io/nvidia/pytorch:22.07-py3
### GPU name
A100
### CUDA Driver
450.156.00
### Reproduced Steps
```shell
1. download …
```
-
https://github.com/triton-inference-server/
- [x] Build Triton Docker image with support for FasterTransformer backend for Fusion etc.
- [x] Convert h2oGPT models to a format that Triton understands h… (a rough conversion sketch follows below)
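For the conversion step, here is a minimal sketch of the general idea: each weight tensor is dumped to a raw binary file that the backend can load. This is not the official FasterTransformer converter (the repo ships its own conversion scripts, which also handle tensor-parallel splitting and FT's naming scheme); the model name and output directory below are placeholders.

```python
# Rough sketch of what an HF -> FasterTransformer weight conversion does:
# every parameter is dumped as a raw fp16 binary file.
# This is NOT the official converter; model name and output dir are placeholders.
import os
import torch
from transformers import AutoModelForCausalLM

model_name = "h2oai/h2ogpt-oig-oasst1-512-6_9b"  # placeholder; substitute the actual h2oGPT checkpoint
out_dir = "ft_weights/1-gpu"                      # placeholder output directory
os.makedirs(out_dir, exist_ok=True)

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

for name, param in model.state_dict().items():
    # One raw fp16 file per tensor; the real converter additionally splits
    # tensors across tensor-parallel ranks and renames them to FT's scheme.
    param.to(torch.float16).cpu().numpy().tofile(os.path.join(out_dir, name + ".bin"))
```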
-
I found that the benchmark/suite output includes time to first token. However, when I run `python benchmark.py --model meta-llama/Llama-2-7b-hf static --isl 128 --osl 128 --batch 1`, an error occurs:…
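For context, time to first token is usually measured by timing generation and recording when the first new token is produced. A minimal, generic sketch follows; it is not this repo's `benchmark.py`, and the model name and prompt are placeholders.

```python
# Generic sketch of measuring time-to-first-token (TTFT).
# Not taken from benchmark.py; model name and prompt are placeholders.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

inputs = tok("Hello, world", return_tensors="pt").to(model.device)

start = time.perf_counter()
with torch.no_grad():
    # Generating a single new token approximates the time to first token.
    model.generate(**inputs, max_new_tokens=1)
ttft = time.perf_counter() - start
print(f"time to first token: {ttft * 1e3:.1f} ms")
```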
-
@byshiue
### Branch/Tag/Commit
main
### Docker Image Version
nvcr.io/nvidia/pytorch:21.11-py3
### GPU name
TITAN
### model
https://huggingface.co/TabbyML/NeoX-1.3B
### Repr…
-
Hello,
It seems that currently int8 weight-only and SmoothQuant quantization are supported for GPT models, but no kind of quantization is supported for other autoregressive transformer models, suc…
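For reference, "int8 weight-only" usually means weights are stored as int8 with a per-output-channel scale and dequantized on the fly, while activations stay in fp16. A minimal NumPy illustration of the quantize/dequantize round trip (an illustration of the idea only, not FasterTransformer's kernels or packing layout):

```python
# Minimal illustration of int8 weight-only quantization with per-output-channel
# absmax scales. Sketch of the idea only; not the library's actual implementation.
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024)).astype(np.float32)  # [out_features, in_features]

# Per-output-channel scale so that the largest weight maps to 127.
scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# At inference the int8 weights are dequantized (or the GEMM consumes the
# scales directly); activations remain in fp16.
w_dequant = w_int8.astype(np.float32) * scale
print("max abs error:", np.abs(w - w_dequant).max())
```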
-
### Branch/Tag/Commit
main
### Docker Image Version
not-specific-to-docker-image
### GPU name
all GPUs
### CUDA Driver
n/a
### Reproduced Steps
```shell
Merely running the example at https://…
```
-
Hello,
I'm running the following code snippet in `opt.py`.
```python
import mii
mii_configs = {"tensor_parallel": 8, "dtype": "fp16", "load_with_sys_mem": True}
mii.deploy(task="text-generation", …
```
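For context, with the legacy DeepSpeed-MII API a deployment created this way is typically queried through a handle. A minimal sketch, assuming a deployment name of "opt-deployment" (the real name in the truncated `mii.deploy(...)` call above is not shown):

```python
# Sketch of querying a legacy DeepSpeed-MII deployment; "opt-deployment" is an
# assumed deployment_name, since the truncated deploy call above does not show it.
import mii

generator = mii.mii_query_handle("opt-deployment")
result = generator.query({"query": ["DeepSpeed is"]}, max_new_tokens=64)
print(result)
```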
-
@bojone
After switching the model to chatglm2, there are no errors, but the output quality is extremely poor. I hope you can help resolve this!
Below is the generated run output:
Loading checkpoint shards: 100%|██████████████████| 7/7 [00:08
-
import os
import pickle
from typing import List
from dataclasses import field, dataclass
from utils import set_default_to_empty_string
FOLDER_ROOT = (
    os.path.abspath(os.path.dirname(os.pa…
-
bash training/finetune_RedPajama-INCITE-Chat-3B-v1.sh
My configuration changes are as follows:
--lr 1e-5 --seq-length 2048 --batch-size 8 --micro-batch-size 1 --gradient-accumulate-step 1 \
--num-layers…
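For reference, a quick check of how these flags usually relate, under the common convention that global batch = micro-batch × gradient-accumulation steps × data-parallel degree (this script's exact flag semantics may differ, so treat the derived value as an assumption):

```python
# Quick arithmetic check of the usual batch-size relationship; the exact
# meaning of --batch-size in this training script may differ, so the
# data-parallel degree derived below is an assumption.
micro_batch_size = 1      # --micro-batch-size
grad_accum_steps = 1      # --gradient-accumulate-step
target_batch_size = 8     # --batch-size

# Under the common convention, reaching the target batch size would require
# this many data-parallel replicas:
data_parallel_degree = target_batch_size // (micro_batch_size * grad_accum_steps)
print(data_parallel_degree)  # 8
```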