-
Hey guys, I am using `GPTQ` to quantize the `GPT-NeoX-20B` model. Previously, when quantizing Llama-family models, I usually used `C4` as the calibration dataset. May I ask which dataset is suitable …
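Since `GPT-NeoX-20B` was trained on The Pile, a slice of The Pile (or `C4`, as with Llama) is a common calibration choice. Whichever corpus is used, GPTQ only needs a few hundred fixed-length token windows. A minimal sketch of sampling such windows, assuming the corpus has already been tokenized into one long list of token IDs (the function name here is hypothetical, not part of any GPTQ library):

```python
import random

def make_calibration_samples(token_ids, n_samples=128, seq_len=2048, seed=0):
    """Sample fixed-length windows of token IDs for GPTQ calibration.

    `token_ids` is one long list of token IDs (e.g. a tokenized slice of
    The Pile or C4); GPTQ typically needs only a few hundred windows.
    """
    rng = random.Random(seed)
    max_start = len(token_ids) - seq_len
    if max_start <= 0:
        raise ValueError("corpus must be longer than seq_len")
    starts = (rng.randrange(max_start) for _ in range(n_samples))
    return [token_ids[s : s + seq_len] for s in starts]

# Toy corpus stands in for real tokenized text.
samples = make_calibration_samples(list(range(10_000)), n_samples=4, seq_len=256)
```

The resulting windows would then be fed to the quantizer as its calibration examples in place of the `C4` default.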
-
**Describe the bug**
Running inference with Deepspeed using GPT-NeoX 20B model produces garbage output, indicating an implementation bug.
**To Reproduce**
For example, this can be seen when using exam…
-
First of all, thank you for the great work.
## System info
autoawq==0.1.8
## Details
While trying to quantize a GPT-NeoX model, I encountered the error below.
```
>>> from awq import AutoAWQForCa…
-
Hey guys
Today I was doing quants of a [new GPTNeoX model called Literature-7B-16384](https://huggingface.co/hakurei/Literature-7B-16384)
I tried making GGMLs through the usual process:
```
py…
-
Hello,
I am interested in volunteering to convert the models from GPT-NeoX format to Hugging Face format.
-
System Info
GPU: NVIDIA RTX 4090
TensorRT-LLM 0.13
Question 1: How can I use the OpenAPI to perform inference on a TensorRT engine model?
root@docker-desktop:/llm/tensorrt-llm-0.13.0/examples/apps# pyt…
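Assuming "OpenAPI" here refers to the OpenAI-compatible server shipped under `examples/apps`, then once that server is running, any OpenAI-style HTTP client can query the TensorRT engine. A standard-library-only sketch of building such a request (the base URL and model name are assumptions — match them to however the server was actually launched):

```python
import json
from urllib import request

# Assumed endpoint of a locally running OpenAI-compatible server.
BASE_URL = "http://localhost:8000/v1/completions"

# Assumed serving name for the engine; adjust to your deployment.
payload = {
    "model": "gpt-neox-20b",
    "prompt": "Hello, my name is",
    "max_tokens": 32,
    "temperature": 0.8,
}

req = request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# resp = request.urlopen(req)  # uncomment once the server is up
# print(json.load(resp))
```

The same payload shape works with any OpenAI-compatible client library by pointing its base URL at the local server.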
-
My server cannot connect to the Hugging Face website, so I manually downloaded the pretrained model used in the code and placed it in the `img2img-turbo-main` folder. After executing the command `pyth…
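One way to keep the Hugging Face libraries from reaching out to the Hub at all is to set their offline environment variables and pass the local folder to whatever `from_pretrained` call the script makes. A sketch, assuming the script loads weights through a Hugging Face-style `from_pretrained` (the checkpoint path and `SomeModel` name are placeholders, not the project's real API):

```python
import os

# Real environment variables understood by huggingface_hub / transformers:
# with these set, any attempted download fails fast instead of hanging.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# Placeholder: point this at the folder where you placed the weights.
local_checkpoint = "img2img-turbo-main/checkpoints"

# Then make the script's loading call take the local path, e.g.:
# model = SomeModel.from_pretrained(local_checkpoint, local_files_only=True)
```

`local_files_only=True` is a real `from_pretrained` keyword that forces loading from disk even when the environment variables are not set.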
-
### System Info
Optimum Habana: 1.10.4
Synapse: 1.14.0
Dockerfile:
```
FROM vault.habana.ai/gaudi-docker/1.14.0/ubuntu22.04/habanalabs/pytorch-installer-2.1.1:latest
# Installs pdsh and upg…
-
**Describe the bug**
We conducted tests on OPT/GPT-J/GPT-NeoX/BLOOM 7B INT8; all of these models produce garbage outputs on DeepSpeed 0.8.1.
The OPT model has an NCCL communication issue.
GPT-…
-
### Feature request
I encountered a KeyError while loading the phi3-v vision model with Hugging Face Optimum. The error message states:
```
KeyError: 'phi3-v model type is not supported yet in Nor…