-
Hello,
I saw [this](https://github.com/facebookresearch/xformers/tree/main/examples/llama_inference) example code with a Llama model and tried to replicate it with my own TinyLlama model (loaded as PE…
-
I'm trying to follow this README: https://github.com/pytorch/executorch/tree/main/examples/models/llama2 with other Llama 2-based models such as TinyLlama 1.1B: https://huggingface.co/TinyLlama/TinyLlama-1.1B…
-
### System Info
System information:
Container is Debian12 (mambaorg/micromamba)
Host is RHEL9 / ppc64le
```shell
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME…
```
-
I hope to see a TinyLlama example included with a custom conversational dataset, particularly one using ChatML. Alternatively, how do I achieve this with the provided TinyLlama "default" example? My goal is to demons…
-
## Ask your question here:
Hello! I am working on an integration between Kserve/Knative with vLLM for deploying LLMs. vLLM is a production inference server for LLMs, and I have instrumented…
-
### Your current environment
```text
```
### 🐛 Describe the bug
Hi @Isotr0py @mgoin,
I ran the gguf inference example [gguf_inference](https://github.com/vllm-project/vllm/blob/main/examples/…
-
I got local inference working, but when I try to use workers I get this error:
dllama inference --model dllama_model_tinyllama_1_1b_3t_q40.m --tokenizer dllama_tokenizer_tinyllama_1_1b_3t_q40.t --bu…
-
I trained a model using Accelerate + DeepSpeed ZeRO-2 and got a ZeRO-2 checkpoint. The checkpoint structure is listed below, and this is the Google Drive [link](https://drive.google.com/drive/folders/1e…
-
Hello again :) I have an issue with how the relevancy scores are computed for some LLaMA models for sequence classification.
I have a classification task that I am using in the following screensho…
-