-
### Describe the bug
I am trying to use the multimodal model `wojtab_llava-13b-v0-4bit-128g` on Windows using CUDA.
(further developments of this issue are in my comments below)
### Is there an ex…
-
### Describe the bug
While using the Silero TTS extension, I encountered an error when providing long text inputs. The model seems to have a limitation on the length of the input text it can handle.
…
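A common workaround for this kind of input-length limit is to split long text into shorter chunks at sentence boundaries and feed them to the TTS model one at a time. Below is a minimal, hedged sketch of such a splitter; the `max_len` value and the splitting heuristic are illustrative assumptions, not Silero's actual limit or API.

```python
import re

def chunk_text(text, max_len=1000):
    """Split text into chunks of at most max_len characters,
    breaking at sentence boundaries where possible.
    Note: a single sentence longer than max_len is kept whole."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Start a new chunk if appending would exceed the limit.
        if current and len(current) + 1 + len(sentence) > max_len:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip() if current else sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be synthesized separately and the audio concatenated.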
-
Just for clarification: what does "34B Full Tuning with 4 A100" mean in that table? Does it indicate support for PPO, DPO, or both? Have you tested training a 34B LLaMA model with DPO on 8×A100?
-
Hello,
I am testing AirLLM with a model based on LLaMA-2. I successfully created the split model, but when I run inference, I get an error. My code is below:
```
import torch
from airllm import AirLL…
-
This is a cool project; I can run it well on my Meteor Lake system.
BTW, would you kindly provide a RAG (Retrieval-Augmented Generation) example that can refer to external documents using RAG te…
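To make the request concrete, here is a toy sketch of the RAG pattern being asked about: retrieve the external documents most relevant to a query, then prepend them as context to the prompt. The keyword-overlap retriever and all function names are hypothetical simplifications; a real setup would use embeddings and a vector store.

```python
def retrieve(query, documents, top_k=2):
    """Rank documents by word overlap with the query.
    Toy retriever: real systems use embedding similarity instead."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query, documents):
    """Prepend the retrieved context to the question (the core RAG step)."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

The string returned by `build_prompt` would then be sent to the local model for generation.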
-
I am using an AMD Radeon RX 6700M with ROCm 5.6.1.
Traceback (most recent call last):
File "/home/pet/git/h2ogpt/venv/lib/python3.10/site-packages/gradio/queueing.py", line 495, in call_prediction…
-
### Describe the bug
Hello,
I downloaded this model:
https://huggingface.co/TheBloke/llava-v1.5-13B-AWQ
Then I am trying to use the 'multimodal' extension and load it using the AutoAWQ load…
-
### Discussed in https://github.com/orgs/eosphoros-ai/discussions/1026
Originally posted by **manishparanjape** January 4, 2024
Followed these instructions: https://docs.dbgpt.site/docs/insta…
-
I was trying to use the generate API for Llama 2 using the same code from this example:
https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/transformers-neuronx/transformers-neuronx-dev…
-
I am getting a CUDA out-of-memory error when querying against all data in a collection. The collection is large and holds more data than can fit in the context; I'd expect h2ogpt to take enough dat…
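The expected behavior described here, taking only as much retrieved data as fits the context window, amounts to greedy packing under a token budget. Below is a minimal sketch of that idea; the function names, the word-count tokenizer, and the assumption that chunks arrive sorted by relevance are all illustrative, not h2ogpt's actual implementation.

```python
def pack_chunks(chunks, count_tokens, budget):
    """Greedily keep the highest-ranked chunks that fit within the
    context budget, dropping the rest instead of overflowing memory."""
    selected, used = [], 0
    for chunk in chunks:  # assumed pre-sorted by relevance, best first
        cost = count_tokens(chunk)
        if used + cost <= budget:
            selected.append(chunk)
            used += cost
    return selected
```

With such a guard in place, an oversized collection would degrade to a partial-context answer rather than a CUDA OOM crash.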