-
Hi there. I am upgrading my bindings for the Lord of LLMs tool, and I now need to be able to vectorize text into the embedding space of the current model. Is there a way to get access to the latent space of…
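Something along these lines is what I am after; a minimal sketch assuming the llama-cpp-python backend (the model path is a placeholder):
```python
from llama_cpp import Llama

# Load with embedding=True so the model exposes its embedding output.
llm = Llama(model_path="./models/model.gguf", embedding=True)  # placeholder path

# Returns the text's vector in the model's embedding space (a list of floats).
vector = llm.embed("Some text to vectorize")
```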
-
My python version:
```bash
python3 --version
Python 3.11.5
```
I came across this issue when trying to install the packages using both `pip` and `pip3`
After I change to `gree…
-
**Is your feature request related to a problem? Please describe.**
VRAM is a major limitation for running most models locally, and guidance by design requires running models locally to get the most va…
-
I use Yarn-Llama-2-13B-64K-GGUF with n_ctx = 8192 in the ooba webui, but when the context exceeds 4096 tokens it replies with gibberish. How can I fix it?
The response looks like:
"c [tO r {tk { tO {n----------------a …
-
## Expected Behavior
Clicking the link should show the description and details of the model and its model card.
## Current Behavior
All model cards are showing as undefined.
## Steps to Reproduce
Please prov…
-
I got this warning and I'm having trouble understanding what to do. I recently reset my Windows 11 PC (i7-12700, 3060 Ti, 32 GB DDR5 RAM) and reinstalled Python 3.11.7, Visual Studio (inc…
-
Is there a way to keep the model loaded in RAM between successive runs? I have an API-like setup, and every time a prompt comes in the model has to be loaded into RAM again, which takes a while f…
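A minimal sketch of what I mean, assuming llama-cpp-python and FastAPI (names and path are placeholders): load the model once at process start and serve prompts over HTTP, so every request reuses the same in-memory weights.
```python
from fastapi import FastAPI
from llama_cpp import Llama
from pydantic import BaseModel

app = FastAPI()
llm = Llama(model_path="./models/model.gguf")  # loaded once, reused across requests

class GenerateRequest(BaseModel):
    prompt: str

@app.post("/generate")
def generate(req: GenerateRequest):
    # The model stays resident; only the prompt changes per call.
    out = llm(req.prompt, max_tokens=256)
    return {"text": out["choices"][0]["text"]}
```
Run with something like `uvicorn server:app`, and the load cost is paid once per process rather than once per prompt.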
-
Thanks for sharing the model. I have been able to test it on my MacBook Pro (i9, 32 GB of RAM). I notice that the CPU goes to 400% when inferring the answer, while the GPU stays at 0%. Is it possible …
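For reference, a sketch of how GPU offload is usually requested, assuming llama-cpp-python (placeholder path); note that llama.cpp's Metal backend targets Apple Silicon, so an Intel i9 MacBook Pro will typically stay on the CPU regardless:
```python
from llama_cpp import Llama

# Requires a build with Metal enabled (e.g. installed with
# CMAKE_ARGS="-DLLAMA_METAL=on"); n_gpu_layers=-1 offloads all layers.
llm = Llama(model_path="./models/model.gguf", n_gpu_layers=-1)  # placeholder path
```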
-
For `atinoda/text-generation-webui:llama-cpu-nightly`:
```
⠹ text-generation-webui Pulling …
```
-
### Is your feature request related to a problem? Please describe
Support Ctransformers model with chat ability and
### Describe the solution you'd like
Construct a CtransformersChat class that …
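A rough sketch of what such a class might look like, wrapping LangChain's existing `CTransformers` LLM and flattening chat turns into a single prompt (all names here are hypothetical, and the import path depends on the LangChain version):
```python
from langchain_community.llms import CTransformers

class CtransformersChat:
    """Hypothetical sketch: adds a chat interface on top of CTransformers
    by keeping a history and flattening it into one prompt per turn."""

    def __init__(self, model: str, model_type: str = "llama"):
        self.llm = CTransformers(model=model, model_type=model_type)
        self.history = []  # list of (role, content) pairs

    def chat(self, user_message: str) -> str:
        self.history.append(("user", user_message))
        prompt = "\n".join(f"{role}: {text}" for role, text in self.history)
        prompt += "\nassistant:"
        reply = self.llm.invoke(prompt)
        self.history.append(("assistant", reply))
        return reply
```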