-
**Describe**
I found that after fine-tuning with LoRA, token throughput is significantly reduced. I trained a model on unit test generation and then fused the LoRA adapter.
For my test dat…
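For context, fusing a LoRA adapter into the base weights is commonly done with Hugging Face PEFT's `merge_and_unload()`; a minimal sketch, assuming a PEFT-trained adapter (the model ID and adapter path below are hypothetical):
```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model and the trained LoRA adapter (paths are placeholders).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = PeftModel.from_pretrained(base, "./lora-unit-test-adapter")

# merge_and_unload() folds the LoRA deltas into the base weights, so the
# merged model should run at the same speed as the original base model.
merged = model.merge_and_unload()
merged.save_pretrained("./merged-model")
```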
-
I have this GPU: AMD Radeon RX 7900 XTX
ramalama pull "quay.io/ramalama/rocm:latest"
When I try to run a model, it always crashes.
e.g.
$ ramalama run llama3.2
"llama-cli killed by SIGSEGV"
cmdline:
…
-
This is the best open-source vision model I have ever tried. We need support for it in Ollama.
-
Source:
We need to update both [model_prices_and_context_window.json](https://github.com/BerriAI/litellm/blob/2a5624af471284f174e084142504d950ede2567d/model_prices_and_context_window.json) and [mo…
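For reference, entries in model_prices_and_context_window.json follow a flat per-model schema; a minimal sketch of adding one programmatically, where the model name and prices are placeholders (field names follow existing entries in the file):
```python
import json

PATH = "model_prices_and_context_window.json"

with open(PATH) as f:
    prices = json.load(f)

# Hypothetical entry; the model name and numbers here are placeholders.
prices["my-provider/my-new-model"] = {
    "max_tokens": 4096,
    "max_input_tokens": 128000,
    "max_output_tokens": 4096,
    "input_cost_per_token": 1e-06,
    "output_cost_per_token": 2e-06,
    "litellm_provider": "my-provider",
    "mode": "chat",
}

with open(PATH, "w") as f:
    json.dump(prices, f, indent=4)
```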
-
I am trying to do a full fine-tune of Llama3.2-1b to "teach" it another language (via continuous pretraining).
The idea is to have a model which, given a prompt in a language, continues the sentences in…
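Continued pretraining is just the standard causal-LM objective on new text; a minimal sketch with Hugging Face transformers, assuming a plain-text corpus in the target language (the file name and hyperparameters are placeholders):
```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder corpus: any plain-text file in the target language.
ds = load_dataset("text", data_files={"train": "corpus.txt"})["train"]
ds = ds.map(lambda x: tokenizer(x["text"], truncation=True, max_length=1024),
            remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ckpt", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-5),
    train_dataset=ds,
    # mlm=False gives the standard next-token (causal LM) objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```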
-
Hello,
I am using Ollama as the LLM backend, but I am encountering issues when trying to run certain models. Here are the specific problems:
Llama3 and Phi3.5 Models: These models are not suppor…
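As a quick check, Ollama's REST API can list which models are actually pulled and available locally; a minimal sketch assuming the default endpoint on port 11434:
```python
import requests

# /api/tags returns the models currently in the local Ollama store.
resp = requests.get("http://localhost:11434/api/tags")
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])
```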
-
### System Info
GPU name (NVIDIA A6000)
TensorRT-LLM tag (v0.9.0 main)
transformers tag (0.41.0)
### Who can help?
@nc
### Information
- [X] The official example scripts
- [X] My own modified…
-
**LocalAI version:**
2.14.0
**Environment, CPU architecture, OS, and Version:**
Linux Ubuntu SMP PREEMPT_DYNAMIC x86_64 x86_64 x86_64 GNU/Linux
90GB RAM 22 vcores
NVIDIA L4 24GB
**De…
-
Hello,
I successfully downloaded the model to this directory /root/.llama/checkpoints/Llama3.2-1B-Instruct
When I call AutoModelForCausalLM.from_pretrained with the path above, I get the f…
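For what it's worth, from_pretrained expects a directory in Hugging Face format (config.json plus safetensors/bin weights); checkpoints downloaded in Meta's native format (consolidated *.pth plus params.json) need conversion first. A minimal sketch of the call, assuming the directory is already HF-format:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "/root/.llama/checkpoints/Llama3.2-1B-Instruct"

# This only works if `path` contains HF-format files (config.json, weights);
# Meta-native checkpoints must be converted to HF format before loading.
model = AutoModelForCausalLM.from_pretrained(path)
tokenizer = AutoTokenizer.from_pretrained(path)
```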
-
Hi!
I'm using:
- Ubuntu 24.04
- clang version 18.1.8
- python 3.9.19 with pyenv (2.4.1)
- cmake 3.28.3
When I execute the following command after downloading the model from Hugging Face:
```
python setup_env.p…