-
## Describe the bug
Reported by Alan Braz.
Failures are seen when fine-tuning `llama-7b-model` with a certain set of parameters:
```
{
  "modelName": "test-llama2",
  "parameters": {
    …
```
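The parameter block above is truncated, so purely as a hypothetical illustration of a request of this shape (the endpoint URL, the transport, and every field inside `parameters` are assumptions, not taken from the report):

```python
import requests  # assumes the fine-tuning service is reached over HTTP

# Only "modelName" and "parameters" appear in the report; everything
# inside "parameters" here is a hypothetical placeholder.
payload = {
    "modelName": "test-llama2",
    "parameters": {
        "learning_rate": 2e-5,  # placeholder
        "num_epochs": 3,        # placeholder
        "batch_size": 8,        # placeholder
    },
}

# Hypothetical endpoint; substitute the real fine-tuning API route.
resp = requests.post("http://localhost:8080/v1/fine-tunes", json=payload)
print(resp.status_code, resp.text)
```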
-
# Trending repositories for C#
1. [**Navi-Studio / Virtual-Human-for-Chatting**](https://github.com/Navi-Studio/Virtual-Human-for-Chatting)
__Live2D Virtual Human for Chatting bas…
-
OS: Ubuntu
CPU: 8 vCPU, 32 GiB RAM
GPU: NVIDIA V100
```
root@iZwz98etw3xqaylir1y6pjZ:~/llama# torchrun --nproc_per_node 1 example_text_completion.py --ckpt_dir llama-2-7b-chat/ --tokenizer_path tokenizer.mo…
```
-
Hi everyone,
I am fine-tuning Llama 2, but the loss is decreasing very slowly, and I am a little confused about the reason. Prior to this, I had fine-tuned Llama 1 and the loss dropped signif…
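Not part of the original post, but two common first checks when loss falls much more slowly than expected are the effective learning rate (a scheduler warming up too slowly, or an LR set orders of magnitude too low) and the fraction of parameters that are actually trainable (e.g., when adapters freeze the base model). A minimal PyTorch sketch:

```python
import torch

def training_sanity_checks(model: torch.nn.Module,
                           optimizer: torch.optim.Optimizer) -> None:
    """Print two common culprits behind a slowly decreasing loss."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable params: {trainable:,} / {total:,}")

    # An LR that is far too low (or a scheduler stuck in warmup) shows up here.
    for i, group in enumerate(optimizer.param_groups):
        print(f"param group {i}: lr={group['lr']}")
```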
-
That's very nice work! I want to know whether the cleaned images produced by LAMA processing will be open-sourced, or whether the processing script will be open-sourced. I need to train on my own dataset, so the processing script is ess…
-
In the README file, it is mentioned that to run the 13B model, the MP value should be 2. I have only 1 GPU; is there a way to run this model on a single GPU? (I am fine if efficiency is lost; what I care about as of …
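The reference llama code shards the 13B checkpoint across MP=2 processes, so it does not fit the single-GPU layout out of the box. One commonly used workaround (a different loading path, not the repo's own method) is to load the model through Hugging Face transformers with quantization so it fits on one GPU; a sketch, assuming the converted `meta-llama/Llama-2-13b-hf` weights and the `bitsandbytes` backend are available:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed HF-format weights, not the raw MP-sharded checkpoint.
model_id = "meta-llama/Llama-2-13b-hf"

# 8-bit quantization roughly halves memory vs fp16, at some speed/quality cost.
quant = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant,
    device_map="auto",  # places layers on the single visible GPU (CPU spillover if needed)
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```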
-
Hi @haesleinhuepf, I work at the Max Planck Computing and Data Facility and closely collaborate with the @nscherf group. I was wondering if you would be interested in running the benchmark against some b…
-
**Context**
I use the Tabby VSCode extension with a local Tabby server.
Currently, when I start VSCode and the Tabby server is not running, it reminds me of that through the yellow-indicated extension i…
-
**Issue: Model Error when Setting max_seq_length > 8192**
**Description:**
The `unsloth/codegemma-2b-bnb-4bit` model throws an error when attempting to set `max_seq_length` greater than 8192.
…
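A minimal repro sketch, assuming the standard unsloth loading API; the model name and the 8192 boundary come from the report, while the concrete oversized value below is illustrative:

```python
from unsloth import FastLanguageModel

# Loading works with max_seq_length <= 8192; raising it beyond that
# triggers the error described above (16384 is an illustrative value).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/codegemma-2b-bnb-4bit",
    max_seq_length=16384,  # > 8192 -> error
    load_in_4bit=True,
)
```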
-
I am so happy that the beam search PR (https://github.com/vllm-project/vllm/pull/857) has been merged into the main branch. But I tested 2 models (llama2-7b and baichuan13b-base) and found that they gener…
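A sketch of how beam search was typically exercised in vLLM releases of that era (the `use_beam_search` flag on `SamplingParams` was removed in later versions); the HF model path is an assumption standing in for "llama2-7b":

```python
from vllm import LLM, SamplingParams

# Assumed HF path for the "llama2-7b" model mentioned above.
llm = LLM(model="meta-llama/Llama-2-7b-hf")

params = SamplingParams(
    n=1,
    best_of=4,             # beam width
    use_beam_search=True,  # flag added by PR #857-era releases
    temperature=0.0,       # beam search requires greedy temperature
    max_tokens=64,
)

outputs = llm.generate(["The quick brown fox"], params)
print(outputs[0].outputs[0].text)
```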