-
```
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = 7
llm_load_print_meta: n_embd_k_gqa = 1024
llm_load_print_meta: …
```
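These fields are related through the head counts in the model's hyperparameters. Below is a minimal sketch of those relationships; the `n_head` and `n_head_kv` values are assumptions chosen to match the log, since they do not appear in the excerpt:

```cpp
#include <cstdint>
#include <cstdio>

// Sketch of how llama.cpp derives the printed metadata fields from the
// per-head sizes and head counts. The head counts here are hypothetical
// example values; the real ones come from the GGUF hyperparameters.
int main() {
    const uint32_t n_head        = 56;  // assumed: total attention (query) heads
    const uint32_t n_head_kv     = 8;   // assumed: key/value heads (GQA)
    const uint32_t n_embd_head_k = 128; // from the log above
    const uint32_t n_embd_head_v = 128; // from the log above

    const uint32_t n_gqa        = n_head / n_head_kv;        // query heads per KV head
    const uint32_t n_embd_k_gqa = n_embd_head_k * n_head_kv; // K width of the KV cache
    const uint32_t n_embd_v_gqa = n_embd_head_v * n_head_kv; // V width of the KV cache

    // With the assumed head counts this reproduces the log:
    // n_gqa = 7, n_embd_k_gqa = 1024
    printf("n_gqa = %u, n_embd_k_gqa = %u, n_embd_v_gqa = %u\n",
           n_gqa, n_embd_k_gqa, n_embd_v_gqa);
    return 0;
}
```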
-
### Here is a list of possible new providers:
_some may have already been implemented_
**GPT-3.5 and random**
https://chat-gpt.org/chat
https://hf4all-bingo.hf.space/
https://chatgpt.ai/
https:…
-
Originally spotted by @iamlemec in https://github.com/abetlen/llama-cpp-python/issues/1089; reproduced with llama.cpp by passing `--no-kv-offload` to `./main`. The bug causes the model to generate repeated…
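For anyone reproducing this through the C API rather than `./main`, disabling KV offload corresponds to the `offload_kqv` field of `llama_context_params`. A minimal sketch, assuming a recent llama.cpp where `llama_backend_init` takes no arguments; the model path is a placeholder:

```cpp
#include "llama.h"

// Sketch: create a context with KV-cache offload disabled, which mirrors
// passing --no-kv-offload to ./main. "model.gguf" is a placeholder path.
int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file("model.gguf", mparams);
    if (!model) return 1;

    llama_context_params cparams = llama_context_default_params();
    cparams.offload_kqv = false; // keep the KV cache (and KQV ops) on the CPU

    llama_context * ctx = llama_new_context_with_model(model, cparams);
    if (!ctx) return 1;

    // ... run generation here and check for the repetition bug ...

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```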
-
Hi, thanks for open-sourcing this work. I am having trouble reproducing the results in the paper and would appreciate your help.
I have replicated the multitask w/o pretraining experiments twice. However,…
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
# Problem
After converting the weights, the following problem occurs during evaluation:
```shell
> number of parameters on (tensor, pipeline) model parallel rank (0, 0): 630167424
loading release checkpoint from /raid/LLM_train/Pai-Megatron-Patch/checkpoint…
-
This is, by far, the best vision model I've seen. A huge step forward.
The one thing I am missing, as with many releases, is consideration of the C++ framework (ggml/llama.cpp).
There is llava…
-
There seems to be a dependency issue.
```
Collecting locallm
Using cached locallm-0.3.0-py3-none-any.whl (16 kB)
Collecting sseclient-py
Using cached sseclient_py-1.8.0-py2.py3-none-any.whl…
-
I've run the instruction-tuning bash script, but I don't see a new checkpoint. Do you just overwrite the old checkpoint?
-
### Background
1. We are trying to optimize `MUL_MAT` performance on Arm chips;
2. We build `llama.cpp` with `__ARM_FEATURE_FP16_VECTOR_ARITHMETIC=ON`, so the `GGML_F16` macro definitions would be implme…
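For readers unfamiliar with that macro layer: when `__ARM_FEATURE_FP16_VECTOR_ARITHMETIC` is available, ggml's `GGML_F16` vector macros can map to native NEON `float16x8_t` operations. Below is a sketch of an FP16 dot product in that style; it is an illustrative reimplementation, not ggml's actual macro definitions:

```cpp
#include <arm_neon.h>

// Illustrative FP16 dot product using NEON half-precision vector arithmetic,
// in the spirit of ggml's GGML_F16 vector macros (not the actual ggml code).
// Requires a compiler/target with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC.
float dot_f16(const float16_t * x, const float16_t * y, int n) {
    float16x8_t acc = vdupq_n_f16(0);
    int i = 0;
    for (; i + 8 <= n; i += 8) {
        float16x8_t vx = vld1q_f16(x + i);
        float16x8_t vy = vld1q_f16(y + i);
        acc = vfmaq_f16(acc, vx, vy); // acc += vx * vy, 8 half-precision lanes
    }
    // Horizontal reduction: widen to f32 and sum the lanes.
    float32x4_t lo = vcvt_f32_f16(vget_low_f16(acc));
    float32x4_t hi = vcvt_f32_f16(vget_high_f16(acc));
    float sum = vaddvq_f32(vaddq_f32(lo, hi));
    // Scalar tail for the remaining elements.
    for (; i < n; ++i) sum += (float) x[i] * (float) y[i];
    return sum;
}
```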