-
In the GUI of the desktop app, the model does not see the system prompt at all.
Tested with many models.
The attached image shows the system prompt used and the model's answer.
While when …
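One way to check whether the system prompt reaches the model at all is to bypass the GUI and query the app's local OpenAI-compatible endpoint directly. A minimal sketch, assuming the app exposes such a server; the base URL, port, and model name are placeholders:
```python
# Minimal sketch: ask the model to echo a marker that exists only in the
# system prompt. If the reply never mentions the marker, the system prompt
# is not reaching the model. Base URL, port, and model name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="local-model",
    messages=[
        {"role": "system", "content": "Your secret code word is AUBERGINE."},
        {"role": "user", "content": "What is your secret code word?"},
    ],
)
print(resp.choices[0].message.content)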
-
I'm running h2ogpt on an Ubuntu server; you'll find the server specifications attached. However, model execution is far too slow (TheBloke/Mistral-7B-Instruct-v0.2-GGUF), and sometimes it doesn't even…
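Not an h2ogpt-specific fix, but a common cause of slow GGUF inference is that no layers are offloaded to the GPU. A minimal sketch with llama-cpp-python (the model filename and parameter values are illustrative) to verify the model runs fast outside h2ogpt:
```python
# Minimal sketch, assuming llama-cpp-python built with GPU support.
# n_gpu_layers=-1 offloads all layers; on a CPU-only build the setting is
# silently ignored, which is one common reason inference stays slow.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct-v0.2.Q4_K_M.gguf",  # illustrative filename
    n_gpu_layers=-1,   # offload every layer; set 0 to force CPU for comparison
    n_ctx=4096,        # context window
    n_threads=8,       # CPU threads for any non-offloaded work
    verbose=True,      # prints timing info, useful for diagnosing slowness
)

out = llm("[INST] Say hello in one sentence. [/INST]", max_tokens=64)
print(out["choices"][0]["text"])
```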
-
Looking at both foundation and instruction-tuned models. For this project, the latter will probably be the only target, as it is likely to work better (a quick loading sketch follows the list).
**Available**
- Flan-T5: https://…
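A minimal smoke test for comparing candidates, using transformers; the checkpoint name and prompt are illustrative, not a recommendation:
```python
# Minimal sketch: load one instruction-tuned candidate and run a single prompt.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

name = "google/flan-t5-base"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

inputs = tokenizer("Summarize: the quick brown fox jumps over the lazy dog.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```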
-
### Feature request
Now that the context window is variable (per #1668), it would be helpful to have models2.json updated to populate the n_ctx field along with the correct system and user prompts.
…
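For illustration only, one entry might then carry something like the following; every field name other than n_ctx is a guess at the schema, not the project's actual one:
```python
# Hypothetical shape of one models2.json entry after the change.
entry = {
    "name": "mistral-7b-instruct-v0.2",
    "n_ctx": 32768,                          # per-model context window (per #1668)
    "system_prompt": "<s>[INST] {system} ",  # correct system prompt template
    "user_prompt": "{user} [/INST]",         # correct user prompt template
}
```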
-
1. https://github.com/ggerganov/llama.cpp/blob/master/convert-lora-to-ggml.py - this should be run against a directory containing adapter_config.json, adapter_model.bin and config.json (a hypothetical invocation sketch follows). It will spit out `ggml-adapter-mod…
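A sketch of the invocation, assuming (as in older llama.cpp revisions) that the script takes the directory holding those three files as its single argument; the directory name is hypothetical:
```python
# Minimal sketch: check the three required files are present, then run the
# converter. The single-directory-argument CLI is an assumption based on
# older llama.cpp revisions; check the script's --help for your checkout.
import subprocess
from pathlib import Path

lora_dir = Path("my-lora-adapter")  # hypothetical directory
for required in ("adapter_config.json", "adapter_model.bin", "config.json"):
    if not (lora_dir / required).exists():
        raise FileNotFoundError(f"{required} missing from {lora_dir}")

subprocess.run(["python", "convert-lora-to-ggml.py", str(lora_dir)], check=True)
```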
-
### 🐛 Describe the bug
```
torchrun --nproc-per-node 8 dist_run.py
```
```
known configs: ['13B', '30B', '34B', '70B', '7B', 'CodeLlama-7b-Python-hf', 'Mistral-7B', 'stories110M', 'stories15M',…
```
-
Hi again. I'm wondering, does funcchain handle different chat templates internally?
e.g. for Llama
```
[INST]
You are a helpful, respectful and honest assistant. Always answer as helpfully as p…
```
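Whatever funcchain does internally, the per-model templates themselves can be produced from the tokenizer's own chat template in transformers. A sketch; the checkpoint name is illustrative, and gated checkpoints need authentication:
```python
# Minimal sketch: let the tokenizer apply the model's own chat template
# instead of hard-coding [INST] markers. Checkpoint name is illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

messages = [
    {"role": "system", "content": "You are a helpful, respectful and honest assistant."},
    {"role": "user", "content": "Hello!"},
]

# tokenize=False returns the formatted prompt string with the [INST] markers
# (or whatever markers this particular model's template uses).
prompt = tokenizer.apply_chat_template(messages, tokenize=False,
                                       add_generation_prompt=True)
print(prompt)
```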
-
### Your current environment
```text
The output of `python collect_env.py`
```
```
Collecting environment information...
/scratch/gpfs/hyen/p-long-instruct/vllm_env/lib/python3.11/site-package…
```
-
### Description
As mentioned by @IPostYellow in #121, the KV blending module cannot correctly retrieve the chunks when running the example. See the details below:
------------- Below is how to …
-
### Is your feature request related to a problem?
https://replicate.com/nateraw/mixtral-8x7b-32kseqlen
### Describe the solution you'd like.
https://replicate.com/na…