-
### Describe the bug
Model URL:
https://huggingface.co/bartowski/Hubble-4B-v1-GGUF/discussions/1
llama_model_loader: - kv 26: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ"…
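As a minimal sketch (assuming a locally downloaded copy of the model; the file name below is hypothetical), the offending metadata entry can be listed with the `gguf` Python package:
```
# Sketch: list GGUF metadata keys and value types with the `gguf` package
# (pip install gguf). The file name is hypothetical.
from gguf import GGUFReader

reader = GGUFReader("Hubble-4B-v1-Q4_K_M.gguf")
for name, field in reader.fields.items():
    # tokenizer.ggml.merges should appear here as an array of strings.
    print(name, [t.name for t in field.types])
```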
-
### What happened?
I am trying to run Qwen2-57B-A14B-Instruct, and I used `llama-gguf-split` to merge the GGUF files from [Qwen/Qwen2-57B-A14B-Instruct-GGUF](https://huggingface.co/Qwen/Qwen2-57B-A14B-…
-
## Problem
- The user downloads a Hugging Face model using `author/model_id`, e.g.
```
cortex pull bartowski/Meta-Llama-3.1-8B-Instruct-GGUF
```
- In `cortex models list`, the alias is unexpectedly differ…
-
### What happened?
Chat template formatting seems to be swapped for Mistral and Llama 2.
Llama 2 supports the `<<SYS>>` token for system messages, while Mistral simply uses newlines.
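For illustration, a minimal sketch of the two prompt shapes as described above (hand-rolled strings, not the templates actually shipped with either model):
```
def llama2_prompt(system: str, user: str) -> str:
    # Llama 2 chat wraps the system message in <<SYS>> ... <</SYS>> tags
    # inside the first [INST] block.
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

def mistral_prompt(system: str, user: str) -> str:
    # Mistral Instruct has no dedicated system tokens; the system message is
    # simply prepended to the first user turn, separated by newlines.
    return f"<s>[INST] {system}\n\n{user} [/INST]"

print(llama2_prompt("You are a helpful assistant.", "Hello!"))
print(mistral_prompt("You are a helpful assistant.", "Hello!"))
```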
Starting llama ser…
-
So when I fine-tuned a Llama 3 model, my configuration file looked like:
```
# Tokenizer
tokenizer:
  _component_: torchtune.models.llama3.llama3_tokenizer
  path: ~/meta-llama/Meta-Llama-3-8B-In…
-
The current model implementation uses a SAX parser and a ModelHandler. This implementation is not stable: in complex model situations, duplicate event ids can occur. This is not detected by the `o…
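The excerpt does not show the handler itself; as a rough illustration only (a Python `xml.sax` sketch with hypothetical element and attribute names, not the project's actual ModelHandler), the missing check amounts to tracking ids while parsing:
```
# Rough illustration: flag duplicate event ids while SAX-parsing a model file.
# Not the project's ModelHandler; names and the input file are hypothetical.
import xml.sax

class DuplicateIdHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.seen_ids = set()
        self.duplicates = []

    def startElement(self, name, attrs):
        event_id = attrs.get("id")
        if event_id is None:
            return
        if event_id in self.seen_ids:
            self.duplicates.append(event_id)
        else:
            self.seen_ids.add(event_id)

handler = DuplicateIdHandler()
xml.sax.parse("model.xml", handler)
print("duplicate ids:", handler.duplicates)
```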
-
Hi,
I am following the article at https://learn.arm.com/learning-paths/servers-and-cloud-computing/pytorch-llama/pytorch-llama/
but at step
```
python torchchat.py export llama3.1 --output-dso-p…
-
| --- | --- |
| Bugzilla Link | [510503](https://bugs.eclipse.org/bugs/show_bug.cgi?id=510503) |
| Status | NEW |
| Importance | P3 major |
| Reported | Jan 16, 2017 07:55 EDT |
| Modified | Sep …
-
### What happened?
If you pass the `tfs_z` param to the server, it sometimes crashes.
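As a rough sketch of how the parameter gets sent (the prompt, port, and values below are examples, not taken verbatim from this report), a request with `tfs_z` to the server's `/completion` endpoint looks like:
```
# Sketch: send a completion request with tfs_z to a local llama-server.
# Prompt, port and sampler values are examples only.
import json
import urllib.request

payload = {
    "prompt": "Hello",
    "n_predict": 16,
    "tfs_z": 0.95,  # tail-free sampling parameter mentioned in the report
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))
```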
Starting the server:
```
~/test/llama.cpp/llama-server -m /opt/models/text/gemma-2-27b-it-Q8_0.gguf --verbose
`…
-
Hi, I am having an issue with running the sample example in the [quickstart guide](https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/llama_cpp_quickstart.md#3-example-runnin…