-
There are 4 samples in the reference HF output that have no output other than the EOS token.
```
>>> df = pd.read_pickle("06062024_mixtral_15k_v4.pkl")
>>> df[df['tok_ref_output_len'] == 1]
datase…
```
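The filter above can be sketched on a small synthetic frame (the real pickle `06062024_mixtral_15k_v4.pkl` and its full schema are not reproduced here; the column name `tok_ref_output_len` is taken from the snippet, the rest is made up):

```python
import pandas as pd

# Synthetic stand-in for the reference HF output frame; a value of 1 in
# tok_ref_output_len means the reference output is the EOS token only.
df = pd.DataFrame({
    "dataset": ["openorca"] * 5,
    "tok_ref_output_len": [1, 12, 1, 30, 7],
})

# Boolean indexing, as in the original snippet, selects the EOS-only rows.
eos_only = df[df["tok_ref_output_len"] == 1]
print(len(eos_only))  # number of EOS-only samples in this toy frame
```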
-
### Describe the bug
This notebook requires the use of a filter dictionary to obtain information associated with building the custom model. Despite the efforts discussed below, I am not able to successfull…
-
Hello mlcommons team,
I want to run the "Automated command to run the benchmark via MLCommons CM" (from the example: https://github.com/mlcommons/inference/tree/master/language/llama2-70b), but I a…
-
When running models via the CLI, the whole name needs to be used, e.g. `ollama run deepseek-coder-v2`. Some of these names are hard to remember. I often copy them from `ollama list`. What if we could…
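Until such an alias feature exists, one workaround is a small shell helper that maps short nicknames to full model names. This is a hypothetical sketch; the nicknames and the `resolve_model` helper are made up, only `deepseek-coder-v2` comes from the report:

```shell
# Hypothetical helper: resolve a short nickname to a full ollama model name,
# falling back to the literal argument for anything unrecognized.
resolve_model() {
  case "$1" in
    dsc) echo "deepseek-coder-v2" ;;
    oo)  echo "mistral-openorca" ;;
    *)   echo "$1" ;;
  esac
}

# Usage: ollama run "$(resolve_model dsc)"
resolve_model dsc
```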
-
On my old 4-core computer, I got roughly 2.1 t/s with openorca-chat model. When I upgraded to a 6-core PC, the speed doubled. Other models got the same speed increase. Then two things happened at the …
-
Something interesting occurred while upgrading to version 1.8.0. Previously, it had been throwing an "Out of Memory" error, but that issue has now been resolved. However, a new problem has surfaced, w…
-
How are you downloading the Mistral-7B-OpenOrca model? I keep getting this error:
OSError: Incorrect path_or_model_id: '/media/2nvme/llm/Mistral-7B-OpenOrca'. Please provide either the path to a local f…
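Transformers typically raises this `OSError` when the directory does not exist or does not contain the expected model files (a `config.json` at minimum). A quick sanity check before calling `from_pretrained`, using the path from the report (the `looks_like_local_model` helper is made up for illustration):

```python
import os

model_dir = "/media/2nvme/llm/Mistral-7B-OpenOrca"  # path from the error message

def looks_like_local_model(path):
    """Heuristic check: a local HF model directory should at least
    exist and contain a config.json."""
    return os.path.isdir(path) and os.path.isfile(os.path.join(path, "config.json"))

print(looks_like_local_model(model_dir))
```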
-
As far as I know, Megatron-LM requires the input sequence length to be fixed and padded to the `--seq-length`. However, for some SFT datasets like [tatsu-lab/alpaca](https://huggingface.co/datasets/ta…
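The fixed-length requirement can be sketched as a simple right-padding step: every variable-length SFT sample is padded (or truncated) to `--seq-length` before batching. This is a minimal illustration, not Megatron-LM's actual preprocessing code; the pad id and token ids are made up:

```python
SEQ_LENGTH = 8  # stand-in for the --seq-length argument
PAD_ID = 0      # assumed pad token id

def pad_to_seq_length(tokens, seq_length=SEQ_LENGTH, pad_id=PAD_ID):
    """Right-pad a variable-length token list to a fixed length,
    truncating samples that are too long."""
    tokens = tokens[:seq_length]
    return tokens + [pad_id] * (seq_length - len(tokens))

print(pad_to_seq_length([5, 17, 3]))  # [5, 17, 3, 0, 0, 0, 0, 0]
```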
-
Some datasets*, for example [vivym/midjourney-messages](https://huggingface.co/datasets/vivym/midjourney-messages) and [Open-Orca/OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca) are not …
-
I attempted to download mistral-7b-openorca.Q4_0.gguf multiple times. The download completed, but then one of several error states occurred at random:
* the download button returned to its original configu…