-
When will Megatron-DeepSpeed support Llama 3 / Llama 3.1 pretraining?
-
| Dataset | Original Model | INT4 KV Cache | INT4 KV Cache & RA |
|--------------------|----------------|---------------|---------------------|
| hotpotqa | 45.09 | 41.…
-
### Check for existing issues
- [X] Completed
### Describe the bug / provide steps to reproduce it
When using the cerebras.ai API, LLM output can crash Zed. I configured it in settings.jso…
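For context, a minimal sketch of the kind of settings.json entry involved, assuming Zed's OpenAI-compatible provider block (`language_models.openai`) with an overridden `api_url`; the exact key names and model identifiers are assumptions and may differ by Zed version:

```json
{
  "language_models": {
    "openai": {
      "api_url": "https://api.cerebras.ai/v1",
      "available_models": [
        { "name": "llama3.1-8b", "max_tokens": 8192 }
      ]
    }
  }
}
```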
-
Does anyone know how to use Llama 3.1 or Llama 3 in this addon?
I've tried downloading "Meta-Llama-3.1-8B-Instruct-Q2_K.gguf" from https://huggingface.co/bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main…
-
Hello,
I currently use paperQA with Llama3.1:70b served by Ollama.
With the default LLM parameters, answer quality is often poor, especially when I increase `answer.evidence_k` and `answer.answer_m…
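A hedged sketch of the kind of parameter overrides in question, expressed as a plain dict. The field names mirror paperQA's `Settings` (`llm`, `answer.evidence_k`); the `ollama/<model>` string follows the LiteLLM naming convention, and the specific values here are illustrative assumptions, not recommendations:

```python
# Sketch only: overrides one might pass into paperQA's Settings(**overrides).
# The "ollama/..." model strings and the numbers below are assumptions.
overrides = {
    "llm": "ollama/llama3.1:70b",          # main answering model, served by Ollama
    "summary_llm": "ollama/llama3.1:70b",  # model used to summarize evidence
    "answer": {
        "evidence_k": 15,  # number of evidence chunks retrieved per question
    },
}
print(overrides["answer"]["evidence_k"])
```

Raising `evidence_k` retrieves more chunks, which can dilute answer quality if the summarizing model is weak, so the two settings interact.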
-
### Bug Description
Hi Langflow team,
I am getting the error below when I use the Ollama component, and it's pretty consistent.
![image](https://github.com/user-attachments/assets/6452a293-…
-
This occurs when using two GPUs, but it does not occur when I use just one.
I made sure to update to the Docker image used in the Dockerfile.
commit: a702c6dd2944aaf75800b11f4dfeec6fe5a9b068…
-
I manually downloaded the model and set it up with the command `python setup_env.py -md .\models\Llama3-8B-1.58-100B-tokens -q i2_s` on Windows 11. The result shows:
"ERROR:root:Error occurred…
-
Hi, I have noticed a huge difference in memory usage for runtime buffers and the decoder between Llama 3 and Llama 3.1. Is it possible to know why?
I have built an 8-bit quantised Llama 3 engine a…
-
Running the following command with Llama-3.1-8B-Instruct fails with an `AttributeError: 'function' object has no attribute 'pad_token'`. I am using the adding_all_changess branch to replicate the…