-
I manually downloaded the model and set it up with the command `python setup_env.py -md .\models\Llama3-8B-1.58-100B-tokens -q i2_s` on Windows 11. The result shows:
"ERROR:root:Error occurred…
-
## Description
- Unsupported data format during lowering from TTForge to TTIR: Bfp2_b. The unsupported data format assertion is raised from lower_to_mlir.cpp.
`RuntimeError: TT_ASSERT @ /proj_sw/user_dev/mramanath…
-
LongViLa-LLama3-1024Frames output is often repetitive. Why does this happen, and are there any suggestions to reduce the repetition?
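A common mitigation is to tighten the decoding settings. Below is a minimal sketch, assuming the underlying language model can be driven through Hugging Face `transformers`' `generate()`; the model path and the specific parameter values are illustrative assumptions, not taken from the original report:

```python
# Hypothetical sketch: generation settings commonly used to curb repetition.
# The model path and parameter values are placeholders, not from the issue.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/LongViLa-LLama3-1024Frames-checkpoint"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Describe the events in this video.", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,            # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.2,    # penalize tokens that were already generated
    no_repeat_ngram_size=3,    # block exact repeats of any 3-gram
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```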
-
**Is your feature request related to a problem? Please describe.**
I was taking a look at Karpathy's single-file llama3.c and found something similar in Java:
https://github.com/mukel/llama3.j…
-
To get this to work, first you have to get an external AMD GPU working on Pi OS. The most up-to-date instructions are currently on my website: [Get an AMD Radeon 6000/7000-series GPU running on Pi 5](…
-
I encountered an issue while running the vqav2_test task on the liuhaotian/llava-v1.5-7b model. The command was executed on a setup with 32 CPUs and 7 RTX A6000 GPUs, but it failed with a subprocess.C…
-
Fix the following runtime error:
```
---------------------------------------------------------------------------
[2024-11-20T15:20:17.295073], Orchestrator (error):
Failed to parse ledger inf…
-
# Motivation
I wanted to participate more in solving the listed issues, but I already spent more than $30 on debugging with the ChatGPT API, lol.
Recently, Mistral announced that they have reduced…
-
The Llama3 shared codebase demo currently handles prefill input prep, looped prefill, decode input prep, decode trace capture, and decode trace execution.
The Llama3 demo should be refactored to use …
-
We load InternVL2-Llama3-76B in 8-bit on 4 RTX 3090s, and it takes a minute or longer to process a 500-token prompt and generate 100 output tokens. Is this speed normal?
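
For reference, a minimal sketch of the kind of 8-bit, multi-GPU load being described, assuming bitsandbytes quantization through Hugging Face `transformers` (the original report does not show its loading code, so the repo id and options below are assumptions):

```python
# Hypothetical sketch of loading InternVL2-Llama3-76B in 8-bit across several GPUs.
# The original issue does not include its loading code; details here are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

model_id = "OpenGVLab/InternVL2-Llama3-76B"          # assumed Hugging Face repo id
quant_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    quantization_config=quant_config,   # 8-bit weights via bitsandbytes
    device_map="auto",                  # shard layers across the 4 RTX 3090s
    torch_dtype=torch.bfloat16,         # dtype for the non-quantized modules
    trust_remote_code=True,
)
```

With `device_map="auto"`, the sharded layers run sequentially across the GPUs, and 8-bit matmul kernels add their own overhead, so throughput well below full-precision single-GPU speeds is commonly reported for this kind of setup.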