-
I manually downloaded the model and set it up with the command `python setup_env.py -md .\models\Llama3-8B-1.58-100B-tokens -q i2_s` on Windows 11. The result shows:
"ERROR:root:Error occurred…
-
When the whisper model is loaded, it prints a lot of initialization information to the console. I'd like to be able to redirect this to a separate log file and silence the console output.
`llama-c…
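For illustration, a minimal sketch of one way to do this via the logging callback APIs (whisper.cpp exposes `whisper_log_set`, and llama.cpp has the analogous `llama_log_set`). The file name, callback, and model path below are illustrative assumptions, not the project's actual solution:

```cpp
// Sketch: route whisper.cpp initialization logging to a file instead of the console.
// Assumes whisper.h's whisper_log_set() and the ggml_log_callback signature.
#include <cstdio>
#include "whisper.h"

static FILE * g_log_file = nullptr;

static void log_to_file(enum ggml_log_level level, const char * text, void * /*user_data*/) {
    (void) level;
    if (g_log_file) {
        fputs(text, g_log_file);   // everything goes to the log file, nothing to stderr
        fflush(g_log_file);
    }
}

int main() {
    g_log_file = fopen("whisper_init.log", "w");
    whisper_log_set(log_to_file, nullptr);   // install the callback before loading the model

    struct whisper_context_params cparams = whisper_context_default_params();
    struct whisper_context * ctx = whisper_init_from_file_with_params("ggml-base.en.bin", cparams);

    // ... run inference with ctx ...

    whisper_free(ctx);
    fclose(g_log_file);
    return 0;
}
```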
-
Following the build instructions in the readme,
```
cmake .. -G "Ninja" -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DSD_HIPBLAS=ON -DCMAKE_BUILD_TYPE=Release -DAMDGPU_TARGETS=gfx1100
…
-
When running rkllm, after the user content is entered, the robot's reply reports an error:
robot: :0: GGML_ASSERT(view_src == NULL || data_size == 0 || data_size + view_offs
-
As per recent discussions (e.g. https://github.com/ggerganov/llama.cpp/pull/10144#pullrequestreview-2411814357), we should split the large `ggml-cpu.c` implementation into smaller modules - similar to…
-
I downloaded the weights from https://huggingface.co/shuttleai/shuttle-3-diffusion; the program loaded the weights and then exited with no error message.
I debugged the program, and it seems that the problem i…
-
The usual behavior for the "mean" operation in numerical frameworks is a reduction of a tensor to a single value. However, in GGML this operation instead calculates the mean *per row*. This is I think…
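For illustration, a minimal sketch of the per-row behavior described above, assuming the `ggml_mean` API and the CPU graph-compute helper; the memory size and tensor values are arbitrary:

```cpp
// Sketch: ggml_mean reduces along ne0 only, producing one mean per row.
#include <cstdio>
#include "ggml.h"
// Note: newer ggml trees may also need #include "ggml-cpu.h" for ggml_graph_compute_with_ctx.

int main() {
    struct ggml_init_params params = { /*.mem_size =*/ 16 * 1024 * 1024, /*.mem_buffer =*/ NULL, /*.no_alloc =*/ false };
    struct ggml_context * ctx = ggml_init(params);

    // 2 rows x 3 columns (ne0 = 3, ne1 = 2)
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 3, 2);
    float * d = (float *) a->data;
    for (int i = 0; i < 6; ++i) d[i] = (float) i;   // rows: {0,1,2} and {3,4,5}

    struct ggml_tensor * m = ggml_mean(ctx, a);     // result shape [1, 2]: one mean per row, not a scalar

    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, m);
    ggml_graph_compute_with_ctx(ctx, gf, 1);

    printf("row means: %f %f\n", ((float *) m->data)[0], ((float *) m->data)[1]);   // 1.0 and 4.0

    ggml_free(ctx);
    return 0;
}
```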
-
**Is your feature request related to a problem? Please describe.**
Currently, binary TTNN operators follow the NumPy broadcasting rules, i.e. only dimensions of (implied) 1 can be broadcast. E.g. the…
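For reference, a small sketch of the NumPy-style compatibility rule referred to here: shapes are aligned from the trailing dimension, and two sizes are compatible only if they are equal or one of them is 1. The helper name and example shapes are hypothetical:

```cpp
// Sketch of the NumPy broadcasting compatibility check (not TTNN code).
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

static bool numpy_broadcastable(const std::vector<int64_t> & a, const std::vector<int64_t> & b) {
    size_t na = a.size(), nb = b.size(), n = std::max(na, nb);
    for (size_t i = 0; i < n; ++i) {
        int64_t da = i < na ? a[na - 1 - i] : 1;   // missing leading dims count as 1
        int64_t db = i < nb ? b[nb - 1 - i] : 1;
        if (da != db && da != 1 && db != 1) return false;
    }
    return true;
}

int main() {
    printf("%d\n", numpy_broadcastable({32, 1, 64}, {1, 128, 64}));   // 1: each pair is equal or 1
    printf("%d\n", numpy_broadcastable({32, 2, 64}, {32, 3, 64}));    // 0: 2 vs 3, neither is 1
    return 0;
}
```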
-
~~When cross-compiling for Android using the NDK toolchain, Flash Attention fails to build in CPU-only mode but succeeds when the Vulkan backend is enabled, despite being documented as a CPU-only feature.~~
…
-
### Problem Description
Adding the new gfx model gfx1151 on Linux: it builds on Linux, and I can also build llama.cpp with the rocWMMA patch
https://github.com/ggerganov/llama.cpp/pull/7011/commits to …