Description
Error when running the llama-bench tool on a dummy model, as specified in readme. Failed assertion in ggml.c relating to tile number for parallel processing.
Steps to Reproduce
Erroneous output
OS: macOS (M2 Pro MacBook Pro)
Python Version: 3.9.20
Repository Commit: bf11a49f11b9d0535285cc4cdec834a28762ed87
Further Exploration
To diagnose, I added print statements after line 12695 of ggml.c. These showed that the `ne0` value is as expected, but the `n_tile_num` value is 0, leading to the assertion error.

I tested with model sizes 125M and 350M. The only instance when the program doesn't crash is when creating a new model type '700M' with identical parameters to the original `bitnet_b1_58_large` model.