I've run the quantization as described in the README on ggml_weights.bin for q4_0 and q8_0. The original ggml_weights.bin works in Xcode and when I run it on my device; however, if I switch to either of the quantized versions, the app crashes on launch. I've added print statements to the bark_model_load function in the package, and this is the output:
Entering bark_model_load function
Model hyperparameters:
n_layer: 12
n_head: 12
n_embd: 768
block_size: 1024
bias: 0
n_in_vocab: 129600
n_out_vocab: 10048
n_lm_heads: 1
n_wtes: 1
ftype: 2007
qntvr: 2
Weight type: 8
Estimated buffer size: 6547975168 bytes
Estimated number of tensors: 78
Creating ggml context with size: 27456
ggml context created successfully
Initializing backend
Using CPU backend
CPU backend initialized successfully
Allocating weights buffer of size 6547975168
Weights buffer allocated successfully
Memory prepared for weights
Preparing key + value memory
n_mem: 12288, n_elements: 9437184
Allocating KV cache for text and coarse encoder
Memory size for KV cache: 72.00 MB
KV cache allocated successfully
Key + value memory prepared
Loading 76 tensors
Loading tensor 'model/wte/0'
Dimensions: 2, Elements: 99532800, Type: 8
Tensor shape: [768, 129600]
Tensor elements: 99532800
Tensor size: 105753600 bytes
Allocator address: 0x0
Reading 105753600 bytes from file
Tensor 'model/wte/0' has null data pointer
bark_load_model_from_file: invalid model file '/private/var/containers/Bundle/Application/6A2F4C4C-D922-4F7B-8B50-5EFBB3017A61/Throwaway.app/ggml_weights_q4_0.bin' (bad text)
bark_load_model: failed to load model weights from '/private/var/containers/Bundle/Application/6A2F4C4C-D922-4F7B-8B50-5EFBB3017A61/Throwaway.app/ggml_weights_q4_0.bin'
Couldn't load model at /private/var/containers/Bundle/Application/6A2F4C4C-D922-4F7B-8B50-5EFBB3017A61/Throwaway.app/ggml_weights_q4_0.bin
The operation couldn’t be completed. (Throwaway.BarkError error 0.)
Can't find or decode reasons
Failed to get or decode unavailable reasons
Can't find or decode disallowed use cases
Additional notes:
The quantized models generate .wav files without issue when run from the terminal, so the files themselves are not corrupted.