alexrozanski / LlamaChat

Chat with your favourite LLaMA models in a native macOS app
https://llamachat.app
MIT License
1.45k stars 56 forks

Llamachat is spouting gibberish #45

Open jdblack opened 10 months ago

jdblack commented 10 months ago

System: MacBook Pro 2019
Installation method: Homebrew
Version: llamachat 1.2.0 (auto_updates)

I downloaded the 7B weights via BitTorrent and imported them into LlamaChat. I tried saying hello to my new friend, but all I get in return is gibberish. How do I debug this issue?

Screenshot 2023-11-16 at 12 30 49 PM
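
Since the weights came from a BitTorrent download, one place to start is ruling out corrupted files before blaming the app. The snippet below is a minimal sketch, assuming the download includes the checklist.chk file with md5sum-style lines that shipped with the original LLaMA release; the model directory path is taken from the conversion log further down.

import hashlib
from pathlib import Path

# Path taken from the conversion log below; checklist.chk is assumed to sit
# alongside consolidated.00.pth, as in the original LLaMA release layout.
MODEL_DIR = Path.home() / "Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B"

def md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so a ~13 GB checkpoint doesn't need to fit in RAM."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# checklist.chk lines look like "<md5>  <filename>" (md5sum output format).
for line in (MODEL_DIR / "checklist.chk").read_text().splitlines():
    if not line.strip():
        continue
    expected, name = line.split()
    actual = md5(MODEL_DIR / name)
    print(f"{name}: {'OK' if actual == expected else 'MISMATCH'}")

A mismatch on consolidated.00.pth or tokenizer.model would explain gibberish on its own, no matter what the converter reports.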

It appears to convert the model without any complaints:

python3 -u /var/folders/tj/hdvn6t_x1lb_27qt5h9xx5300000gn/T/6A1F66D4-970D-4452-9D5A-F8D5231D098F/convert-pth-to-ggml.py /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B 1
Loading model file /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B/consolidated.00.pth
Loading vocab file /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/tokenizer.model
Writing vocab...
[ 1/291] Writing tensor tok_embeddings.weight | size 32000 x 4096 | type UnquantizedDataType(name='F16')
[ 2/291] Writing tensor norm.weight | size 4096 | type UnquantizedDataType(name='F32')
[ 3/291] Writing tensor output.weight | size 32000 x 4096 | type UnquantizedDataType(name='F16')
[ 4/291] Writing tensor layers.0.attention.wq.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[ 5/291] Writing tensor layers.0.attention.wk.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[ 6/291] Writing tensor layers.0.attention.wv.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[ 7/291] Writing tensor layers.0.attention.wo.weight | size 4096 x 4096 | type UnquantizedDataType(name='F16')
[ 8/291] Writing tensor layers.0.attention_norm.weight | size 4096 | type UnquantizedDataType(name='F32')
[ 9/291] Writing tensor layers.0.feed_forward.w1.weight | size 11008 x 4096 | type UnquantizedDataType(name='F16')
[ 10/291] Writing tensor layers.0.feed_forward.w2.weight | size 4096 x 11008 | type UnquantizedDataType(name='F16')
[ 11/291] Writing tensor layers.0.feed_forward.w3.weight | size 11008 x 4096 | type UnquantizedDataType(name='F16')
[ 12/291] Writing tensor layers.0.ffn_norm.weight | size 4096 | type UnquantizedDataType(name='F32')
[... tensors 13/291 through 290/291 repeat the same nine-tensor pattern for layers.1 through layers.31 ...]
[291/291] Writing tensor layers.31.ffn_norm.weight | size 4096 | type UnquantizedDataType(name='F32')
Wrote /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B/ggml-model-f16.bin

test -f /Users/jblack/Library/Application Support/com.alexrozanski.LlamaChat/models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B/ggml-model-f16.bin
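
A further sanity check, independent of LlamaChat, is to inspect the converted ggml-model-f16.bin itself. The sketch below only looks at the file size (roughly 13 GB is expected for a 7B model in F16) and the leading magic bytes; the magic constants are an assumption based on the legacy GGML container formats llama.cpp used at the time.

import struct
from pathlib import Path

# Assumed magic values for the legacy llama.cpp formats ('ggml', 'ggmf', 'ggjt');
# the exact set depends on which llama.cpp vintage LlamaChat bundles.
GGML_MAGICS = {0x67676D6C, 0x67676D66, 0x67676A74}

MODEL = Path.home() / ("Library/Application Support/com.alexrozanski.LlamaChat/"
                       "models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B/ggml-model-f16.bin")

size_gb = MODEL.stat().st_size / 1e9
with MODEL.open("rb") as f:
    magic = struct.unpack("<I", f.read(4))[0]  # first four bytes, little-endian

print(f"size : {size_gb:.1f} GB (a 7B F16 model should be roughly 13 GB)")
print(f"magic: {magic:#010x} ({'looks like GGML' if magic in GGML_MAGICS else 'unrecognised'})")

If the file is far smaller than expected or the header is unrecognised, the conversion step is the likely culprit rather than the chat frontend.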

jdblack commented 10 months ago

I suspect this is due to a lack of free memory.
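
A quick way to test that theory is to compare the model's size against the machine's physical RAM, as in the rough sketch below (macOS sysctl, model path from the log above). Note, though, that memory pressure would more typically show up as heavy swapping or a crash rather than gibberish.

import subprocess
from pathlib import Path

MODEL = Path.home() / ("Library/Application Support/com.alexrozanski.LlamaChat/"
                       "models/47EECB0E-23AA-4001-A0F3-9548E6C73A71/7B/ggml-model-f16.bin")

# Physical RAM as reported by macOS; an F16 model needs at least its own
# file size in memory, plus the KV cache and scratch buffers on top.
ram_bytes = int(subprocess.check_output(["sysctl", "-n", "hw.memsize"]).strip())
model_bytes = MODEL.stat().st_size

print(f"physical RAM: {ram_bytes / 2**30:.1f} GiB")
print(f"model file  : {model_bytes / 2**30:.1f} GiB")
if model_bytes > 0.8 * ram_bytes:
    print("The model barely fits; expect heavy swapping on a 16 GB machine.")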