kuvaus / LlamaGPTJ-chat

Simple chat program for LLaMa, GPT-J, and MPT models.
MIT License

Can't compile, llamamodel errors. #9

Closed mkultra333 closed 1 year ago

mkultra333 commented 1 year ago

Hi, I really like your project because I need to run mpt-7b on Windows. I might be doing this wrong, but I can't get it to compile.

The gpt4all-backend\llama.cpp folder is empty, so I assume I'm supposed to put a copy in there. I've put the latest llama.cpp in there. cmake runs fine and sets up the build.
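For reference, roughly what I'm doing is a standard out-of-source CMake build (the build folder name is just my choice):

cmake -B build

cmake --build build --config Release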

I know llama.cpp keeps changing, and when I try to compile I get the following errors (I've skipped most of the output and warnings):

4>C:\Projects\LlamaGPTJ_chat_00\gpt4all-backend\gptj.cpp(223,25): error C2065: 'GGML_TYPE_Q4_2': undeclared identifier

4>C:\Projects\LlamaGPTJ_chat_00\gpt4all-backend\llamamodel.cpp(52,19): error C2039: 'n_parts': is not a member of 'llama_context_params'

4>C:\Projects\LlamaGPTJ_chat_00\gpt4all-backend\llamamodel.cpp(52,39): error C2039: 'n_parts': is not a member of 'gpt_params'

4>C:\Projects\LlamaGPTJ_chat_00\gpt4all-backend\llamamodel.cpp(100,12): error C2664: 'size_t llama_set_state_data(llama_context *,uint8_t *)': cannot convert argument 2 from 'const uint8_t *' to 'uint8_t *'

4>C:\Projects\LlamaGPTJ_chat_00\gpt4all-backend\llamamodel.cpp(184,26): error C3861: 'llama_sample_top_p_top_k': identifier not found

4>C:\Projects\LlamaGPTJ_chat_00\gpt4all-backend\mpt.cpp(230,25): error C2065: 'GGML_TYPE_Q4_2': undeclared identifier

4>C:\Projects\LlamaGPTJ_chat_00\gpt4all-backend\mpt.cpp(553,53): error C2660: 'ggml_alibi': function does not take 4 arguments

4>Done building project "llmodel.vcxproj" -- FAILED.

5>LINK : warning LNK4044: unrecognized option '/static-libgcc'; ignored
5>LINK : warning LNK4044: unrecognized option '/static-libstdc++'; ignored
5>LINK : warning LNK4044: unrecognized option '/static'; ignored
5>LINK : fatal error LNK1181: cannot open input file '..\gpt4all-backend\Release\llmodel.lib'
5>Done building project "chat.vcxproj" -- FAILED.
6>------ Skipped Rebuild All: Project: ALL_BUILD, Configuration: Release x64 ------
6>Project not selected to build for this solution configuration
========== Rebuild All: 3 succeeded, 2 failed, 1 skipped ==========


Do you have any plans to update the project, and if not, could you give me any tips for fixing the code? I'm really not familiar with how any of this stuff works, so I'm fumbling around in the dark at the moment.

kuvaus commented 1 year ago

Thanks, it's really good to have Windows testers. :)

I have a guess on what's wrong:

First, did you add --recurse-submodules when cloning the repo? Without that it won't pull llama.cpp into the backend folder. So like this:

git clone --recurse-submodules https://github.com/kuvaus/LlamaGPTJ-chat

If you had already cloned the repo, this might also work (it should download llama.cpp):

git submodule update --remote
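If the submodule was never initialized in the first place, the more usual command is this one (it checks out the exact llama.cpp commit the repo pins, rather than the latest remote):

git submodule update --init --recursive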

The real issue is that you're compiling everything correctly, but the backend currently uses a slightly older version of llama.cpp. This was done so that MPT models would load.

Now, I think the latest mainline llama.cpp did just get MPT support too, but the backend is not yet 100% compatible with it. There is progress toward making it work, and I'll try to update the backend here too once the gpt4all people get it working.

So for now, try updating llama.cpp with the above commands, or copy the older version of llama.cpp into the folder instead of the newest one, and it should (hopefully :) work.
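If you'd rather pin the submodule by hand, something like this should work (here <pinned-commit> is just a placeholder for whatever hash git submodule status reports from the repo root):

git submodule status

cd gpt4all-backend/llama.cpp

git checkout <pinned-commit>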

mkultra333 commented 1 year ago

Thanks, I'll give that a shot.

mkultra333 commented 1 year ago

Also, I forgot to ask: what models am I supposed to use for mpt-7b-instruct? I've been trying to get a lot of different projects to work, and half the time the models don't load. I know the models keep changing, especially the GGML ones.

mkultra333 commented 1 year ago

I followed your instructions and it compiled. I gave it a model (ggml-mpt-7b-instruct.bin from Nomic) and it loaded, but then it crashed when I typed in a prompt. I tried with the original llama.cpp it downloaded, and with the other version you linked to, but got the same result either way. Here's the console output:

C:\Projects\llamaGPTJ\LlamaGPTJ-chat\build\bin\Debug>chat -m "C:/Projects/models/ggml-mpt-7b-instruct.bin" -t 4
LlamaGPTJ-chat (v. 0.2.1)
LlamaGPTJ-chat: loading C:/Projects/models/ggml-mpt-7b-instruct.bin
mpt_model_load: loading model from 'C:/Projects/models/ggml-mpt-7b-instruct.bin' - please wait ...
mpt_model_load: n_vocab = 50432
mpt_model_load: n_ctx = 2048
mpt_model_load: n_embd = 4096
mpt_model_load: n_head = 32
mpt_model_load: n_layer = 32
mpt_model_load: alibi_bias_max = 8.000000
mpt_model_load: clip_qkv = 0.000000
mpt_model_load: ftype = 2
mpt_model_load: ggml ctx size = 5653.09 MB
mpt_model_load: kv self size = 1024.00 MB
mpt_model_load: model size = 4629.02 MB / num tensors = 194
LlamaGPTJ-chat: done loading!

test
Assertion failed: ggml_nelements(src1) == 2, file C:\Projects\llamaGPTJ\LlamaGPTJ-chat\gpt4all-backend\llama.cpp\ggml.c, line 9342

C:\Projects\llamaGPTJ\LlamaGPTJ-chat\build\bin\Debug>

Edit: I tried ggml-gpt4all-j.bin and that ran okay, so maybe it just doesn't like mpt-7b.

Edit: Release mode seems to run ggml-mpt-7b-instruct.bin fine, no crashes so far.

Thanks a lot for your help! I've been trying to compile at least one of these kinds of projects for days without any success. This one seems to run very fast compared to GPT4ALL.

kuvaus commented 1 year ago

Also, I forgot to ask: what models am I supposed to use for mpt-7b-instruct?

@MBCX kindly posted a link to all the supported models here: https://github.com/kuvaus/LlamaGPTJ-chat/issues/8#issuecomment-1567640695 I haven't personally tested them all, but they should all work. The couple of links in the README.md point to those same models, but that link has the more complete list.

I followed your instructions and it compiled. I gave it a model (ggml-mpt-7b-instruct.bin from Nomic) and it loaded, but then it crashed when I typed in a prompt.
Edit: I tried ggml-gpt4all-j.bin and that ran okay, so maybe it just doesn't like mpt-7b.

Nice find on the Debug mode error! That is a bug in the program, so I'll look into it. Thanks!

Edit: Release mode seems to run ggml-mpt-7b-instruct.bin fine, no crashes so far.

Glad that Release mode now works without errors.
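For anyone else who hits the assert: it seems to trigger only in Debug builds. A quick way to check both configurations, assuming the default multi-config CMake layout from your console output:

cmake --build build --config Debug

build\bin\Debug\chat -m "C:/Projects/models/ggml-mpt-7b-instruct.bin" -t 4

cmake --build build --config Release

build\bin\Release\chat -m "C:/Projects/models/ggml-mpt-7b-instruct.bin" -t 4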