Nocturna22 opened this issue 1 year ago
Dalai currently has issues installing the llama model, because of problems with the PowerShell script. I think it is related to #241. Try downloading alpaca.7B as an alternative; it should at least work and give you some output.
Thanks for your answer, but I'm trying with Alpaca, not LLaMA :C Maybe it doesn't work with Alpaca now either?
It seems to be something different. I tried to follow these steps: https://github.com/cocktailpeanut/dalai/issues/245#issuecomment-1481806448 But main.exe already exists in the folder.
Have you tried running main.exe from the command-line with the parameters supplied?
If it returns right away without responding to your prompt, try: echo %errorlevel%
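For reference, an invocation along these lines should exercise the binary directly. The flags are the usual llama.cpp ones (-m model, -p prompt, -n tokens to generate, -t threads); the model path is just an assumption about where dalai puts things, so adjust it to your install:

```
REM adjust this path: it assumes dalai's default alpaca layout, which may differ
cd %USERPROFILE%\dalai\alpaca
main.exe -m models\7B\ggml-model-q4_0.bin -p "Hello" -n 16 -t 4
echo %errorlevel%
```

If it prints some text and %errorlevel% is 0, the binary itself is fine; a large negative errorlevel means it crashed.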
Thanks for the hint, but unfortunately I don't know which parameters I can pass, and I'm not sure what to search for either. If I start main.exe without parameters and then print the errorlevel, I get the following integer back:
-1073741795
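For what it's worth, -1073741795 read as an unsigned 32-bit code is 0xC000001D, which as an NTSTATUS value is STATUS_ILLEGAL_INSTRUCTION. That would point at the binary using a CPU instruction (e.g. AVX/AVX2) that the processor doesn't support, rather than at memory. A quick check of the conversion, assuming the usual two's-complement representation:

```c
#include <stdio.h>

int main(void) {
    /* cmd's %errorlevel% reports NTSTATUS crash codes as negative 32-bit ints */
    int errorlevel = -1073741795;
    printf("0x%08X\n", (unsigned int)errorlevel); /* prints 0xC000001D */
    return 0;
}
```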
Segmentation fault. I had the same issue. Don't have a fix.
Same, I have this kind of error with LLaMA and Alpaca 30B:
llama_model_load: ggml ctx size = 21450.50 MB Segmentation fault exit
I use Docker, and this error is related to a lack of RAM, I guess... I have about 22 GB of RAM free before running the model, sometimes less, so maybe 32 GB of RAM isn't enough to run 30B models with Docker? The vmmem process takes about 8-10 GB of RAM before running the model 🤔
Hmmm, this site lists the requirements:
2. Memory Requirements
Runs on most modern computers. Unless your computer is very very old, it should work.
According to https://github.com/ggerganov/llama.cpp/issues/13, here are the memory requirements:
7B => ~4 GB
13B => ~8 GB
30B => ~16 GB
65B => ~32 GB
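(Reading these numbers against the 30B log above: 21450.50 MB / 1024 ≈ 21 GB for the ggml context alone, which is already above the ~16 GB listed here for 30B; add the 8-10 GB that vmmem was reported to take, and a 32 GB machine has almost no headroom.)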
EDIT: Maybe this helps someone (I hope the link works for you..):
https://www.phind.com/search?cache=6a6b46d4-d929-49b7-bbdd-2d3cca237077
I would get the segfault regardless of model (tried 7B and 13B), and I have 32 GB on my machine. And it worked fine running under a Linux VM with only 16 GB. So the problem appears to be specific to the Windows version.
@pdavis68, virtualization 🤣 ☕
It's running flawlessly on my Linux environment, but I've been having trouble getting it to work on my Windows 11 PC with 6 GB of VRAM and 16 GB of RAM.
Problem description: only alpaca shows up, although I've downloaded both.
The crashes appear because of programming errors: they don't check whether they actually get the memory for the model or not ;-)
In ggml.c, line 2450, only the alignment is checked, not whether the memory was obtained at all: ggml_assert_aligned(ctx->mem_buffer);
Add in line 2450: assert(ctx->mem_buffer); or print out an error: if (NULL == ctx->mem_buffer) { printf("Not enough memory to store model\n"); exit(-1); }
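For what it's worth, here is a minimal standalone sketch of the pattern this suggests; the names and sizes are illustrative, not the real ggml internals:

```c
/* Illustration of the suggested fix: check the allocation before using it,
 * instead of only asserting its alignment. Illustrative names, not real ggml. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* roughly the 30B ggml ctx size from the log above (64-bit build assumed) */
    size_t ctx_size = (size_t)21450 * 1024 * 1024;

    void *mem_buffer = malloc(ctx_size);
    if (NULL == mem_buffer) {
        /* without this check, later writes through mem_buffer would segfault */
        fprintf(stderr, "Not enough memory to store model (%zu MB requested)\n",
                ctx_size / (1024 * 1024));
        return -1;
    }

    /* ... model loading would continue here ... */
    free(mem_buffer);
    return 0;
}
```

With a check like this in place, a machine without ~21 GB to spare prints the error and exits instead of segfaulting.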
The description on the main page on GitHub is wrong. The model does not have to reside where the .exe is, but where you start it from. So if you build with CMake within the build directory, the model has to reside in the "build" directory itself, even if the executable is in "Release/main". The .exe is NOT named "chat" but "main"... and so on.
So (on Windows, with CMake installed and, for example, Visual Studio 2019):
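A minimal sketch of such a build, assuming a stock llama.cpp checkout (exact paths and flags may differ between versions):

```
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
REM per the note above: run from the directory the model sits in,
REM e.g. copy ggml-model-q4_0.bin into build\ and start main.exe from there
cd build
Release\main.exe -m ggml-model-q4_0.bin -p "Hello" -n 16
```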
On Linux I have everything running on the same machine. I think I will stay with Linux... But if I ever switch to Windows I will try that :) Thank you.
(And yes... with 4 cores it is really pathetic :P I hope there will be a better quantized model sometime; then I will test it again on this machine. With 4 cores at this clock speed it is just too weak. (But with 24 cores it goes really fast xD))
I don't know if I can close this issue now because I can't test it. Those who have the same problem: please report.
I am running on Windows 10. When I had segmentation faults it was because I was trying to use a model that was too big (30B).
I can run the 13B one with 32 GB of RAM and 16 GB of video card memory. It never seems to use the video card memory, though. The whole system uses up to about 24 GB of RAM when it's working and about 16 GB when idling; the vmmem process alone uses about 4 GB when idling and about 13-14 GB when working. So I think a 16 GB RAM Windows PC will not work with the 13B model, but should with the 7B model.
If you get segmentation faults with the 7B model, it's likely the hardware; at least that's what different AIs told me.
I get the segmentation fault under Windows with the 7B model. On the same machine, running it in a Linux VM with only 16 GB of the 32 GB allocated to it, it runs fine. So it's not a hardware issue for me.
So you were able to get it running in Linux? I've tried Win 10 but I'm spinning up an Ubuntu 20.04 to try.
It's a long procedure, but bear with me.
OS: Windows 10
Make sure:

- You have enough storage space to download Visual Studio, the models and any other dependencies: have at least 25 GB; I recommend 50 GB.
- You have enough RAM to run the model: have at least 8 GB total; I recommend 12 GB for the 7B model.
- Your CPU has enough threads: have at least 4; I recommend 12.
- Your internet connection is good during the process. Otherwise, the system may throw a tantrum like a 6 year old.
- The computer doesn't sleep; set the sleep time to 'Never'. Otherwise, the system may throw a tantrum like a 6 year old.
- You have enough patience to not interrupt the downloads/installs. Otherwise, the system may throw a tantrum like a 6 year old.
- The system doesn't throw a tantrum like a 6 year old.

Anyway, that being said, below are the steps.
Remove all files, folders and any other references to 'Dalai', 'Llama', 'Alpaca', etc. Delete all of it. Make sure that when you search any of those terms, nothing shows up. Also type 'dalai' and 'npx dalai' in the command prompt and make sure these commands are NOT recognized. Also remove/uninstall Visual Studio if you have it installed.
Check your Python and Node versions.
Type in command prompt:
python -V
node --version
Make sure Python is below 3.10 (<3.10) and Node is above 18 (>18)
If that's not the case, uninstall them and install versions that meet these requirements.
Python Download: https://www.python.org/downloads/release/python-398/
Node Download: https://nodejs.org/en/download
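For example, output along these lines would satisfy both checks (the exact versions shown are only illustrative):

```
C:\>python -V
Python 3.9.8

C:\>node --version
v18.15.0
```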
This is a requirement. Install Visual Studio (latest version). VS Download: https://visualstudio.microsoft.com/downloads/ IMPORTANT: While installing, select these options (the Python, Node.js, and C++ development workloads).
Proceed to download, sign in to your Microsoft account on VS and make sure all your Visual Studio tools are up to date.
This is what fixed it for me.
Open Command Prompt as Administrator.
Type in:
npm install node-pty
Open Command Prompt as Administrator. (Make sure you are in CMD and not PowerShell.)
Type in:
npx dalai alpaca install 7B
npx dalai llama install 7B
(or whatever other model you need, like 13B, 30B, etc.)
DO NOT interrupt the install in any way. Otherwise the system may throw tantrums.
Type in command prompt:
npx dalai serve
It should begin at http://localhost:3000/
DO NOT open the link (http://localhost:3000/) in your normal browser. ONLY USE INCOGNITO. This is because the program seems to have problems when your browser has an active history or something like that (still not entirely sure why).
When you open the WebUI, select alpaca.7B from the model list (even though it's already selected, click and re-select it)
To test if the prompts are working, don't change any settings and type in a small and easy prompt such as:
today's date is
(make sure to delete the template prompt before typing this in)
Depending on your processing power, you should get an output within 2 minutes at most (Ideally within 30 seconds)
If this works, you can now tweak settings and prompts to your liking :)
If option 1 doesn't work, just use a Linux computer. The process is MUCH simpler (just 2 lines of code, shown below) and doesn't throw any weird errors. I was able to get it running on my low-powered Linux laptop within minutes.
If none of the previous options work, pray to the UwU Gods for a miracle and try again.
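(For reference, the two lines on Linux are presumably just the same dalai commands as above, without any of the Windows preparation:)

```
npx dalai alpaca install 7B
npx dalai serve
```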
> So you were able to get it running in Linux? I've tried Win 10 but I'm spinning up an Ubuntu 20.04 to try.
Yes, I was able to run it on Linux (a Debian/Ubuntu-based distro: ZorinOS). I've made a dual boot, and I think I will test it on Windows right now with the 2 solutions presented here. Then I will report :)
Why dual boot? Wouldn't it be easier to use a VM under Hyper-V? Unless memory is the issue.
@Atharva2628 This did not work for me. Same error.
@jesko42 Your solution also does not work for me.
@pdavis68 The laptop only has 4 weak cores. I wanted to stack as little load as possible; the CPU hits 100% really fast in Windows.
I don't know why I keep forgetting about Docker. I used Docker (WSL) to get it running on that machine, but it's reeeeeeeally slow. On Linux it's much faster.
I'm going to be installing it on Ubuntu 20.04 later this evening. It's going to be a VM on a Win10 host. I will report back on that install.
Hey all, to NOT be misunderstood: I told you "The crashes appear because of programming errors." My "solution" only SHOWS whether you have too little memory, either by "assert"ing the error in debug builds or by writing out a line of text and exiting in that case!!!
I cannot help if you have too little memory. What I find questionable, in the end, is why there is NO USE of virtual memory... I could investigate here, but not yet, as I'm currently very busy ;-) sorry
All good, I know that. But after compiling I got some errors and couldn't even start dalai. (Probably I made a mistake, but I don't really think it's related to that, because the Docker image works... so it has to be something else. And others say that they have enough RAM in any case.)
Hello Dalai community and developers :)
Short specs: Mini Laptop:
Software versions:
Story:
Expected behavior:
Actual behavior:
I started dalai with "npx -dd dalai serve" to get more info. Here is what the CMD window says:
This is the browser output in Firefox with debug enabled:
Debug log
Is my hardware too weak? Any hints? I don't get any error messages, so I don't know what to do either... Can someone please tell me how I can troubleshoot further?
(: