abdeladim-s / pyllamacpp

Python bindings for llama.cpp
https://abdeladim-s.github.io/pyllamacpp/
MIT License

Illegal Instruction (core dumped) even after disabling AVX2 and FMA #18

Open CyberSinister opened 1 year ago

CyberSinister commented 1 year ago

Hi, I'm very new to all of this and to pyllamacpp, so I'm sorry in advance if the details in this issue aren't up to par, but I've been having issues when running: `python -c 'from pyllamacpp.model import Model'`

I know this has something to do with my CPU, and I've also followed this guide exactly: https://github.com/nomic-ai/pygpt4all/issues/71. I have an older server machine with two Intel Xeon X5670 CPUs.

How do I figure out what's going on and how do I fix it?
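One way to start diagnosing an "Illegal instruction" crash is to check which SIMD instruction sets the CPU actually advertises; as far as I know the Xeon X5670 is a Westmere-era part that predates AVX entirely, so a build compiled with any AVX code would crash on it. A minimal Linux-only sketch (reads /proc/cpuinfo):

```python
# Sketch: check which common SIMD flags the CPU advertises (Linux-only).
# Useful for diagnosing "Illegal instruction" crashes from a binary built
# with instruction sets the CPU does not support.

def simd_support(cpuinfo_text: str) -> dict:
    """Return which of the common SIMD extensions appear in the flags line."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            break
    return {ext: ext in flags for ext in ("sse4_2", "avx", "avx2", "fma", "f16c")}

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(simd_support(f.read()))
    except FileNotFoundError:
        pass  # not a Linux system with procfs
```

If `avx` itself is missing from the output, then disabling only AVX2 and FMA is not enough; every AVX-era option has to be off.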

abdeladim-s commented 1 year ago

Hi @CyberSinister, it's OK. I know this illegal instruction error is very annoying!

CyberSinister commented 1 year ago

Hey @abdeladim-s , thank you so much for such a quick and understanding response, I really appreciate it. Here are the details you asked for:

I later thought I should leave AVX2 and FMA off and keep only AVX turned on, but that didn't work either.

Another thing I must add is that I'm using Ubuntu on Hyper-V. I'm sorry I didn't tell you this earlier, I failed to realize that this could be relevant.

abdeladim-s commented 1 year ago

You are welcome @CyberSinister. I hope we will find a solution together.

So you are using Windows Hyper-V; why not just use WSL? It is more efficient. Also, have you tried using just Windows without any VM?

So what I want you to do is to try llama.cpp first. I will explain to you how:
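The build instructions from the original comment aren't shown here, but they were presumably along these lines. The `LLAMA_*` names below are the CMake options llama.cpp used around this time, so treat this as a sketch and double-check the option names against your checkout:

```shell
# Sketch: build llama.cpp with all AVX-era instruction sets disabled.
# LLAMA_AVX / LLAMA_AVX2 / LLAMA_FMA / LLAMA_F16C are the CMake option
# names llama.cpp used at the time; verify against your checkout.
CMAKE_FLAGS="-DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF"

# The clone/build steps are commented out so this sketch is side-effect free:
# git clone https://github.com/ggerganov/llama.cpp
# cd llama.cpp && mkdir -p build && cd build
# cmake .. $CMAKE_FLAGS
# cmake --build . --config Release
echo "$CMAKE_FLAGS"
```

After building, running the resulting `main` binary directly tells you whether the crash comes from the CPU instructions or from the Python bindings.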

Let me know if you find any issues.

CyberSinister commented 1 year ago

Hey @abdeladim-s , thank you once again.

TL;DR: It works without having to make any changes to the CMakeLists.txt.

Yes, I'm using Windows Hyper-V, because that's where I want to deploy this. I have multiple VMs running Ubuntu for different tasks, and I wanted to create a Flask wrapper around GPT4All so I can build my own API with it. I did something similar a while ago and am still using it with OpenAI's API, which works fine but was getting too expensive, so now I want to host something of my own and not go broke in an attempt to learn something new xD. I'm a full-stack developer, but I'm really curious and enjoy working with LLMs. I haven't yet tried plain Windows without a VM because I want to deploy directly on the server and use it there.
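For the Flask wrapper idea: the core of such an API is just "accept a JSON prompt, call the model, return a JSON completion". A stdlib-only sketch of that shape, where `generate()` is a placeholder for whatever model binding ends up being used (it is not a real pyllamacpp call):

```python
# Sketch of the request/response core of a prompt-completion API.
# generate() is a placeholder; in a real wrapper it would call the model,
# e.g. something like Model(...).generate(prompt) in pyllamacpp.
import json

def generate(prompt: str) -> str:
    # Placeholder model call for illustration only.
    return f"echo: {prompt}"

def handle_request(body: bytes) -> bytes:
    """Parse a JSON request {"prompt": ...} and return a JSON completion."""
    prompt = json.loads(body)["prompt"]
    return json.dumps({"completion": generate(prompt)}).encode()
```

A Flask route would then just pass `request.data` through `handle_request`; keeping the model logic out of the route makes it easy to test without a server running.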

Anyways, so I did what you asked and here's what I have for you sir:

I later realized that I can't just directly use models downloaded from the gpt4all website (I might be wrong, but I tried 4 different models), so I downloaded an already-converted model from http://gpt4all.io/models/ggml-gpt4all-l13b-snoozy.bin, which ACTUALLY worked.

But it was too slow: I gave it a small prompt, "What are you?", and after 25 minutes all it had written was "I am an AI-language mod", and it was still trying to complete the response. I tried increasing the number of threads by passing --threads 4 when executing the main binary, but that didn't make much of a difference.
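On picking `--threads`: a common rule of thumb is to use the number of physical cores rather than logical CPUs. A small sketch; note that `os.cpu_count()` reports logical CPUs, and halving it (assuming hyper-threading) is a heuristic worth tuning, especially inside a VM where the vCPU count is whatever Hyper-V was given:

```python
# Heuristic for a llama.cpp --threads value: physical cores, approximated
# as half the logical CPU count on typical hyper-threaded Xeons.
import os

def suggested_threads() -> int:
    """Return a starting --threads value to tune from (never below 1)."""
    logical = os.cpu_count() or 1
    return max(1, logical // 2)

print(f"try: ./main -m model.bin --threads {suggested_threads()}")
```

If the VM only has a few vCPUs assigned, giving it more in Hyper-V will matter far more than this heuristic.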

I guess it works, but I don't know how to speed it up. I'm guessing a smaller model will be faster, but do I need to add a GPU if I want to work with 13B models? Thank you for getting me this far; I've learned a bit more than I knew yesterday and wouldn't have gotten here without your help. Are we now ready for the next step?

I'm sorry if such a long response wasn't what you were expecting, but I thought it would help others avoid the same mistakes as me. Then again, not everyone's as dumb as me.

abdeladim-s commented 1 year ago

hey @CyberSinister,

Thank you for the detailed response, it will certainly help others running into the same issue. And you are not dumb, it's just a learning curve that we all had to go through first; keep up the patience and the hard work and enjoy the process :smile:

That being said, let us now move to the next step:

If it didn't work, let me know what issue you ran into.