Mobile-Artificial-Intelligence / maid

Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.
MIT License

Still running in CPU #604

Open mmw1984 opened 1 month ago

mmw1984 commented 1 month ago

[Screenshot] Sorry that the screenshot is not in English.

mmw1984 commented 1 month ago

[Screenshot]
Device: OnePlus Ace 2 Pro
Chipset: Snapdragon 8 Gen 2
System: Android 14
RAM: 16 GB

danemadsen commented 3 weeks ago

fixed

ILOVEPIE commented 3 weeks ago

No, it's not. The new changes include the x86_64 Linux build of the Vulkan SDK, not the correct ARM version, so maid_llm doesn't compile with Vulkan support.

danemadsen commented 3 weeks ago

Aw, that's no fun.

danemadsen commented 2 weeks ago

@ILOVEPIE any idea how to install that in GitHub Actions?

ILOVEPIE commented 2 weeks ago

> @ILOVEPIE any idea how to install that in GitHub Actions?

The only thing I can think of off the top of my head is installing the Dart bindings for Vulkan, but I'm not sure if that'll work. I'll do a little more research.

ILOVEPIE commented 2 weeks ago

Apparently the Vulkan headers and libraries come as part of the Android SDK. I need to figure out where Flutter puts that so that you can adjust your CMake options to point to that.

danemadsen commented 2 weeks ago

No, because the GitLab pipeline compiles it with Vulkan fine.

Download this build here: https://gitlab.com/mobile-artificial-intelligence/maid/-/jobs/7663138914

It's an issue with the GitHub Action that's causing it to not detect the Vulkan headers.

danemadsen commented 2 weeks ago

Never mind, it's not working there either.

ILOVEPIE commented 2 weeks ago

OK, I figured it out... probably. The docs on this are terrible, but I have an idea now. I'll test it on my fork, then PR the changes.

danemadsen commented 2 weeks ago

Ok, much appreciated if you can get it working.

ILOVEPIE commented 2 weeks ago

> Ok, much appreciated if you can get it working.

Assuming I don't need to make any other minor adjustments to the build file, and the binary doesn't crash, I should be able to send that PR over shortly. I've already got it detecting the Android Vulkan library. There were some outdated paths in the CMakeLists, though, which caused the build to fail when Vulkan was available. So, fingers crossed that it doesn't crash when I install and test it.

ILOVEPIE commented 2 weeks ago

Ugh... The Vulkan C++ headers aren't in the NDK, only the C ones.

danemadsen commented 2 weeks ago

You got this bro, I believe in you.

ILOVEPIE commented 2 weeks ago

> You got this bro, I believe in you.

I have some good news and some bad news. I got the Vulkan support compiling. I haven't tested it yet, but I got it to find the headers and library in the compile phase. The problem is that it's going to require a higher minimum version of Android than we currently require: the llama.cpp Vulkan implementation (for the version of llama.cpp we're currently using) requires at least Vulkan 1.1 support (maybe higher), which means Android 14 minimum.

The only solution I can come up with is having both a CPU-based version of the library and a GPU-based version of the library, and we detect if the user has Vulkan support and use the CPU library if they don't. We should also give them an option to toggle GPU acceleration off, because phone SoC manufacturers aren't known for their GPUs and GPU drivers. What I mean is that phones tend to have fairly buggy GPUs, so we should at least have the option to turn the acceleration off.
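The fallback described in this comment can be sketched as a small selection function. This is only an illustration under stated assumptions, not maid's or llama.cpp's actual code: `Backend`, `make_api_version`, `kMinVulkanVersion`, and `choose_backend` are hypothetical names, the 1.1 minimum is the estimate given in the comment, and the caller is assumed to have already queried the device's reported Vulkan API version (or found no Vulkan device at all). The version packing mirrors Vulkan's `VK_MAKE_API_VERSION` with variant 0, so the sketch compiles without the Vulkan headers.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical names throughout; none of these come from maid or llama.cpp.
enum class Backend { Cpu, Vulkan };

// Packs a version the way VK_MAKE_API_VERSION does with variant 0:
// major in bits 22..28, minor in bits 12..21, patch in bits 0..11.
constexpr std::uint32_t make_api_version(std::uint32_t major,
                                         std::uint32_t minor,
                                         std::uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

// Assumed minimum, taken from the "at least Vulkan 1.1" estimate in the thread.
constexpr std::uint32_t kMinVulkanVersion = make_api_version(1, 1, 0);

// device_api_version: std::nullopt when no Vulkan device was found,
//                     otherwise the API version the device reports.
// user_allows_gpu:    the proposed settings toggle for GPU acceleration.
inline Backend choose_backend(std::optional<std::uint32_t> device_api_version,
                              bool user_allows_gpu) {
    if (!user_allows_gpu) {
        return Backend::Cpu;  // the user switched acceleration off
    }
    if (!device_api_version.has_value()) {
        return Backend::Cpu;  // no Vulkan support detected
    }
    // Packed versions with variant 0 compare correctly as plain integers.
    return *device_api_version >= kMinVulkanVersion ? Backend::Vulkan
                                                    : Backend::Cpu;
}
```

With this shape, the app would load the GPU build of the library only when `choose_backend` returns `Backend::Vulkan`, and the CPU build in every other case, including when the user has turned the toggle off.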

danemadsen commented 2 weeks ago

> The only solution I can come up with is having both a CPU-based version of the library and a GPU-based version of the library, and we detect if the user has Vulkan support and use the CPU library if they don't.

It's probably best to just ship two releases, one for Vulkan / Android 14 and one for versions below 14. Obviously it would be best to allow the user to switch between GPU / CPU (and eventually NPU), but for now shipping two APKs / bundles is probably a faster solution.

ILOVEPIE commented 2 weeks ago

> > The only solution I can come up with is having both a CPU-based version of the library and a GPU-based version of the library, and we detect if the user has Vulkan support and use the CPU library if they don't.
>
> It's probably best to just ship two releases, one for Vulkan / Android 14 and one for versions below 14. Obviously it would be best to allow the user to switch between GPU / CPU (and eventually NPU), but for now shipping two APKs / bundles is probably a faster solution.

To be honest, I think it's about the same amount of work to do either option. So I'd rather do the more permanent solution.

ILOVEPIE commented 2 weeks ago

Just about to test the Vulkan build.

ILOVEPIE commented 1 week ago

I've been trying to figure out why the Vulkan build is crashing, and I'm not exactly sure. I'm getting some sort of weird illegal signal on ARM. I'm not too familiar with the ARM architecture, so I'm not sure whether this indicates an illegal instruction, some type of illegal register value, or something else.
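One way to narrow down a crash like this is to decode the signal number together with the `si_code` field of `siginfo_t`, which on POSIX systems separates an illegal opcode from, for example, a privileged-register access. The thread does not say which signal was actually raised, so treating it as `SIGILL` is an assumption, and `describe_crash_signal` is a hypothetical helper, not part of maid or llama.cpp. A minimal sketch:

```cpp
#include <signal.h>
#include <string>

// Hypothetical helper: turns a signal number and its siginfo_t::si_code
// into readable text. The constants come from POSIX <signal.h>.
inline std::string describe_crash_signal(int signo, int si_code) {
    switch (signo) {
    case SIGILL:
        switch (si_code) {
        case ILL_ILLOPC: return "SIGILL: illegal opcode";
        case ILL_ILLOPN: return "SIGILL: illegal operand";
        case ILL_ILLADR: return "SIGILL: illegal addressing mode";
        case ILL_ILLTRP: return "SIGILL: illegal trap";
        case ILL_PRVOPC: return "SIGILL: privileged opcode";
        case ILL_PRVREG: return "SIGILL: privileged register";
        case ILL_COPROC: return "SIGILL: coprocessor error";
        case ILL_BADSTK: return "SIGILL: internal stack error";
        default:         return "SIGILL: unrecognized si_code";
        }
    case SIGSEGV: return "SIGSEGV: invalid memory access";
    case SIGBUS:  return "SIGBUS: bus error";
    case SIGFPE:  return "SIGFPE: arithmetic error";
    case SIGABRT: return "SIGABRT: abort() was called";
    default:      return "other signal";
    }
}
```

A handler installed through `sigaction` with the `SA_SIGINFO` flag receives a `siginfo_t*`, whose `si_signo` and `si_code` fields could be passed to a helper like this before the process exits.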

ILOVEPIE commented 1 week ago

I'm going to attempt to make an x86_64 build of the app to see if that will elucidate anything.