Bip-Rep / sherpa

A mobile implementation of llama.cpp
MIT License

Update llama.cpp and move core processing to native code #12

Open dsd opened 1 year ago

dsd commented 1 year ago

Thanks for taking the initiative on Sherpa; I was also curious about the combination of low-end devices, Flutter, and open-source AI, and it was nice to see that you had already been working on this.

It wasn't working on my phone (Samsung Galaxy S10) due to a llama.cpp crash and memory exhaustion, but with the changes made here it works rather well with the new 3B models.

moh21amed commented 1 year ago

Can it run a 3B model on mobile with 3 GB of RAM?

dsd commented 1 year ago

> Can it run a 3B model on mobile with 3 GB of RAM?

Not sure. If you want to try it, there is an APK here. I suspect it won't work, though, because the 3B files I have seen are around 2 GB, and your base OS is probably using at least 1 GB of RAM...

dsd commented 1 year ago

Looks like you have the wrong llama.cpp available under src/. Did you initialize it from git submodules?
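(In case it helps: the submodule is usually pulled in with `git submodule update --init --recursive` from the repository root, assuming llama.cpp is tracked as a git submodule as suggested above.)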

windmaple commented 1 year ago

Somehow my src folder was messed up. I downloaded your src zip file and it works now. Great work, btw!

dsd commented 1 year ago

Also, debug mode is slooow. Remember to run with `flutter run --release`; it will be much faster :)

windmaple commented 1 year ago

I think there is something missing for Mac:

```
flutter: llamasherpa loaded
flutter: MessageNewLineFromIsolate : [isolate 09:02:55] llamasherpa loaded
flutter: filePath : /Volumes/Macintosh HD/Users/wind-test/Desktop/orca-mini-3b.ggmlv3.q4_1.bin
[ERROR:flutter/runtime/dart_isolate.cc(1097)] Unhandled exception:
Invalid argument(s): Failed to lookup symbol 'llamasherpa_start': dlsym(RTLD_DEFAULT, llamasherpa_start): symbol not found
#0      DynamicLibrary.lookup (dart:ffi-patch/ffi_dynamic_library_patch.dart:33:70)
#1      NativeLibrary._llamasherpa_startPtr (package:sherpa/generated_bindings_llamasherpa.dart:41)
#2      NativeLibrary._llamasherpa_startPtr (package:sherpa/generated_bindings_llamasherpa.dart:1)
#3      NativeLibrary._llamasherpa_start (package:sherpa/generated_bindings_llamasherpa.dart:42)
#4      NativeLibrary._llamasherpa_start (package:sherpa/generated_bindings_llamasherpa.dart:1)
#5      NativeLibrary.llamasherpa_start (package:sherpa/generated_bindings_llamasherpa.dart:27)
#6      Lib.binaryIsolate
```

I did not see a similar issue on Linux.

dsd commented 1 year ago

Yeah, I don't have any macOS/iOS experience or devices. Do the official sherpa versions work there? If you want to try you could look at the instructions here under "FFI on macOS and iOS". You will need to build both llamasherpa and llama.cpp as mentioned, via Xcode/Runner.

Then as for this bit:

```dart
nativeApiLib = Platform.isMacOS || Platform.isIOS ? DynamicLibrary.process()
```

The equivalent of this bit is already handled, so you just have to include the C++ code in the build and then it might work.
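For reference, a minimal sketch of how that conditional usually looks once completed. The library name matches the symbol prefix in the log above, but the exact file names are assumptions rather than something taken from sherpa's build scripts:

```dart
import 'dart:ffi';
import 'dart:io';

// Sketch only: on macOS/iOS the native code is linked into the app by
// Xcode/Runner, so symbols are resolved from the running process; on the
// other platforms a shared library is opened by name. The file names below
// are assumptions, not the project's actual artifact names.
DynamicLibrary openLlamaSherpa() {
  if (Platform.isMacOS || Platform.isIOS) {
    return DynamicLibrary.process();
  }
  if (Platform.isWindows) {
    return DynamicLibrary.open('llamasherpa.dll');
  }
  // Android and Linux follow the lib<name>.so convention.
  return DynamicLibrary.open('libllamasherpa.so');
}
```

The generated `NativeLibrary` bindings can then be constructed from whatever this returns instead of hard-coding one platform's library.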

windmaple commented 1 year ago

Yeah, the original version works on Mac, although it crashes if you run the model twice in a row, which is a separate issue.

NandhaKishorM commented 1 year ago

> Yeah, I don't have any macOS/iOS experience or devices. Do the official sherpa versions work there? If you want to try you could look at the instructions here under "FFI on macOS and iOS". You will need to build both llamasherpa and llama.cpp as mentioned, via Xcode/Runner.
>
> Then as for this bit:
>
> `nativeApiLib = Platform.isMacOS || Platform.isIOS ? DynamicLibrary.process()`
>
> The equivalent of this bit is already handled, so you just have to include the C++ code in the build and then it might work.

On Windows it's showing an error that the DLL library is not found. The file is "llamasherpa.dll".
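As an aside, on Windows `DynamicLibrary.open('llamasherpa.dll')` resolves the name through the normal DLL search path, so the DLL (and anything it links against) generally has to sit next to the built executable or be bundled by the Windows runner. A hypothetical workaround sketch, assuming the DLL really is copied alongside the executable:

```dart
import 'dart:ffi';
import 'dart:io';

// Hypothetical sketch: open llamasherpa.dll via an absolute path next to the
// running executable instead of relying on the Windows DLL search path.
// Whether the build actually places the DLL there depends on the
// windows/runner setup, which is an assumption here.
DynamicLibrary openLlamaSherpaOnWindows() {
  final exeDir = File(Platform.resolvedExecutable).parent.path;
  return DynamicLibrary.open('$exeDir\\llamasherpa.dll');
}
```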

kmn1024 commented 10 months ago

Does anyone have benchmarks (tokens/second) for running a 3B or 7B model on any low-end device? @dsd mentioned 3B running "rather well" on an S10 in https://github.com/Bip-Rep/sherpa/pull/12#issue-1784005807; how well is that =) @windmaple also seems to have found success on an unknown device in https://github.com/Bip-Rep/sherpa/pull/12#issuecomment-1619765764.