ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Bug: Unloading model on Android crashes app #9946

Open LandonPatmore opened 1 week ago

LandonPatmore commented 1 week ago

What happened?

Calling unload() anywhere in the app causes a native crash, and I'm not sure why. I've confirmed this in both our app and the sample app, where I added a clear-model button that calls:

  viewModelScope.launch {
      try {
          llamaAndroid.unload()
      } catch (exc: IllegalStateException) {
          messages += exc.message!!
      }
  }

Which underneath is calling:

    suspend fun unload() {
        withContext(runLoop) {
            when (val state = threadLocalState.get()) {
                is State.Loaded -> {
                    free_context(state.context)
                    free_model(state.model)
                    free_batch(state.batch)
                    free_sampler(state.sampler);

                    threadLocalState.set(State.Idle)
                }
                else -> {}
            }
        }
    }

Currently running on a Pixel 8, but I don't believe the device type is the cause. This means that once you load one model, you cannot load another until you close the app.

Name and Version

Android: 14, build number AP2A.240905.003.B1
Mac OS: 14.7 (23H124)
Chip: Apple M3 Max

Commit head: 9e041024481f6b249ab8918e18b9477f873b5a5e

What operating system are you seeing the problem on?

Other? (Please let us know in description)

Relevant log output

2024-10-18 15:20:34.836 23621-23673 libc                    com.example.llama                    A  Pointer tag for 0xc9cf000001000115 was truncated, see 'https://source.android.com/devices/tech/debug/tagged-pointers'.
2024-10-18 15:20:34.836 23621-23673 libc                    com.example.llama                    A  Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 23673 (Llm-RunLoop), pid 23621 (m.example.llama)
---------------------------- PROCESS STARTED (23678) for package com.example.llama ----------------------------
2024-10-18 15:20:35.074 23676-23676 DEBUG                   crash_dump64                         A  Cmdline: com.example.llama
2024-10-18 15:20:35.074 23676-23676 DEBUG                   crash_dump64                         A  pid: 23621, tid: 23673, name: Llm-RunLoop  >>> com.example.llama <<<
2024-10-18 15:20:35.074 23676-23676 DEBUG                   crash_dump64                         A        #02 pc 00000000000b6b40  /data/app/~~MHFewHzDtnmuDj2SY3VIVg==/com.example.llama-o1LZcctxEpFb_Jd-Cq48BQ==/base.apk!libllama.so (offset 0x28d000) (llama_batch_free+88) (BuildId: b55c9505ae3978594a8a737e646f32945267196f)
2024-10-18 15:20:35.074 23676-23676 DEBUG                   crash_dump64                         A        #03 pc 000000000001d538  /data/app/~~MHFewHzDtnmuDj2SY3VIVg==/com.example.llama-o1LZcctxEpFb_Jd-Cq48BQ==/base.apk!libllama-android.so (offset 0x1bc9000) (Java_android_llama_cpp_LLamaAndroid_free_1batch+64) (BuildId: 79fce581486052a056b338555c0fd3f51c40c5fc)
2024-10-18 15:20:35.074 23676-23676 DEBUG                   crash_dump64                         A        #09 pc 0000000000002588  /data/app/~~MHFewHzDtnmuDj2SY3VIVg==/com.example.llama-o1LZcctxEpFb_Jd-Cq48BQ==/base.apk (android.llama.cpp.LLamaAndroid.access$free_batch+0)
2024-10-18 15:20:35.074 23676-23676 DEBUG                   crash_dump64                         A        #14 pc 000000000000204c  /data/app/~~MHFewHzDtnmuDj2SY3VIVg==/com.example.llama-o1LZcctxEpFb_Jd-Cq48BQ==/base.apk (android.llama.cpp.LLamaAndroid$unload$2.invokeSuspend+0)
2024-10-18 15:20:35.074 23676-23676 DEBUG                   crash_dump64                         A        #25 pc 0000000000001d2c  /data/app/~~MHFewHzDtnmuDj2SY3VIVg==/com.example.llama-o1LZcctxEpFb_Jd-Cq48BQ==/base.apk (android.llama.cpp.LLamaAndroid$runLoop$1$1.invoke+0)
2024-10-18 15:20:35.074 23676-23676 DEBUG                   crash_dump64                         A        #30 pc 0000000000001cec  /data/app/~~MHFewHzDtnmuDj2SY3VIVg==/com.example.llama-o1LZcctxEpFb_Jd-Cq48BQ==/base.apk (android.llama.cpp.LLamaAndroid$runLoop$1$1.invoke+0)
2024-10-18 15:20:35.074 23676-23676 DEBUG                   crash_dump64                         A        #35 pc 0000000000283394  /data/app/~~MHFewHzDtnmuDj2SY3VIVg==/com.example.llama-o1LZcctxEpFb_Jd-Cq48BQ==/base.apk (kotlin.concurrent.ThreadsKt$thread$thread$1.run+0)

If you need more info, let me know. Thanks for the library; it works great otherwise!

slaren commented 1 week ago

I am assuming that you are working with the android example. From a quick glance at llama-android.cpp, there are some issues with Java_android_llama_cpp_LLamaAndroid_free_1batch. Since the example allocates the batch itself in Java_android_llama_cpp_LLamaAndroid_new_1batch, it should also free it itself rather than calling the llama.cpp library function, which may be linked against a different version of the standard library. Additionally, it does not free the llama_batch object itself, which causes a leak. I am not sure if that's the cause of your crashes, but it's something you could look into.
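
To illustrate the suggestion above (the crash trace does show llama_batch_free being entered from Java_android_llama_cpp_LLamaAndroid_free_1batch), here is a minimal sketch of keeping allocation and deallocation in the same module. It is not the code from llama-android.cpp: demo_batch, demo_batch_new, and demo_batch_delete are hypothetical names, the struct is only a simplified stand-in for llama_batch, and the capacity field is an assumption added so that the free path knows how many seq_id rows exist without relying on a sentinel.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for llama_batch; the real struct is defined in llama.h.
struct demo_batch {
    int32_t   n_tokens;  // number of tokens currently stored
    int32_t   capacity;  // number of rows allocated (assumption: stored for the free path)
    int32_t  *token;
    int32_t  *pos;
    int32_t  *n_seq_id;
    int32_t **seq_id;
    int8_t   *logits;
};

// Releases everything demo_batch_new allocated, including the struct itself.
// Safe to call with nullptr or with a partially constructed batch.
void demo_batch_delete(demo_batch *b) {
    if (b == nullptr) return;
    if (b->seq_id != nullptr) {
        for (int32_t i = 0; i < b->capacity; ++i) {
            std::free(b->seq_id[i]);  // free(nullptr) is a no-op
        }
    }
    std::free(b->seq_id);
    std::free(b->logits);
    std::free(b->n_seq_id);
    std::free(b->pos);
    std::free(b->token);
    delete b;  // matches the `new` in demo_batch_new
}

// Allocates a batch with room for n_tokens tokens and n_seq_max sequence ids
// per token. Returns nullptr if any array allocation fails.
demo_batch *demo_batch_new(int32_t n_tokens, int32_t n_seq_max) {
    assert(n_tokens > 0 && n_seq_max > 0);
    const std::size_t n = static_cast<std::size_t>(n_tokens);
    const std::size_t m = static_cast<std::size_t>(n_seq_max);

    demo_batch *b = new demo_batch{};  // value-initialized: all pointers are null
    b->capacity = n_tokens;
    b->token    = static_cast<int32_t *>(std::calloc(n, sizeof(int32_t)));
    b->pos      = static_cast<int32_t *>(std::calloc(n, sizeof(int32_t)));
    b->n_seq_id = static_cast<int32_t *>(std::calloc(n, sizeof(int32_t)));
    b->seq_id   = static_cast<int32_t **>(std::calloc(n, sizeof(int32_t *)));
    b->logits   = static_cast<int8_t *>(std::calloc(n, sizeof(int8_t)));

    bool ok = b->token && b->pos && b->n_seq_id && b->seq_id && b->logits;
    for (std::size_t i = 0; ok && i < n; ++i) {
        b->seq_id[i] = static_cast<int32_t *>(std::calloc(m, sizeof(int32_t)));
        ok = b->seq_id[i] != nullptr;
    }
    if (!ok) {
        demo_batch_delete(b);
        return nullptr;
    }
    return b;
}
```

In the JNI layer, the same idea would mean that free_1batch releases each array with the deallocator that matches how new_1batch obtained it, and then deletes the batch object, instead of handing the pointer to a library function that may make different assumptions about how the memory was allocated.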