V-Sekai / godot-whisper

An GDExtension addon for the Godot Engine that enables realtime audio transcription, supports OpenCL for most platforms, Metal for Apple devices, and runs on a separate thread.
MIT License
65 stars 6 forks source link

Run Whisper in its own thread to not block the main thread. #32

Closed aiaimimi0920 closed 9 months ago

aiaimimi0920 commented 9 months ago

31

Run Whisper in its own thread to not block the main thread Add test demo Use Engine::get_singleton() Just load res file path,not load model

fire commented 9 months ago

I noticed that this codes uses stl versions of godot engine primitives methods, like std::mutex vs godot::mutex. threads etc. Thoughts?

aiaimimi0920 commented 9 months ago

Yes, using std:: thread,

Do you want me to modify it to godot:: thread?

Ughuuu commented 9 months ago

Yes, using std:: thread,

Do you want me to modify it to godot:: thread?

I would say to investigate and see, if it's not complicated use the godot ones.

aiaimimi0920 commented 9 months ago

Thank you for your suggestion. I will try to change this PR

aiaimimi0920 commented 9 months ago
  1. I have modified the code according to the modification suggestions
  2. Please note that the current project cannot run. I replaced std:: thread with godot:: thread, and there was an error during the actual code execution:
    WARNING: A Thread object is being destroyed without its completion having been realized.
    Please call wait_to_finish() on it to ensure correct cleanup.
     at: Thread::~Thread (core\os\thread.cpp:104)

The entire project will be frozen, and I don't know why. I need help to solve it

aiaimimi0920 commented 9 months ago

Added a review. One concern might be that if people don't want it on the thread, eg if they just want a function that receives audio, with this it won't work anymore. But that's ok, I don't think anyone would have that use case that I know of. But it would be cool if they would have, maybe have on the singleton a function that just does whisper transcribe without the thread using a full audio.

I understand what you mean. If you want to obtain this feature, perhaps it would be more reasonable to implement it in another PR, otherwise this PR would need to be filled with too much content

fire commented 9 months ago

singleton a function that just does whisper transcribe without the thread using a full audio.

We talked about this being ok on a different pr.

Ughuuu commented 9 months ago

I am getting a crash in mutex, investigating why.

aiaimimi0920 commented 9 months ago

61a6697

This submission addresses issues other than using godot:: thread, and I have tested it on Windows. If the crash caused by godot:: thread cannot be resolved, I can roll it back to this version

Ughuuu commented 9 months ago

Interesting. I say we change back then. We might figure our later why it doesnt work with Godot thread

Ughuuu commented 9 months ago

Revert it and then test it again and should be good maybe. Ill test it locally after also.

aiaimimi0920 commented 9 months ago

Interesting. I say we change back then. We might figure our later why it doesnt work with Godot thread

It's not necessarily a problem caused by the Godot thread, it may be something wrong with the way I use it.

aiaimimi0920 commented 9 months ago

https://github.com/V-Sekai/godot-whisper/assets/153103332/c624ef9a-06e5-42ff-a80d-5a94d2ef654b

I have switched to the version of std:: thread. If you are interested in making godot:: thread available, You can roll back this commit: https://github.com/V-Sekai/godot-whisper/pull/32/commits/2f2e2f6e27faae9242c5c9183f174b3417f9a37b

Ughuuu commented 9 months ago

Checking now locally.

Ughuuu commented 9 months ago

I'm getting a crash on Mac:

Godot Engine v4.2.1.stable.official.b09f793f5 - https://godotengine.org
Vulkan API 1.2.231 - Forward+ - Using Vulkan Device #0: Apple - Apple M1

  ================================================================
  handle_crash: Program crashed with signal 11
  Engine version: Godot Engine v4.2.1.stable.official (b09f793f564a6c95dc76acc654b390e68441bd01)
  Dumping the backtrace. Please include this when reporting the bug to the project developer.
  [1] 1   libsystem_platform.dylib            0x0000000186b95a24 _sigtramp + 56
  [2] SpeechToText::run()
  [3] SpeechToText::run()
  [4] decltype(*std::declval<SpeechToText*>().*std::declval<void (SpeechToText::*)()>()()) std::__1::__invoke[abi:v160006]<void (SpeechToText::*)(), SpeechToText*, void>(void (SpeechToText::*&&)(), SpeechToText*&&)
  [5] void std::__1::__thread_execute[abi:v160006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (SpeechToText::*)(), SpeechToText*, 2ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (SpeechToText::*)(), SpeechToText*>&, std::__1::__tuple_indices<2ul>)
  [6] void* std::__1::__thread_proxy[abi:v160006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (SpeechToText::*)(), SpeechToText*>>(void*)
  [7] 7   libsystem_pthread.dylib             0x0000000186b66034 _pthread_start + 136
--- Debugging process stopped ---
Ughuuu commented 9 months ago

Seems it's accesing this->context_instance but it's null. Oh, I think my model is empty. In this case it should not crash either way. Checking with correct model.

Ughuuu commented 9 months ago

So one change would be in the run function to make it no-op and just wait until the context is created. Another thing, if I set a model, it crashes:

5   libgodot_whisper.macos.template_debug.dev.universal        0x11b03fd1c ggml_backend_metal_supports_family + 44 (ggml-metal.m:1762)
6   libgodot_whisper.macos.template_debug.dev.universal        0x11b03fd1c ggml_backend_metal_supports_family + 44 (ggml-metal.m:1762)
7   libgodot_whisper.macos.template_debug.dev.universal        0x11af9b554 whisper_backend_init(whisper_context_params const&) + 160 (whisper.cpp:1081)
8   libgodot_whisper.macos.template_debug.dev.universal        0x11afa09dc whisper_model_load(whisper_model_loader*, whisper_context&) + 17260 (whisper.cpp:1519)
9   libgodot_whisper.macos.template_debug.dev.universal        0x11af9c430 whisper_init_with_params_no_state + 76 (whisper.cpp:3302)
10  libgodot_whisper.macos.template_debug.dev.universal        0x11af9c5f0 whisper_init_from_buffer_with_params_no_state + 180 (whisper.cpp:3293)
11  libgodot_whisper.macos.template_debug.dev.universal        0x11afa17d0 whisper_init_from_buffer_with_params + 56 (whisper.cpp:3330)
12  libgodot_whisper.macos.template_debug.dev.universal        0x11af1b3f0 SpeechToText::load_model() + 224 (speech_to_text.cpp:363)
13  libgodot_whisper.macos.template_debug.dev.universal        0x11af1b2d8 SpeechToText::set_language_model(godot::Ref<WhisperResource>) + 48 (speech_to_text.cpp:351)
14  libgodot_whisper.macos.template_debug.dev.universal        0x11af2efc8 void godot::call_with_variant_args_helper<godot::_gde_UnexistingClass, godot::Ref<WhisperResource>, 0ul>(godot::_gde_UnexistingClass*, void (godot::_gde_UnexistingClass::*)(godot::Ref<WhisperResource>), godot::Variant const**, GDExtensionCallError&, IndexSequence<0ul>) + 524 (binder_common.hpp:237)
15  libgodot_whisper.macos.template_debug.dev.universal        0x11af2ed00 void godot::call_with_variant_args_dv<godot::_gde_UnexistingClass, godot::Ref<WhisperResource>>(godot::_gde_UnexistingClass*, void (godot::_gde_UnexistingClass::*)(godot::Ref<WhisperResource>), void const* const*, int, GDExtensionCallError&, std::__1::vector<godot::Variant, std::__1::allocator<godot::Variant>> const&) + 740 (binder_common.hpp:311)
16  libgodot_whisper.macos.template_debug.dev.universal        0x11af2e680 godot::MethodBindT<godot::Ref<WhisperResource>>::call(void*, void const* const*, long long, GDExtensionCallError&) const + 104 (method_bind.hpp:323)
17  libgodot_whisper.macos.template_debug.dev.universal        0x11b048c78 godot::MethodBind::bind_call(void*, void*, void const* const*, long long, void*, GDExtensionCallError*) + 100 (method_bind.cpp:96)
18  Godot                                  0x10879cfbc 0x104a
Ughuuu commented 9 months ago

Also, there are places where you do whisper_free(context_instance) without a lock, so eg:


void SpeechToText::load_model() {
    whisper_free(context_instance);
    if (model.is_null()) {
        return;
    }
    PackedByteArray data = model->get_content();
    if (data.is_empty()) {
        return;
    }
    context_instance = whisper_init_from_buffer_with_params((void *)(data.ptr()), data.size(), context_parameters);
    UtilityFunctions::print(whisper_print_system_info());
}

If the run thread runs and this frees the instance(for some reason someone changes model at runtime) it'll crash. We should probably lock most things that are used in the thread.

Ughuuu commented 9 months ago

Interesting, the crash I am getting is because I was never actually testing with useGPU true it seems on Mac. So when that is enabled that didn't actually work. So it's not related to this PR.

Edit:

No, it seems I still get crash.

Ughuuu commented 9 months ago

Fixed the crash by adding a check. Also made errors send to godot and print them.

Ughuuu commented 9 months ago

It seems the issue that it wasn't loading the model was my fault, I wasn't building the debug build correctly.

Ughuuu commented 9 months ago

Thanks for everything @aiaimimi0920

aiaimimi0920 commented 9 months ago

Thank you for your help and guidance @fire @Ughuuu