msqr1 / Vosklet

A speech recognizer that can run on the browser, inspired by vosk-browser
MIT License

Does not work with a different model #2

Closed korabelnikov closed 1 month ago

korabelnikov commented 4 months ago

At the model-creation step (malloc), JS throws an error with no message, only a single number, which looks like the number of bytes it tried to allocate.

Did you build the model with a fixed memory size? Doesn't it allow memory growth?

msqr1 commented 4 months ago

I avoided setting ALLOW_MEMORY_GROWTH together with pthreads because I do a lot of reads and writes from JS. See the last section of https://emscripten.org/docs/porting/pthreads.html?#special-considerations, which covers pthreads and memory growth.

korabelnikov commented 4 months ago

@msqr1 I tried rebuilding with your make script and set 600 MB instead of 300, but the error persists.

The model I used is from vosk-browser:

https://github.com/ccoreilly/vosk-browser/raw/gh-pages/models/vosk-model-small-ru-0.4.tar.gz

korabelnikov commented 4 months ago

[image]

Execution now gets past model creation, but then I get an error involving a pointer to a std::string (I suppose).

msqr1 commented 4 months ago

Without getting my hands dirty, I can't know what is going on. You can try building with -O0, -g3, assertions, and sanitizers to debug. I'll try it later tonight. If you discover anything, please share it so I can improve this!

korabelnikov commented 4 months ago

Good idea, I'll try it right now.

korabelnikov commented 4 months ago
Vosklet.js:418 Aborted(native code called abort())
threadPrintErr @ Vosklet.js:418
abort @ Vosklet.js:903
__abort_js @ Vosklet.js:1959
$abort @ Vosklet.wasm-01f04bca:0x2b12f
$std::__2::__throw_out_of_range[abi:nn180100](char const*) @ Vosklet.wasm-01f04bca:0x2cefbb
$std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>>::__throw_out_of_range[abi:nn180100]() const @ Vosklet.wasm-01f04bca:0x2cefb1
$std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>>::basic_string(std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>> const&, unsigned long, unsigned long, std::__2::allocator<char> const&) @ Vosklet.wasm-01f04bca:0x18069
$std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>>::substr[abi:ne180100](unsigned long, unsigned long) const @ Vosklet.wasm-01f04bca:0x16eda
$genericModel::extractAndLoad(int, int)::$_0::operator()() const @ Vosklet.wasm-01f04bca:0x1590e
$decltype(std::declval<genericModel::extractAndLoad(int, int)::$_0&>()()) std::__2::__invoke[abi:ne180100]<genericModel::extractAndLoad(int, int)::$_0&>(genericModel::extractAndLoad(int, int)::$_0&) @ Vosklet.wasm-01f04bca:0x157d4
$void std::__2::__invoke_void_return_wrapper<void, true>::__call[abi:ne180100]<genericModel::extractAndLoad(int, int)::$_0&>(genericModel::extractAndLoad(int, int)::$_0&) @ Vosklet.wasm-01f04bca:0x15787
$std::__2::__function::__alloc_func<genericModel::extractAndLoad(int, int)::$_0, std::__2::allocator<genericModel::extractAndLoad(int, int)::$_0>, void ()>::operator()[abi:ne180100]() @ Vosklet.wasm-01f04bca:0x15612
$std::__2::__function::__func<genericModel::extractAndLoad(int, int)::$_0, std::__2::allocator<genericModel::extractAndLoad(int, int)::$_0>, void ()>::operator()() @ Vosklet.wasm-01f04bca:0x155d9
$std::__2::__function::__value_func<void ()>::operator()[abi:ne180100]() const @ Vosklet.wasm-01f04bca:0x1387d
$std::__2::function<void ()>::operator()() const @ Vosklet.wasm-01f04bca:0x137d5
$genericModel::extractAndLoad(int, int)::$_1::operator()() const @ Vosklet.wasm-01f04bca:0x13787
$decltype(std::declval<genericModel::extractAndLoad(int, int)::$_1>()()) std::__2::__invoke[abi:ne180100]<genericModel::extractAndLoad(int, int)::$_1>(genericModel::extractAndLoad(int, int)::$_1&&) @ Vosklet.wasm-01f04bca:0x13732
$void std::__2::__thread_execute[abi:ne180100]<std::__2::unique_ptr<std::__2::__thread_struct, std::__2::default_delete<std::__2::__thread_struct>>, genericModel::extractAndLoad(int, int)::$_1>(std::__2::tuple<std::__2::unique_ptr<std::__2::__thread_struct, std::__2::default_delete<std::__2::__thread_struct>>, genericModel::extractAndLoad(int, int)::$_1>&, std::__2::__tuple_indices<...>) @ Vosklet.wasm-01f04bca:0x12f77
$void* std::__2::__thread_proxy[abi:ne180100]<std::__2::tuple<std::__2::unique_ptr<std::__2::__thread_struct, std::__2::default_delete<std::__2::__thread_struct>>, genericModel::extractAndLoad(int, int)::$_1>>(void*) @ Vosklet.wasm-01f04bca:0x12b03
invokeEntryPoint @ Vosklet.js:1743
handleMessage @ Vosklet.js:516
Vosklet.js:418 worker: onmessage() captured an uncaught exception: RuntimeError: Aborted(native code called abort())
threadPrintErr @ Vosklet.js:418
handleMessage @ Vosklet.js:543
Vosklet.js:418 RuntimeError: Aborted(native code called abort())
    at abort (http://127.0.0.1:5000/static/Vosklet.js:922:11)
    at __abort_js (http://127.0.0.1:5000/static/Vosklet.js:1959:7)
    at Vosklet.wasm.abort (wasm://wasm/Vosklet.wasm-01f04bca:wasm-function[1146]:0x2b12f)
    at Vosklet.wasm.std::__2::__throw_out_of_range[abi:nn180100](char const*) (wasm://wasm/Vosklet.wasm-01f04bca:wasm-function[7047]:0x2cefbb)
    at Vosklet.wasm.std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>>::__throw_out_of_range[abi:nn180100]() const (wasm://wasm/Vosklet.wasm-01f04bca:wasm-function[7046]:0x2cefb1)
    at Vosklet.wasm.std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>>::basic_string(std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>> const&, unsigned long, unsigned long, std::__2::allocator<char> const&) (wasm://wasm/Vosklet.wasm-01f04bca:wasm-function[449]:0x18069)
    at Vosklet.wasm.std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>>::substr[abi:ne180100](unsigned long, unsigned long) const (wasm://wasm/Vosklet.wasm-01f04bca:wasm-function[422]:0x16eda)
    at Vosklet.wasm.genericModel::extractAndLoad(int, int)::$_0::operator()() const (wasm://wasm/Vosklet.wasm-01f04bca:wasm-function[411]:0x1590e)
    at Vosklet.wasm.decltype(std::declval<genericModel::extractAndLoad(int, int)::$_0&>()()) std::__2::__invoke[abi:ne180100]<genericModel::extractAndLoad(int, int)::$_0&>(genericModel::extractAndLoad(int, int)::$_0&) (wasm://wasm/Vosklet.wasm-01f04bca:wasm-function[410]:0x157d4)
    at Vosklet.wasm.void std::__2::__invoke_void_return_wrapper<void, true>::__call[abi:ne180100]<genericModel::extractAndLoad(int, int)::$_0&>(genericModel::extractAndLoad(int, int)::$_0&) (wasm://wasm/Vosklet.wasm-01f04bca:wasm-function[408]:0x15787)

But the pointer to the tar is correct; I compared the memory view with an offline hex viewer. [image]

korabelnikov commented 4 months ago

According to the assembly, it fails at std::string::substr, so I think it's here: [image]

korabelnikov commented 4 months ago

I figured it out: I repacked the tarball provided by vosk-browser (I removed the top-level hidden file and all the other hidden files in it).

msqr1 commented 4 months ago

So the issue was that the model files weren't tgz'ed in the top-level directory, right?

korabelnikov commented 4 months ago

Not exactly. The issue is that a line in your code assumes every file in the tarball has a '/' in its path and then cuts the dirname out. But the tarball from vosk-browser contains some hidden files at the top level. Searching for '/' in those paths returns -1 (that is, std::string::npos), and then substr(path, -1, len(path)) crashes the program.
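The failure mode described above can be sketched as follows. This is a minimal illustration, not Vosklet's actual code: the helper name stripTopLevelDir and the example paths are hypothetical, and it assumes the intent is to drop the leading directory component from each tar entry path. The only point is that the result of find() gets checked before it reaches substr().

```cpp
#include <string>

// Hypothetical helper (not taken from the Vosklet source): removes the
// leading directory component from a tar entry path.
// Returns an empty string when the path contains no '/', so the caller
// can skip such entries (for example, top-level hidden files).
std::string stripTopLevelDir(const std::string& path) {
    const std::string::size_type slash = path.find('/');
    if (slash == std::string::npos) {
        // find() reports "not found" as npos, the largest size_t value
        // (it reads as -1 when viewed as a signed number). Using npos
        // as the start position of substr() throws std::out_of_range,
        // which matches the __throw_out_of_range frame in the trace.
        return {};
    }
    // slash + 1 is at most path.size(), which substr() accepts.
    return path.substr(slash + 1);
}
```

With this guard, a path such as "vosk-model-small-ru-0.4/am/final.mdl" yields "am/final.mdl", while a top-level entry with no '/' yields an empty string instead of aborting.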

msqr1 commented 4 months ago

Alright, I'll fix this! Thanks for your help!

msqr1 commented 4 months ago

@korabelnikov It took a long time, but it seems I've fixed my code. Could you recheck so I can close the issue?

twihorses commented 2 months ago

Hi, I think this code is great. I'd just like you to tell me how to create the Vosklet.js for Spanish: https://ccoreilly.github.io/vosk-browser/models/vosk-model-small-es-0.3.tar.gz. I'd like to contribute by helping in some way. Thanks.

msqr1 commented 2 months ago

@twihorses Hi, sorry, I had to use Google Translate. Anyway, you can copy the code from the top-level README into an HTML file, then swap the English model for the Spanish one, and it should work. You can check the devtools network tab and the console to verify that the model is being fetched and loaded.