alphacep / vosk-flutter


How can I load the model from an external directory of a physical Android device when the app starts, in the Flutter version? #12

Closed animesh27dev closed 11 months ago

animesh27dev commented 1 year ago

Is there any way, with ModelLoader or otherwise, to load the Vosk models from an external directory in Flutter?

sergsavchuk commented 1 year ago

Hi, there is no such function right now. I thought about this when I implemented ModelLoader, but decided that it wouldn't be very useful. Could you please tell us more about your case?
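
For illustration only, here is a rough sketch of what a manual approach could look like, assuming the model archive has already been unpacked into the app's external storage and that the plugin's createModel() accepts an arbitrary absolute path; the class, method and path names below are assumptions, not a confirmed API.

    import 'package:path_provider/path_provider.dart';
    import 'package:vosk_flutter/vosk_flutter.dart';

    // Hypothetical sketch: load a model that was unpacked manually into the
    // app-specific external storage directory. Assumes createModel() works
    // with any absolute filesystem path, which this thread does not confirm.
    Future<Model> loadModelFromExternalStorage() async {
      final vosk = VoskFlutterPlugin.instance();

      // e.g. /storage/emulated/0/Android/data/<package>/files (path_provider)
      final externalDir = await getExternalStorageDirectory();
      final modelPath = '${externalDir!.path}/vosk-model-small-en-us-0.15';

      return vosk.createModel(modelPath);
    }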

animesh27dev commented 1 year ago

I was trying to import large Vosk models using the loadFromNetwork method, but the loading never completes on the application side. I noticed that the network transfer rate changed significantly when I opened the application, but the download did not stop even after 20-30 minutes at 7-8 MB/s.

After a long time it shows: Error: FormatException: Could not find End of Central Directory Record

Then I tried to import from assets, but some errors also occur while executing

com.android.build.gradle.internal.tasks.CompressAssetsWorkAction
   > Java heap space

sergsavchuk commented 1 year ago

Oh, actually big models are not suitable for mobile devices because they require a lot of resources; you should only use small models on Android. Info from https://alphacephei.com/vosk/models:

> We have two types of models - big and small. Small models are ideal for limited tasks in mobile applications; they can run on smartphones and Raspberry Pis, and they are also recommended for desktop applications. A small model is typically around 50Mb in size and requires about 300Mb of memory at runtime. Big models are for high-accuracy transcription on the server. Big models require up to 16Gb of memory since they apply advanced AI algorithms; ideally you run them on a high-end server like an i7 or the latest AMD Ryzen. On AWS you can take a look at c5a machines and similar machines in other clouds. Most small models allow dynamic vocabulary reconfiguration. Big models are static; the vocabulary cannot be modified at runtime.

Perhaps we should filter out all models except small from the loadModelsList() output, so that no one gets confused in the future :thinking:
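
As a rough sketch of that idea (not the current API): assuming the entries returned by loadModelsList() expose the type field from model-list.json ("small" / "big"), the filtering could look roughly like this; the exact Dart class and field names below are assumptions.

    import 'package:vosk_flutter/vosk_flutter.dart';

    // Sketch only: keep just the small models from the downloadable-models
    // list. Assumes each entry exposes the `type`, `name` and `url` fields
    // found in model-list.json; the Dart field names are assumptions.
    Future<void> printSmallModels() async {
      final models = await ModelLoader().loadModelsList();
      final smallModels = models.where((m) => m.type == 'small');

      for (final model in smallModels) {
        print('${model.name} -> ${model.url}');
      }
    }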

sergsavchuk commented 1 year ago

P. S. If you really need to use a big model for your purposes, you should run it on your server using vosk-server and send recognition requests from your device. I once tried a WebSocket server with a big model and used web_socket_channel to connect to it. It worked really well!

animesh27dev commented 1 year ago

Okay, can I stream audio data from the Flutter client to vosk-server using flutter_sound, sending the audio stream obtained from toStream directly to the server? I also need the stream of recognized data back from the server in real time. I need an example of accurate recognition using a big model on the Flutter side.

sergsavchuk commented 1 year ago

You can use websocket/test_microphone.py as a reference; you need to do exactly the same thing, but in Dart:

        # part of test_microphone.py
        async with websockets.connect(args.uri) as websocket: # establish a websocket connection
            await websocket.send('{ "config" : { "sample_rate" : %d } }' % (device.samplerate)) # send the recognition config to the server

            while True:
                data = await audio_queue.get() # get audio bytes from the microphone
                await websocket.send(data) # send them to the server
                print (await websocket.recv()) # receive a recognition result from the server

            await websocket.send('{"eof" : 1}') # send the end-of-recognition message
            print (await websocket.recv())
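
A rough Dart equivalent could look like the sketch below, assuming a vosk-server instance reachable at ws://localhost:2700 and an existing Stream<Uint8List> of 16 kHz, 16-bit mono PCM from your recorder (for example what flutter_sound's toStream delivers); the function name, URL and defaults are illustrative, not part of any package API. Unlike the Python loop, it listens for server results asynchronously instead of waiting for a reply after every chunk.

    import 'dart:convert';
    import 'dart:typed_data';

    import 'package:web_socket_channel/web_socket_channel.dart';

    // Streams raw PCM chunks to a running vosk-server and prints every
    // recognition result (partial and final), mirroring test_microphone.py.
    Future<void> recognize(
      Stream<Uint8List> audioChunks, {
      String serverUrl = 'ws://localhost:2700', // assumed vosk-server address
      int sampleRate = 16000,
    }) async {
      final channel = WebSocketChannel.connect(Uri.parse(serverUrl));

      // Print results as they arrive instead of blocking on each send.
      final results = channel.stream.listen(print);

      // Send the recognition config first, same as the Python example.
      channel.sink.add(jsonEncode({
        'config': {'sample_rate': sampleRate}
      }));

      // Forward microphone bytes to the server as they arrive.
      await for (final chunk in audioChunks) {
        channel.sink.add(chunk);
      }

      // Tell the server we are done, give it a moment to answer, then close.
      channel.sink.add(jsonEncode({'eof': 1}));
      await Future<void>.delayed(const Duration(seconds: 1));
      await results.cancel();
      await channel.sink.close();
    }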