msqr1 / Vosklet

A speech recognizer that can run on the browser, inspired by vosk-browser
MIT License

working example? #1

Closed korabelnikov closed 5 months ago

korabelnikov commented 6 months ago

I tried running it on an IDE dev server and on several different services, but I get an error about SharedArrayBuffer. I created a simple Flask app with the necessary CORS headers, but now it fails at

let model = await module.createModel("en-model.tgz","model","ID")

with

Uncaught (in promise) TypeError: Failed to fetch
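
For readers hitting the SharedArrayBuffer error mentioned above: browsers only expose SharedArrayBuffer on cross-origin isolated pages, which is controlled by two response headers. The thread does not list the headers that were used, so the sketch below assumes those two are the ones meant. The helper names `applyIsolationHeaders` and `isIsolated` are hypothetical and not part of Vosklet.

```javascript
// Headers that make a page cross-origin isolated, which browsers require
// before exposing SharedArrayBuffer.
const ISOLATION_HEADERS = {
  "Cross-Origin-Opener-Policy": "same-origin",
  "Cross-Origin-Embedder-Policy": "require-corp",
};

// Hypothetical helper: applies the headers to any Node-style response object,
// i.e. anything with a setHeader(name, value) method.
function applyIsolationHeaders(res) {
  for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
    res.setHeader(name, value);
  }
  return res;
}

// Hypothetical helper for the page side: true only when the browser reports
// that isolation took effect. The global is undefined outside a browser.
function isIsolated() {
  return globalThis.crossOriginIsolated === true;
}
```

Calling `applyIsolationHeaders(res)` on every response of a local dev server, then checking `isIsolated()` in the page console, is one way to confirm the setup before loading a model.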

msqr1 commented 6 months ago

Make sure your URL (en-model.tgz) points to a valid file. The one in the example should be in the examples folder. You can download the model and place it in the same directory as the HTML file, or you can replace the URL with https://github.com/msqr1/Vosklet/raw/main/examples/en-model.tgz to fetch it directly from GitHub.

korabelnikov commented 6 months ago

Thank you for the quick response. Unfortunately, the accessibility of the model file is not the issue.

127.0.0.1 - - [13/May/2024 10:10:18] "GET /static/Vosklet.js HTTP/1.1" 304 -
127.0.0.1 - - [13/May/2024 10:10:28] "GET /static/en-model.tgz HTTP/1.1" 200 -

I dug in a bit. The error arises at tar = await new Response(tar.pipeThrough(new DecompressionStream("gzip"))).arrayBuffer(); specifically at the call to arrayBuffer().

korabelnikov commented 6 months ago

I found out that the browser decompresses the gzipped stream on the fly. I removed the content-type of model.tgz to avoid this.

Using https://github.com/msqr1/Vosklet/raw/main/examples/en-model.tgz seems impossible due to CORS restrictions, so I still use my local file.

korabelnikov commented 6 months ago

removing gzip piping from Vosklet.js works

jakespracher commented 6 months ago

Ah crap I am hitting this too.

removing gzip piping from Vosklet.js works

@msqr1 should this be removed from the lib? Or maybe the case where the browser decompressed the file should be handled as a special case?
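
One way the "special case" could look is to check whether the fetched bytes still start with the gzip magic number before decompressing. This is only a sketch of the idea, not what Vosklet does. `isGzip` and `toTarBytes` are hypothetical names, and the check is a heuristic that assumes a plain tar archive never begins with the bytes 0x1f 0x8b.

```javascript
// Gzip data starts with the two magic bytes 0x1f 0x8b (RFC 1952).
function isGzip(bytes) {
  return bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b;
}

// Hypothetical helper: returns the tar bytes whether or not the browser
// already decompressed the response body.
async function toTarBytes(buffer) {
  const bytes = new Uint8Array(buffer);
  if (!isGzip(bytes)) {
    return bytes; // already decompressed upstream, use as-is
  }
  // Still compressed: decompress manually, as the library did originally.
  const stream = new Response(bytes).body.pipeThrough(
    new DecompressionStream("gzip")
  );
  return new Uint8Array(await new Response(stream).arrayBuffer());
}
```

The trade-off is that the whole body is buffered before the check, so this gives up streaming decompression in exchange for handling both server behaviours.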

msqr1 commented 6 months ago

Basically, for faster load times, I store a gzipped copy of the model in OPFS. OPFS is not fetchable, so I have to decompress it manually before loading. To do this, the first time a model is fetched, we receive a compressed model and tee the stream so that one branch (still compressed) goes into OPFS and the other (decompressed) goes to model loading. If the browser decompresses beforehand, I have to compress before storage (to save space), which I think is slower than decompressing.

A solution for this, I think, would be to let the browser decompress, then compress again (a first-time cost only) to store it, and proceed as normal.
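
The proposal above can be sketched roughly as follows, using the standard `tee()` and `CompressionStream` APIs. This is an illustration under assumptions, not the library's actual code: `loadAndCache` is a hypothetical name, and the `store` callback stands in for whatever writes the compressed bytes to OPFS.

```javascript
// Hypothetical helper. `decompressed` is a ReadableStream of tar bytes that
// the browser has already decompressed. `store` is an assumed async callback
// that persists the gzipped copy (for example, by writing it to OPFS).
async function loadAndCache(decompressed, store) {
  // Split the stream: one branch for loading, one for the cache.
  const [forLoad, forCache] = decompressed.tee();

  // Read both branches concurrently so neither one stalls the other.
  const [tar, gz] = await Promise.all([
    new Response(forLoad).arrayBuffer(),
    new Response(
      forCache.pipeThrough(new CompressionStream("gzip"))
    ).arrayBuffer(),
  ]);

  await store(gz); // first-time cost only: recompress and save
  return tar;      // uncompressed tar bytes, ready for model loading
}
```

On later loads the cached gzipped copy would be read and decompressed manually, as described earlier in the thread, so the recompression cost is paid once per model.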

jakespracher commented 6 months ago

This would be internal to the library, right? I don't have full context since I'm not familiar with the codebase, but that sounds reasonable to me.

msqr1 commented 6 months ago

Yeah, first-time startup can be a tiny bit slower because of the compression, but it's just once. I'll do that later today.

msqr1 commented 6 months ago

@jakespracher, I tried fixing it by removing the manual decompression. Can you try it again?