Closed erksch closed 3 years ago
Which model/architecture are you trying to use?
Hey @lunixbochs, so I spent the last day getting a better understanding of wav2letter's concepts. As far as I can tell, if I had a streaming convnet ready, I could use wav2letter's streaming inference API. The library you provided decodes audio chunks with a "normal" conv GLU model, doesn't it?
So to reframe the question: can I use a convglu ASG model to decode microphone audio with your API? And which branch should I use?
Regarding cmake, W2L_BUILD_INFERENCE wouldn't make any sense because it's not a TDS + CTC model.
Use the decoder or decoder-ng branches. Yes, but you'll want to run a VAD and pass the audio in chunks instead of small 500ms segments.
I believe decoder-ng can handle any current wav2letter architecture.
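To illustrate the VAD-plus-chunking idea above: the sketch below is a minimal, hypothetical energy-based voice activity detector in plain Python, not wav2letter's API. It groups consecutive voiced frames into utterance-sized chunks, which is the kind of segmentation you would do before handing audio to the decoder. The threshold and frame sizes are made-up values for illustration; a real setup would more likely use a proper VAD such as WebRTC's.

```python
import math

FRAME_MS = 30
SAMPLE_RATE = 16000
FRAME_SAMPLES = SAMPLE_RATE * FRAME_MS // 1000

def frame_energy(frame):
    """RMS energy of one frame of 16-bit PCM samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def chunk_speech(frames, threshold=500.0, max_silence_frames=10):
    """Group consecutive voiced frames into utterances.

    Yields lists of frames; an utterance is closed after
    `max_silence_frames` consecutive low-energy frames.
    """
    utterance, silence = [], 0
    for frame in frames:
        if frame_energy(frame) >= threshold:
            utterance.append(frame)
            silence = 0
        elif utterance:
            silence += 1
            if silence >= max_silence_frames:
                yield utterance
                utterance, silence = [], 0
    if utterance:  # flush a trailing utterance at end of stream
        yield utterance

# Toy input: 20 "loud" frames (speech) followed by 20 "quiet" frames (silence).
loud = [[4000] * FRAME_SAMPLES] * 20
quiet = [[10] * FRAME_SAMPLES] * 20
chunks = list(chunk_speech(loud + quiet))
# One utterance of 20 frames is produced; each chunk would then be
# passed to the decoder as a whole instead of fixed 500 ms segments.
```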
Hey there!
First of all, again, thank you for all the work and effort on wav2letter, especially the C++ API!
I have some questions about how to build the library (libw2l) needed for using the API (on Linux, by the way).
What would the `cmake` command look like? Do I need to build for inference (i.e., add the W2L_BUILD_INFERENCE option) to enable streaming capabilities? I'd like to avoid trial and error, because building takes ages on my machine :D
Many thanks!
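For reference, a typical out-of-source CMake build could look like the sketch below. The branch names and the `W2L_BUILD_INFERENCE` flag come from this thread; everything else (`CMAKE_BUILD_TYPE`, the `make -j` invocation) is standard CMake/make usage and has not been verified against wav2letter's actual CMakeLists, so treat it as an assumption:

```shell
# Hypothetical build sketch, not a verified wav2letter recipe.
git checkout decoder-ng            # or `decoder`, per the advice above
mkdir -p build && cd build
# Per the answer above, W2L_BUILD_INFERENCE is not needed for a
# convglu + ASG model, so it is left off here.
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j"$(nproc)"
```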