talonvoice / wav2letter

Facebook AI Research Automatic Speech Recognition Toolkit

Building for the C(++) API #1

Closed erksch closed 3 years ago

erksch commented 4 years ago

Hey there!

First of all, thank you again for all the work and effort on wav2letter, especially the C++ API!

I have some questions about how to build the library (libw2l) needed for using the API (on Linux, by the way).

I kind of want to avoid trying things out, because building takes ages on my machine :D

Many thanks!

lunixbochs commented 4 years ago

Which model/architecture are you trying to use?

erksch commented 4 years ago

Hey @lunixbochs, I spent the last day getting a better understanding of the concepts behind wav2letter. As far as I know, if I had a streaming convnet ready, I could use the streaming inference API of wav2letter. The lib you provided decodes audio chunks with a "normal" conv GLU model, doesn't it?

So to reframe the question: can I use a conv GLU ASG model to decode microphone audio with your API? And which branch should I use?

Regarding cmake, W2L_BUILD_INFERENCE wouldn't make any sense because it's not a TDS + CTC model.

lunixbochs commented 4 years ago

Use the decoder or decoder-ng branch. Yes, but you'll want to run a VAD (voice activity detection) and pass the audio in chunks instead of small 500ms segments.

I believe decoder-ng can handle any current wav2letter architecture.
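
For readers wondering what "run a VAD and pass the audio in chunks" could look like in practice, here is a minimal sketch in Python. It is not taken from libw2l or from this thread: the function names (`frame_energy`, `vad_segments`), the 16 kHz sample rate, the 20 ms frame size, the energy threshold, and the hangover length are all assumptions chosen for illustration, and a simple energy threshold stands in for a real VAD.

```python
from typing import Iterator, List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame (0.0 for an empty frame)."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def vad_segments(
    samples: Sequence[float],
    sample_rate: int = 16000,   # assumed sample rate
    frame_ms: int = 20,         # assumed frame length
    threshold: float = 1e-3,    # assumed energy threshold, needs tuning
    hangover_frames: int = 10,  # quiet frames tolerated before closing a segment
) -> Iterator[List[float]]:
    """Yield speech segments found by a naive energy-based VAD.

    A segment opens on the first frame whose energy reaches the threshold
    and closes after `hangover_frames` consecutive quiet frames. The quiet
    frames seen while a segment is open are kept as trailing padding.
    """
    frame_len = sample_rate * frame_ms // 1000
    segment: List[float] = []
    quiet = 0
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame_energy(frame) >= threshold:
            segment.extend(frame)
            quiet = 0
        elif segment:
            segment.extend(frame)
            quiet += 1
            if quiet >= hangover_frames:
                yield segment
                segment, quiet = [], 0
    if segment:
        # Flush a segment that was still open when the audio ended.
        yield segment
```

Each yielded segment would then be handed to the decoder as one chunk, rather than feeding it fixed 500ms slices that can cut words in half. A production setup would likely replace the energy threshold with a more robust VAD and tune the hangover to the speaker's pauses.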