castorini / howl

Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice.
Mozilla Public License 2.0

window size #81

Open Jundo26 opened 3 years ago

Jundo26 commented 3 years ago

Why is the max window size 500 ms? Is it because a spoken word lasts about 500 ms?

ljj7975 commented 3 years ago

I think max_window_size can be a misleading name. It is prefixed with max because samples can have variable length (shorter than max_window_size).

The window is the single unit of a sample that is preprocessed together and fed into the model. The following code should be self-explanatory: https://github.com/castorini/howl/blob/55f8b92081dc5c7b98ef3a2a52de15966be663dc/howl/model/inference.py#L169-L193
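To illustrate the idea of a window with an upper bound on its length, here is a minimal sketch. It is not the code from howl's inference.py. The function name split_into_windows, the 16 kHz sample rate, and the non-overlapping split are assumptions made for illustration only. It shows why the "max" prefix fits: the last window can be shorter than the limit.

```python
from typing import List, Sequence


def split_into_windows(
    samples: Sequence[float],
    sample_rate: int = 16000,   # assumed sample rate, not taken from howl
    max_window_ms: int = 500,   # the 500 ms limit discussed in this issue
) -> List[Sequence[float]]:
    """Split audio into consecutive windows of at most max_window_ms each.

    Hypothetical helper: windows do not overlap here, which may differ
    from how the real inference code strides over the stream.
    """
    if sample_rate <= 0 or max_window_ms <= 0:
        raise ValueError("sample_rate and max_window_ms must be positive")

    # Convert the window length from milliseconds to a number of samples.
    max_len = sample_rate * max_window_ms // 1000

    # Slicing past the end of the sequence is safe in Python, so the
    # final window simply ends up shorter than max_len.
    return [samples[i:i + max_len] for i in range(0, len(samples), max_len)]


# 1.25 seconds of silence at 16 kHz.
audio = [0.0] * 20000
windows = split_into_windows(audio)
window_lengths = [len(w) for w in windows]
print(window_lengths)  # [8000, 8000, 4000]
```

Each of those windows would be preprocessed and passed to the model as one unit; only the last one is shorter than the maximum.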