Closed wl-junlin closed 1 year ago
The comment says "Silero VAD models were trained using 512, 1024, 1536 samples for 16000 sample rate". So, for better accuracy, should I choose 1536 as my window_size_samples? And for better latency, should I choose 512?
The bigger the window size, the higher the quality, with an obvious latency trade-off: the model needs a full window of audio before it can produce a prediction for that window.
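To put rough numbers on that trade-off, here is a minimal sketch that converts each window size from the quoted comment into the amount of audio it covers at 16000 Hz. The helper name `window_duration_ms` is made up for illustration and is not part of Silero VAD. The figure is only the buffering delay per window and does not include model inference time.

```python
SAMPLE_RATE = 16000  # Hz, the sample rate named in the quoted comment


def window_duration_ms(window_size_samples: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Return the audio duration covered by one window, in milliseconds.

    Hypothetical helper for illustration; not a Silero VAD function.
    """
    return 1000.0 * window_size_samples / sample_rate


# Window sizes listed in the quoted comment for 16000 Hz audio.
durations = {n: window_duration_ms(n) for n in (512, 1024, 1536)}

for n, ms in durations.items():
    print(f"window_size_samples={n}: {ms:.0f} ms of audio per prediction")
```

So 512 samples means waiting for about 32 ms of audio per prediction, 1024 about 64 ms, and 1536 about 96 ms. Whether the quality gain at 1536 is worth the extra delay depends on the application.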