Open hbredin opened 1 year ago
I like it, that way we can automatically set the lowest possible latency. This could be implemented as --step auto, but also somewhere in the Python API.
Another idea: implement this as a diart.profile recording.wav command that also runs a quick grid search on that file to suggest hyper-parameter values without running a costly tuning.
This would be useful for people that don't have much data but have a "typical" conversation that the system will encounter. Then diart would quickly suggest a config to get started.
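To make the "quick grid search" idea concrete, here is a minimal sketch of what such a profiling step might do. It is not diart code: `quick_grid_search`, the `score` callback, and the hyper-parameter names `threshold_a` and `threshold_b` are all hypothetical placeholders, and the toy score stands in for running the pipeline on the recording and measuring an error (lower is better).

```python
import itertools
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def quick_grid_search(
    score: Callable[[Config], float],
    grid: Dict[str, List[float]],
) -> Tuple[Config, float]:
    """Try every combination in `grid` and return the lowest-scoring config."""
    names = list(grid)
    best_config: Config = {}
    best_score = float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        current = score(config)
        if current < best_score:
            best_config, best_score = config, current
    return best_config, best_score


# Toy stand-in for "run the pipeline on recording.wav and measure the error".
# The parameter names are placeholders, not actual diart hyper-parameters.
def toy_score(config: Config) -> float:
    return abs(config["threshold_a"] - 0.5) + abs(config["threshold_b"] - 0.3)


best_config, best_score = quick_grid_search(
    toy_score,
    {"threshold_a": [0.25, 0.5, 0.75], "threshold_b": [0.1, 0.3, 0.5]},
)
print(best_config, best_score)
```

A coarse grid like this only needs a handful of pipeline runs on a single file, which is what would keep it cheap compared to a full tuning.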
step controls the minimum algorithmic latency of the speaker diarization pipeline. When targeting real-time processing, one needs to make sure that the processing latency (i.e. the time it takes to process one step) is smaller than this algorithmic latency.
Said differently: the lower bound on the algorithmic latency is the processing latency, which, in turn, depends on the computing power of the machine the pipeline runs on (e.g. GPU is usually faster than CPU).
It would be nice to provide an API that automatically estimates this lower bound by running a few steps when the pipeline is instantiated and measuring the time they take, so that step can be set automatically to the processing latency plus a small safety margin.
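A rough sketch of how such an estimate could work, assuming the pipeline can be wrapped in a zero-argument callable that processes one step. `estimate_step`, its parameters, and `fake_pipeline_step` are hypothetical names for illustration, not part of diart. The sketch takes the slowest of a few timed runs as the processing latency and adds a relative safety margin.

```python
import time
from typing import Callable


def estimate_step(
    process_one_step: Callable[[], None],
    warmup_runs: int = 2,
    timed_runs: int = 5,
    safety_margin: float = 0.2,
) -> float:
    """Suggest a step (in seconds) from the measured processing latency."""
    # Warm-up runs are not timed: the first calls are often slower
    # (lazy initialization, caches, GPU kernel loading).
    for _ in range(warmup_runs):
        process_one_step()

    durations = []
    for _ in range(timed_runs):
        start = time.perf_counter()
        process_one_step()
        durations.append(time.perf_counter() - start)

    # Use the worst observed run as the processing latency,
    # then add a relative safety margin on top.
    processing_latency = max(durations)
    return processing_latency * (1.0 + safety_margin)


# Stand-in for one pipeline step; a real callable would feed
# one chunk of audio through the diarization pipeline.
def fake_pipeline_step() -> None:
    time.sleep(0.01)


suggested_step = estimate_step(fake_pipeline_step)
print(f"suggested step: {suggested_step:.3f}s")
```

Taking the maximum rather than the mean is a conservative choice: for real-time use, an occasional slow step matters more than the average one.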