beeldengeluid / dane-whisper-asr-worker

MIT License

benchmark performance (Broadcast News) #37

Closed Veldhoen closed 2 months ago

Veldhoen commented 2 months ago

Benchmark with the same setup for a fair comparison:

greenw0lf commented 2 months ago

Parameters to experiment with that apply to all implementations:

greenw0lf commented 2 months ago

Labelled data has been benchmarked; unlabelled data is still to go.

greenw0lf commented 2 months ago

Small mistake: for WhisperX, the diarization model was being loaded for each file instead of just once. (The alignment model does need to be loaded per file, as it uses language info.)
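For illustration, the fix could look roughly like the sketch below. This is not the worker's actual code: `run_batch`, `load_diarization_model` and `load_alignment_model` are hypothetical names, and the loaders are passed in as plain callables so no particular WhisperX API is assumed. It also assumes the language of each file is already known.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def run_batch(
    files: Iterable[Tuple[str, str]],
    load_diarization_model: Callable[[], Any],
    load_alignment_model: Callable[[str], Any],
) -> List[Dict[str, Any]]:
    """Process (path, language) pairs; all names here are hypothetical."""
    # The diarization model does not depend on the file, so load it once.
    diarization_model = load_diarization_model()
    results = []
    for path, language in files:
        # The alignment model depends on the language, so load it per file.
        alignment_model = load_alignment_model(language)
        results.append(
            {
                "path": path,
                "diarization_model": diarization_model,
                "alignment_model": alignment_model,
            }
        )
    return results
```

With this shape, the diarization load cost is paid once per run rather than once per file, which is why less time per file is expected.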

greenw0lf commented 2 months ago

Re-benchmarking faster-whisper (both labelled and unlabelled), because the time to transcribe should be measured from the point when model.transcribe is called until the output of the function has been saved to a JSON file.

Also re-benchmarking WhisperX for the reason mentioned in the previous comment (less time spent per file is expected).
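A minimal sketch of the timing window described for faster-whisper, from the transcribe call until the JSON file is written. `timed_transcription` and its parameters are hypothetical, and the transcribe function is passed in as a callable rather than tied to a specific library. One detail that matters here: as far as I know, faster-whisper returns its segments as a lazy generator, so the results have to be consumed inside the timed window, otherwise most of the work is not measured.

```python
import json
import time
from typing import Any, Callable, Iterable


def timed_transcription(
    transcribe: Callable[[str], Iterable[Any]],
    audio_path: str,
    output_path: str,
) -> float:
    """Return seconds from the transcribe call until the JSON is on disk."""
    start = time.perf_counter()
    # list() forces lazily produced segments to be computed inside the window.
    segments = list(transcribe(audio_path))
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(segments, f, ensure_ascii=False)
    # The file is closed (and flushed) before the clock is stopped.
    return time.perf_counter() - start
```

The segments are assumed to be JSON-serialisable (e.g. dicts); library-specific segment objects would need converting first. Model loading is deliberately left outside the function so it is not counted.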