In my program I used faster-whisper to transcribe an audio file. The large-v2 model running in float16 took 10 minutes to process the Sam Altman audio file.
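For reference, the baseline was a plain sequential faster-whisper call along these lines (a minimal sketch; the filename is just a placeholder):

```python
from faster_whisper import WhisperModel

# Baseline setup: large-v2 in float16 on the GPU, decoding segments sequentially.
model = WhisperModel("large-v2", device="cuda", compute_type="float16")

# "sam_altman.mp3" stands in for the Sam Altman test file mentioned above.
segments, info = model.transcribe("sam_altman.mp3", beam_size=5)
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```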
After implementing this library I got the following:
- large-v2, float16, batch size 50 = 54 seconds
- medium.en, float16, batch size 75 = 32 seconds
- small.en, float16, batch size 100 = 15 seconds!
Amazing!
Tests were run on an RTX 4090 with CUDA 12 and PyTorch 2.2.0. Just thought you'd like to know.
Also, that's using the higher-quality ASR parameters.
If you increase the batch size (regardless of the Whisper model size) to the point where it exceeds available VRAM, speeds drop significantly, but this is expected behavior.
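In case it helps anyone reproduce the numbers, the batched runs looked roughly like this. This is only a sketch assuming a WhisperS2T-style API with the CTranslate2 backend (the library isn't named above); the filename, language, and the compute_type kwarg are illustrative assumptions, not copied from my program:

```python
import whisper_s2t

# Sketch of the first row above: large-v2, float16, batch size 50.
# compute_type="float16" is assumed to be accepted here; adjust if your version differs.
model = whisper_s2t.load_model(
    model_identifier="large-v2",
    backend="CTranslate2",
    compute_type="float16",
)

files = ["sam_altman.mp3"]   # placeholder filename
lang_codes = ["en"]
tasks = ["transcribe"]
initial_prompts = [None]

out = model.transcribe_with_vad(
    files,
    lang_codes=lang_codes,
    tasks=tasks,
    initial_prompts=initial_prompts,
    batch_size=50,           # pushing this past available VRAM is what tanks the speed
)

print(out[0][0])  # first utterance of the first file
```

Between the three rows above, only the model identifier and batch_size would change.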
https://github.com/BBC-Esq/ChromaDB-Plugin-for-LM-Studio/releases/tag/v4.0.0