oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

Whisper failing due to a multitude of errors; rehash of: newest Gradio prevents Whisper extension from functioning #5920

Closed: RandomInternetPreson closed this issue 5 months ago

RandomInternetPreson commented 7 months ago

Describe the bug

I originally made this issue: https://github.com/oobabooga/text-generation-webui/issues/5850

Then I saw this commit: https://github.com/oobabooga/text-generation-webui/pull/5856#issuecomment-2073426890

I downloaded this commit of textgen: https://github.com/oobabooga/text-generation-webui/commits/main/

I tried whisper in firefox, it seemed to work.

Then I tried longer conversations and superboogav2 and started to encounter a lot of issues:

Issues I am having:

  1. The audio being sent to Whisper is somehow degraded and the transcriptions are inaccurate; this happens when superboogav2 is loaded while in instruct mode.
  2. Firefox crashed and closed when I tried to transcribe after a few back-and-forths in instruct mode with superboogav2 installed.
  3. Transcriptions take longer to complete, and the Whisper UI progress bar glitches out, moving backwards and flickering; this usually means an error is about to occur, with or without superboogav2 installed.
  4. Transcriptions are not possible in Google Chrome; I get the error below (see 2 of 2 in logs) when trying to use Chrome.
  5. Transcriptions will occasionally send over and over again until I refresh the page, with superboogav2 installed (see screenshot).
  6. Transcriptions glitched out in Firefox (see 1 of 2 in logs); an "Error" bubble is all that populates the Whisper STT UI element (see screenshot).

Is there an existing issue for this?

Reproduction

  1. Download the latest repo: https://github.com/oobabooga/text-generation-webui/commits/main/

  2. Install with start_linux.sh

  3. Run update_wizard_linux.sh, select "B" to install extensions that come with textgen

  4. Start textgen

  5. In the Sessions tab, select whisper_stt then superboogav2, OR superboogav2 then whisper_stt (I realize that issues with loading multiple extensions can sometimes be alleviated by changing the load sequence)

  6. Load your model

  7. Go to instruct mode. (I occasionally get a disconnect error here in my browser; I get them randomly while loading models, using extensions, and doing nothing in particular. textgen does not crash, and the UI elements still appear to work without refreshing the browser page.) I did not see this behavior prior to the major Gradio update.

  8. Load medium.en as the model. A note here: I never experienced transcription errors when using this model with the previous Gradio version. I was constantly astonished that it could transcribe extremely long ramblings with perfect precision. I say this to demonstrate that there is a stark contrast in transcription quality when superboogav2 is being used at the same time. It's difficult for me to fully describe this issue with screenshots.

  9. Spend some time talking to the model. Yes, the first go-around will probably work, maybe even the second or third, but by the fifth and beyond, glitches and errors keep occurring and you will notice a loss in transcription quality: say words that rhyme, like "pool", and it will transcribe "fool". This has never happened before. I have a set of tests I run on new instances of textgen and have never encountered these transcription errors before when trying to reference my "Radium Pool" document using superboogav2.

Screenshot

Screenshot from 2024-04-23 18-57-36

Screenshot from 2024-04-22 20-19-37

Logs

(1 of 2)**********************Stopped working would not transcribe anymore in firefox*****************
18:21:04-029078 INFO     Starting Text generation web UI                        

Running on local URL:  http://127.0.0.1:7860

Closing server running on port: 7860
18:22:52-978758 INFO     Loading the extension "whisper_stt"                    
18:22:52-982841 INFO     Loading the extension "superboogav2"                   
18:22:55-724926 DEBUG    Loading hyperparameters...                             

Running on local URL:  http://127.0.0.1:7860

18:23:54-659832 INFO     Loading "WizardLM-2-8x22B_8bitexllamav2"               
18:30:16-549212 INFO     LOADER: "ExLlamav2"                                    
18:30:16-550080 INFO     TRUNCATION LENGTH: 65536                               
18:30:16-550775 INFO     INSTRUCTION TEMPLATE: "Vicuna-v1.1"                    
18:30:16-551340 INFO     Loaded the model in 381.89 seconds.                    
18:33:18-855805 ERROR    Failed to load the model.                              
Traceback (most recent call last):
  File "/home/myself/Desktop/OobApril23/text-generation-webui/modules/ui_model_menu.py", line 257, in load_model_wrapper
    yield output
GeneratorExit

Exception ignored in: <generator object load_model_wrapper at 0x7c39baacac40>
Traceback (most recent call last):
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 845, in run
    context, func, args, future, cancel_scope = item
                   ^^^^
RuntimeError: generator ignored GeneratorExit
Output generated in 3.16 seconds (12.99 tokens/s, 41 tokens, context 45, seed 397214322)
18:34:40-917258 INFO     Successfully deleted 0 records from chromaDB.          
18:34:41-777671 INFO     Adding 4 new embeddings.                               
Output generated in 7.59 seconds (18.71 tokens/s, 142 tokens, context 125, seed 1495054165)
18:35:14-914587 INFO     Successfully deleted 4 records from chromaDB.          
18:35:15-070058 INFO     Adding 10 new embeddings.                              
Output generated in 6.51 seconds (18.60 tokens/s, 121 tokens, context 311, seed 577916308)
18:35:40-763241 INFO     Successfully deleted 10 records from chromaDB.         
18:35:40-764128 INFO     Adding 4 cached embeddings.                            
Output generated in 7.95 seconds (19.74 tokens/s, 157 tokens, context 125, seed 1933917278)
18:36:09-213948 INFO     Successfully deleted 4 records from chromaDB.          
18:36:09-418010 INFO     Adding 10 new embeddings.                              
Output generated in 4.01 seconds (18.22 tokens/s, 73 tokens, context 301, seed 1177444089)
18:36:46-224827 INFO     Successfully deleted 10 records from chromaDB.         
18:36:46-226132 INFO     Adding 10 cached embeddings.                           
Output generated in 5.49 seconds (18.57 tokens/s, 102 tokens, context 307, seed 1936546368)
18:37:13-257191 INFO     Successfully deleted 10 records from chromaDB.         
18:37:13-258465 INFO     Adding 10 cached embeddings.                           
Output generated in 3.61 seconds (18.29 tokens/s, 66 tokens, context 300, seed 1020024456)
18:37:17-849129 INFO     Successfully deleted 10 records from chromaDB.         
18:37:17-851364 INFO     Adding 6 cached embeddings.                            
18:37:17-924719 INFO     Adding 4 new embeddings.                               
Output generated in 2.44 seconds (18.48 tokens/s, 45 tokens, context 373, seed 1171930464)
18:37:22-771671 INFO     Successfully deleted 10 records from chromaDB.         
18:37:22-774370 INFO     Adding 6 cached embeddings.                            
18:37:22-882152 INFO     Adding 7 new embeddings.                               
Output generated in 1.85 seconds (19.46 tokens/s, 36 tokens, context 418, seed 1388647206)
18:38:59-780584 INFO     Successfully deleted 13 records from chromaDB.         
18:38:59-781971 INFO     Adding 10 cached embeddings.                           
Output generated in 3.85 seconds (17.68 tokens/s, 68 tokens, context 318, seed 1005208421)
18:39:04-627909 INFO     Successfully deleted 10 records from chromaDB.         
18:39:04-630476 INFO     Adding 6 cached embeddings.                            
18:39:04-743302 INFO     Adding 5 new embeddings.                               
Output generated in 3.96 seconds (18.67 tokens/s, 74 tokens, context 422, seed 1706874654)
18:40:02-965701 INFO     Successfully deleted 11 records from chromaDB.         
18:40:02-967089 INFO     Adding 10 cached embeddings.                           
Output generated in 9.35 seconds (18.40 tokens/s, 172 tokens, context 377, seed 445174954)
Traceback (most recent call last):
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/queueing.py", line 527, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/route_utils.py", line 261, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/blocks.py", line 1786, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/blocks.py", line 1338, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 759, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/extensions/whisper_stt/script.py", line 48, in auto_transcribe
    transcription = do_stt(audio, whipser_model, whipser_language)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/extensions/whisper_stt/script.py", line 36, in do_stt
    transcription = r.recognize_whisper(audio_data, language=whipser_language, model=whipser_model)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/speech_recognition/__init__.py", line 1486, in recognize_whisper
    wav_bytes = audio_data.get_wav_data(convert_rate=16000)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/speech_recognition/audio.py", line 146, in get_wav_data
    raw_data = self.get_raw_data(convert_rate, convert_width)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/speech_recognition/audio.py", line 91, in get_raw_data
    raw_data, _ = audioop.ratecv(
                  ^^^^^^^^^^^^^^^
audioop.error: not a whole number of frames
18:41:32-942701 INFO     Successfully deleted 10 records from chromaDB.         
18:41:32-944021 INFO     Adding 10 cached embeddings.                           
Output generated in 10.11 seconds (19.78 tokens/s, 200 tokens, context 377, seed 1117809996)

(1 of 2)**********************Stopped working would not transcribe anymore in firefox*****************

**********************Error Received In Chrome*****************
18:49:41-743248 INFO     Loading "WizardLM-2-8x22B_8bitexllamav2"               
18:50:02-577812 INFO     LOADER: "ExLlamav2"                                    
18:50:02-578722 INFO     TRUNCATION LENGTH: 65536                               
18:50:02-581187 INFO     INSTRUCTION TEMPLATE: "Vicuna-v1.1"                    
18:50:02-582101 INFO     Loaded the model in 20.84 seconds.                     
Traceback (most recent call last):
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/queueing.py", line 527, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/route_utils.py", line 261, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/blocks.py", line 1786, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/blocks.py", line 1338, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 759, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/extensions/whisper_stt/script.py", line 48, in auto_transcribe
    transcription = do_stt(audio, whipser_model, whipser_language)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/extensions/whisper_stt/script.py", line 36, in do_stt
    transcription = r.recognize_whisper(audio_data, language=whipser_language, model=whipser_model)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/speech_recognition/__init__.py", line 1486, in recognize_whisper
    wav_bytes = audio_data.get_wav_data(convert_rate=16000)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/speech_recognition/audio.py", line 146, in get_wav_data
    raw_data = self.get_raw_data(convert_rate, convert_width)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myself/Desktop/OobApril23/text-generation-webui/installer_files/env/lib/python3.11/site-packages/speech_recognition/audio.py", line 91, in get_raw_data
    raw_data, _ = audioop.ratecv(
                  ^^^^^^^^^^^^^^^
audioop.error: not a whole number of frames
**********************Error Received In Chrome*****************
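For context on both tracebacks: `audioop.error: not a whole number of frames` is raised by `audioop.ratecv` when the raw byte buffer it receives is not a multiple of the frame size (sample width × channels), i.e. the browser/Gradio recording path is delivering a buffer with a truncated partial frame. A minimal sketch reproducing the error, with a hypothetical `trim_partial_frame` helper (my own illustration, not part of whisper_stt or SpeechRecognition):

```python
import audioop  # stdlib in Python 3.11 (the version in the traceback); removed in 3.13

def trim_partial_frame(raw: bytes, sample_width: int = 2, channels: int = 1) -> bytes:
    """Hypothetical helper: drop trailing bytes that do not form a whole frame."""
    frame_size = sample_width * channels
    return raw[: len(raw) - (len(raw) % frame_size)]

# 100 whole 16-bit mono frames resample cleanly from 48 kHz to 16 kHz ...
good = b"\x00\x01" * 100
audioop.ratecv(good, 2, 1, 48000, 16000, None)

# ... but one stray byte (a partial frame) reproduces the error in the logs
bad = good + b"\x00"
try:
    audioop.ratecv(bad, 2, 1, 48000, 16000, None)
except audioop.error as e:
    print(e)  # not a whole number of frames

# Trimming back to a whole number of frames lets the same buffer convert
audioop.ratecv(trim_partial_frame(bad), 2, 1, 48000, 16000, None)
```

This only illustrates the mechanism of the error; where the partial frame is introduced (the Gradio audio component, the browser recorder, or the handoff into `speech_recognition.AudioData`) is the open question.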

System Info

Ubuntu, NVIDIA 4090
github-actions[bot] commented 5 months ago

This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

RandomInternetPreson commented 5 months ago

@oobabooga still not closed unfortunately