Saved audio recorded with SR plays choppy and too fast

Steps to reproduce

Record audio from a USB audio interface (Focusrite Scarlett) with a Microphone() instance
Save to file with Python's wave module

Here's example code that shows what I do (pieced together from the actual source):

import speech_recognition as sr
import wave

mic_index = 7  # focusrite scarlett input

recognizer = sr.Recognizer()
mic = sr.Microphone(device_index=mic_index)

print('Recording...')
with mic as source:
    recognizer.adjust_for_ambient_noise(source, duration=0.2)
    audio = recognizer.listen(source, timeout=1, phrase_time_limit=5)

wave_file = wave.open('audiotest.wav', 'wb')
wave_file.setnchannels(1)
wave_file.setsampwidth(2)
wave_file.setframerate(16000)
wave_file.writeframes(audio.get_wav_data(convert_rate=16000))
wave_file.close()
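
One detail in the save step may be worth ruling out: `AudioData.get_wav_data()` returns a complete WAV file (header included), whereas `wave.writeframes()` expects raw PCM frames. The sketch below writes raw frames instead. It is only an illustration: `save_pcm` is a hypothetical helper, and a synthetic tone stands in for the recording so the snippet runs without a microphone. With a real recording, the frames would presumably come from `audio.get_raw_data(convert_rate=16000, convert_width=2)`. A stray header is only a few dozen bytes, so this alone probably does not explain the wrong tempo.

```python
import math
import os
import struct
import tempfile
import wave


def save_pcm(path, frames, sample_rate=16000, sample_width=2, channels=1):
    """Write raw PCM frames (no WAV header) to a WAV file (hypothetical helper)."""
    with wave.open(path, 'wb') as wave_file:
        wave_file.setnchannels(channels)
        wave_file.setsampwidth(sample_width)
        wave_file.setframerate(sample_rate)
        wave_file.writeframes(frames)


# Synthetic stand-in for the recording: 1 second of a 440 Hz tone, 16-bit mono.
# With a real recording this would be (assumption):
#   frames = audio.get_raw_data(convert_rate=16000, convert_width=2)
rate = 16000
frames = b''.join(
    struct.pack('<h', int(10000 * math.sin(2 * math.pi * 440 * n / rate)))
    for n in range(rate)
)

out_path = os.path.join(tempfile.gettempdir(), 'audiotest_raw.wav')
save_pcm(out_path, frames, sample_rate=rate)
```

Alternatively, the bytes returned by `get_wav_data()` could be written directly with a plain `open(path, 'wb')`, without going through the wave module at all.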

Expected behaviour

The written wave file should sound like the original audio source: clean and at the correct tempo.

Actual behaviour

The written wave file sounds choppy and plays way too fast (attached: audiotest.wav.zip).

Recording audio from the device with arecord -D plughw:1,0 -f cd -d 5 alsatest.wav produces a clean result.
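
To compare the two files objectively, the header of each can be read back with the wave module. A minimal sketch follows; `describe_wav` is a hypothetical helper, and the demo file written here is a silent placeholder so the snippet runs on its own. For reference, arecord's `-f cd` format is 16-bit little-endian, 44100 Hz, stereo, so alsatest.wav would be expected to report 2 channels at 44100 Hz, while audiotest.wav should report 1 channel at 16000 Hz.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return the basic header parameters of a WAV file (hypothetical helper)."""
    with wave.open(path, 'rb') as w:
        rate = w.getframerate()
        nframes = w.getnframes()
        return {
            'channels': w.getnchannels(),
            'sample_width': w.getsampwidth(),
            'sample_rate': rate,
            'frames': nframes,
            'duration_s': nframes / rate,
        }


# Placeholder file so the sketch is self-contained: 5 s of silence,
# 16-bit mono at 16 kHz. Replace demo_path with 'audiotest.wav' or
# 'alsatest.wav' to inspect the real recordings.
demo_path = os.path.join(tempfile.gettempdir(), 'describe_demo.wav')
with wave.open(demo_path, 'wb') as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(b'\x00\x00' * 16000 * 5)

info = describe_wav(demo_path)
print(info)
```

If the duration reported for audiotest.wav turns out to be noticeably shorter than the time actually spent recording, that would suggest samples were dropped during capture or that the stored sample rate does not match the rate the device delivered.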

System information


My system is Linux Mint 20.3 Cinnamon.

My Python version is 3.8.10.

My Pip version is 20.0.2.

My SpeechRecognition library version is 3.9.0.

My PyAudio library version is 0.2.13.

My microphones are:

HDA NVidia: HDMI 0 (hw:0,3)
HDA NVidia: HDMI 1 (hw:0,7)
HDA NVidia: HDMI 2 (hw:0,8)
HDA NVidia: HDMI 3 (hw:0,9)
HDA NVidia: HDMI 4 (hw:0,10)
HDA NVidia: HDMI 5 (hw:0,11)
HDA NVidia: HDMI 6 (hw:0,12)
Scarlett 2i2 USB: Audio (hw:1,0)
HD-Audio Generic: ALC1220 Analog (hw:2,0)
HD-Audio Generic: ALC1220 Digital (hw:2,1)
HD-Audio Generic: ALC1220 Alt Analog (hw:2,2)
C922 Pro Stream Webcam: USB Audio (hw:3,0)
hdmi
pulse
default

My working microphones are:

{
  7: 'Scarlett 2i2 USB: Audio (hw:1,0)',
  11: 'C922 Pro Stream Webcam: USB Audio (hw:3,0)',
  13: 'pulse',
  14: 'default'
}
