imayhaveborkedit / discord-ext-voice-recv

Voice receive extension package for discord.py
https://pypi.org/project/discord-ext-voice-recv/
MIT License

How to save voiceData? #21

Open wa008 opened 2 months ago

wa008 commented 2 months ago

I am trying to save the voice data as raw bytes, but I cannot play the resulting file.

code:

import datetime

import discord
from discord.ext import voice_recv

history_audio = b''

def callback_func_demo(user, data):
    global history_audio
    audio = data.pcm  # I use pcm because BasicSink.wants_opus always returns False.
    history_audio += audio
    # Once enough audio has accumulated, dump the buffer to a file.
    if len(history_audio) > 500000:
        now_second = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        _size = len(history_audio)
        print(f'now_second: {now_second}, history_audio: {_size}')
        with open(f'./voice_data/recoding_{now_second}.wav', 'wb') as wav_file:
            wav_file.write(history_audio)
        history_audio = b''

@client.event
async def on_ready():
    print('Logged in as {0.id}/{0}'.format(client.user))
    channel = discord.utils.get(client.get_all_channels(), name='General')
    voice_channel = client.get_channel(channel.id)

    vc = await voice_channel.connect(cls=voice_recv.VoiceRecvClient, timeout=60, reconnect=True)
    sink = voice_recv.BasicSink(callback_func_demo)
    print(sink.__class__.__name__)  # BasicSink
    vc.listen(sink)
    # vc.stop()

saved file:

image

When I play it, the length of the audio is 0.

image

If I check the file:

$ ffprobe -hide_banner -loglevel fatal -show_error -show_format -print_format json ./voice_data/recoding_20240827144134.wav
{
    "error": {
        "code": -1094995529,
        "string": "Invalid data found when processing input"
    }
}
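
The ffprobe error above is what one would expect if the file holds raw PCM samples with no WAV (RIFF) header, since writing the `data.pcm` bytes straight to disk does not add one. A minimal sketch using Python's standard `wave` module follows. The helper name `save_pcm_as_wav` is hypothetical, and the format defaults (48 kHz, 2 channels, 16-bit samples) are an assumption about what `data.pcm` contains, not something confirmed in this thread.

```python
import wave

def save_pcm_as_wav(path, pcm_bytes, sample_rate=48000, channels=2, sample_width=2):
    """Wrap raw PCM bytes in a WAV container so players can read them.

    The defaults are assumptions: 48 kHz, stereo, 2 bytes (16 bits) per sample.
    """
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)       # number of interleaved channels
        wav_file.setsampwidth(sample_width)   # bytes per sample, per channel
        wav_file.setframerate(sample_rate)    # frames per second
        wav_file.writeframes(pcm_bytes)       # the header is written automatically
```

In the callback above, this could stand in for the plain `open(...).write(...)` call, for example `save_pcm_as_wav(f'./voice_data/recoding_{now_second}.wav', history_audio)`.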

Machine info:

      System Version: macOS 14.5 (23F79)
      Kernel Version: Darwin 23.5.0
      Boot Volume: Macintosh HD

wa008 commented 2 months ago

I found that I can save the audio to a file with the code below:

import numpy as np
import scipy.io.wavfile

def callback_func(user, data):
    global history_audio
    audio = data.pcm
    history_audio += audio
    if len(history_audio) > 1000000:
        # Swap the buffer out before processing it.
        tmp_audio = history_audio
        history_audio = b''

        audio_data = np.frombuffer(tmp_audio, dtype=np.int32, offset=0)  # convert the bytes to a NumPy array
        sample_rate = 48000
        scipy.io.wavfile.write('./voice_data/recoding_2.wav', sample_rate, audio_data)  # save file

What surprises me is that I must read the voice data as np.int32 when converting it to a NumPy array, but the discord.py documentation uses np.int16 (16-bit) as the default. This is confusing, unless I am mistaken.
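
One possible explanation for the np.int32 observation, assuming `data.pcm` is 16-bit stereo with interleaved channels: each frame is then 4 bytes (2 bytes x 2 channels), so reading the buffer as 32-bit values yields exactly one value per frame. The duration looks right at 48000 Hz, but each value fuses a left and a right sample. The sketch below separates the channels with the standard `array` module instead. The helper name `split_stereo` is hypothetical, and the stereo layout is an assumption rather than something confirmed in this thread.

```python
from array import array

def split_stereo(pcm_bytes):
    """Split interleaved 16-bit stereo PCM into (left, right) sample arrays.

    Assumes 2 channels and native byte order (little-endian on typical machines).
    """
    samples = array("h")           # 'h' = signed 16-bit integers
    samples.frombytes(pcm_bytes)
    return samples[0::2], samples[1::2]   # even indices: left, odd indices: right

# Two stereo frames: (L=1, R=-1) and (L=2, R=-2)
frames = array("h", [1, -1, 2, -2]).tobytes()
left, right = split_stereo(frames)
print(list(left), list(right))   # [1, 2] [-1, -2]
print(len(frames) // 4)          # 2 frames, at 4 bytes per frame
```

With NumPy, the same idea would be `np.frombuffer(tmp_audio, dtype=np.int16).reshape(-1, 2)`, which gives one row per frame and one column per channel under the same assumption.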