RFC: Voice Receive API Design/Usage

imayhaveborkedit commented 6 years ago

As I have progressed through writing and redesigning this feature a few times, Danny and I have come to a conclusion regarding the inclusion of voice receive in discord.py. Discord considers voice receive a second-class citizen as a feature and will likely never officially support or document it. With no such guarantees, all development is based on reverse engineering and is liable to be broken by Discord at any point.

The conclusion is that voice receive as a discord bot feature does not belong in the library. However, the alternative is to simply use an extension module to implement it. See https://github.com/Rapptz/discord.py/pull/9288#issuecomment-1785942942 for more details.

This is exactly what I've been working on. https://github.com/imayhaveborkedit/discord-ext-voice-recv/

The foundational work is largely complete and the code is functional, but as stated in the readme it's not quite finished, not guaranteed stable, and subject to change. Basic documentation is done, but more comprehensive docs and examples are on the todo list. It also requires v2.4 of discord.py (currently the master branch), which is not yet released on PyPI at the time of writing.

Old issue content

This information is technically outdated, but a large amount of the design still applies.

---------------

### Note: DO NOT use this in production. The code is messy (and possibly broken) and probably filled with debug prints. Use only with the intent to experiment or give feedback, although almost everything in the code is subject to change.

Behold the voice receive RFC. This is where I ask for design suggestions and feedback. Unfortunately not many people seem to have any idea of what their ideal voice receive api would look like so it falls to me to come up with everything. Should anyone have any questions/comments/concerns/complaints/demands please post them here. I will be posting the tentative design components here for feedback and will update them occasionally.

For more detailed information on my progress see [the project on my fork](https://github.com/imayhaveborkedit/discord.py/projects/1). I will also be adding an example soonish.

## Overview

The main concept behind my voice receive design is to mirror the voice send api as much as possible. However, due to receive being more complex than send, I've had to take some liberties in creating some new concepts and functionality for the more complex parts.

The basic usage should be relatively familiar:

```py
vc = await channel.connect()
vc.listen(MySink())
```

The voice send api calls an object that produces PCM packets a `Source`, whereas the receive api refers to them as a `Sink`. Sources have a `read()` function that produces PCM packets, so Sinks have a `write(data)` function that does something with PCM packets. Sinks can also optionally accept opus data to bypass the decoding stage if you so desire.

The signature of the `write(data)` function is currently just a payload blob with the opus data, pcm data, and rtp packet, mostly for my own convenience during development. This is subject to change later on.
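
To make the `Source`/`Sink` symmetry concrete, here is a minimal sketch of what a custom sink could look like. It does not use the fork at all: `VoiceData`, `Sink`, and `PerUserBufferSink` are hypothetical stand-ins written for illustration, and the `user` field on the payload is an assumption (the RFC only says the payload carries opus data, PCM data, and the RTP packet).

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceData:
    """Hypothetical payload handed to a sink; not the fork's real type."""
    user: str                     # assumed: some identifier for the speaker
    pcm: bytes                    # decoded PCM for one frame
    opus: Optional[bytes] = None  # raw opus frame, if the sink wants it


class Sink(ABC):
    """Receive-side counterpart of a Source: consumes data via write()."""

    @abstractmethod
    def write(self, data: VoiceData) -> None:
        ...


class PerUserBufferSink(Sink):
    """Collects PCM in memory, one growing buffer per speaker."""

    def __init__(self) -> None:
        self.buffers = defaultdict(bytearray)

    def write(self, data: VoiceData) -> None:
        self.buffers[data.user] += data.pcm


sink = PerUserBufferSink()
sink.write(VoiceData(user="alice", pcm=b"\x01\x00"))
sink.write(VoiceData(user="bob", pcm=b"\x02\x00"))
sink.write(VoiceData(user="alice", pcm=b"\x03\x00"))
```

Keeping one buffer per speaker sidesteps the stream synchronization problem, at the cost of not producing a single combined recording.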
The new VoiceClient functions are basically the same as the send variants, with `listen()` being the new counterpart to `play()`.

> Note: The `stop()` function has been changed to stop both playing **and** listening. I have added `stop_playing()` and `stop_listening()` for individual control.

## Built in Sinks

For simply saving voice data to a file, you can use the built in `WaveSink` to write them to a wav file. The way I have this currently implemented, however, is completely broken for more than one user.

> Note: Here lies my biggest problem. I currently do not have any way to combine multiple voice "streams" into one stream. The way this works is Discord sends packets for all users on the same socket, differentiated by an id (aka ssrc, from RTP spec). These packets have timestamps, but with a random start offset, per ssrc. RTP has a mechanism where the reference time is sent in a control packet, but as far as I can tell, Discord *doesn't send these control packets*. As such, I have no way of properly synchronizing streams without excessive guesswork based on arrival time in the socket (unreliable at best). Until I can solve this there will be a few holes in the design, for example, how to record the whole conversation in a voice channel instead of individual users.

Sinks can be composed much like Sources can (PCMVolumeTransformer+FFmpegPCMAudio, etc). I will have some built in sinks for handling various control actions, such as filtering by user or predicate.

```py
# only listen to message.author
vc.listen(UserFilter(MySink(), message.author))

# listen for 10 seconds
vc.listen(TimedFilter(MySink(), 10))

# arbitrary predicate, could check flags, permissions, etc
vc.listen(ConditionalFilter(MySink(), lambda data: ...))
```

and so forth. As usual, these are subject to change when I go over this part of the design again.

> As mentioned before, mixing is still my largest unsolved problem.
> Combining all voice data in a channel into one stream is surely a common use case, and I'll do my best to try and figure out a solution, but I can't promise anything yet. If it turns out that my solution is too hacky, I might have to put it in some ext package on pypi (see: ext.colors).

For volume control, I recently found that libopus has a gain setting in the decoder. This is probably faster and more accurate than altering pcm packets after they've been decoded. Unfortunately, I haven't quite figured out how to expose this setting yet, so I don't have any public api to show for it.

That should account for most of the public api part that I've designed so far. I still have a lot of miscellaneous things to do so no ETA. Again, if you have any feedback whatsoever please make yourself known either here or in the discord server.
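
For reference, the arithmetic part of mixing is simple once frames are aligned. The sketch below sums signed 16-bit little-endian PCM frames sample by sample and clips the result. It assumes the frames passed in already cover the same time window, which is exactly the unsolved synchronization problem described above, so this is not a solution to it. `mix_pcm_frames` is a hypothetical helper, not part of any fork.

```python
import array
import sys


def mix_pcm_frames(frames):
    """Mix equal-length s16le PCM frames by summing samples and clipping."""
    if not frames:
        return b""
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same length")

    decoded = []
    for frame in frames:
        samples = array.array("h")
        samples.frombytes(frame)
        if sys.byteorder == "big":
            samples.byteswap()  # the input is little-endian
        decoded.append(samples)

    # Sum each column of samples and clamp to the int16 range.
    mixed = array.array(
        "h",
        (max(-32768, min(32767, sum(column))) for column in zip(*decoded)),
    )
    if sys.byteorder == "big":
        mixed.byteswap()
    return mixed.tobytes()
```

Hard clipping is the crudest way to handle overflow; a real mixer would more likely attenuate or apply a limiter.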

Youareyou64 commented 3 years ago

I apologize in advance for not knowing this; I'm fairly new to git and to how development of new features in libraries works. I know that this isn't implemented in the full code yet, but I'd really like to test some stuff out with it. How can I actually get this into my code? I know it's not an actual library, so I'm a bit confused. If it's easier to just link to an article or something that explains this, feel free to. I really don't want to be that annoying person who doesn't do their own research, but believe me, I've tried.

NormHarrison commented 3 years ago

@Youareyou64 It is essentially the same as the regular discord.py library, except with additions from imayhaveborkedit related to receiving audio data from voice channels. You can download it in several ways (git clone, downloading the repository as a standard ZIP directly from GitHub, etc.) and then run the setup.py file to install it just like any other Python module/library. Assuming you already have the standard/official discord.py installed, though, you will probably want to set up a Python virtual environment for your project (some IDEs have this built in, I believe) to prevent this variant from affecting other projects that don't make use of it. After that, you import it the same way as the regular library (import discord) and you then have access to the methods and objects specific to this fork.

Since changes in the API have occurred on Discord's end, though, there are at least two things that you will need to change internally before this fork will be usable again: the first is this, and the second is the problem that I encountered directly above your post regarding a change in the returned port number. Hopefully this helps.

Youareyou64 commented 3 years ago

@NormHarrison Ah, I see, thank you. Just one quick question, what I'm commenting under now is obviously an Issue, with no files/code attached. Is there a fork or something I should be looking in where the code is available for download? Apologies if I'm mixing up terms somewhere or missing something.

SebbyLaw commented 3 years ago

@Youareyou64 At the top of the issue, there is a link to the forked repository: https://github.com/imayhaveborkedit/discord.py

Youareyou64 commented 3 years ago

Welp, I've been trying to get this to work but so many things just keep going wrong. You don't happen to know of any already merged ones that I can use, right?

Gorialis commented 3 years ago

The branch hasn't been updated for a while, pending a redesign. I manually made all of the voice recv related changes over master a while back, and I've cherry picked these changes to be up to date with the current master at https://github.com/Gorialis/discord.py/tree/voice-recv-mk3

You can install it directly using:

pip install -U "discord.py[voice] @ git+https://github.com/Gorialis/discord.py@voice-recv-mk3"

Consistency and production warnings still apply. This branch is likely to spam your console while in use, and things may (and probably will) go wrong (and there are plenty of known bugs at the moment).

I intend to at some point work out voice ws flow myself and try to see if I can breathe life back into this project, perhaps looking at other libs that have implemented voice receive successfully for inspiration regarding sane frontends. No ETA on that yet, though.

RemiZacharias commented 3 years ago

I would like to provide some examples I made using this API. First, a record command which records a user's voice through the API.

@bot.command()
async def record(ctx: commands.Context, time: FutureTime, me_only: bool):
    global number
    if not ctx.voice_client:
        await ctx.author.voice.channel.connect()
    wave_file = waves_folder / waves_file_format.format(number)
    wave_file.touch()
    fp = wave_file.open('rb')
    if me_only:
        ctx.voice_client.listen(discord.UserFilter(discord.WaveSink(str(wave_file)), ctx.author))
    else:
        ctx.voice_client.listen(discord.WaveSink(str(wave_file)))
    await discord.utils.sleep_until(time.dt)
    ctx.voice_client.stop_listening()
    # print(discord.File(fp, filename='record.wav'))
    await ctx.send("Recording being sent. Please wait!")
    await ctx.send('Here\'s, your record file.', file=discord.File(fp, filename=str(wave_file.name)))
    number += 1

The command is not that great, but it works. I'll keep this updated as much as possible. Next is a text-to-speech command, which uses gTTS and pydub.

@bot.command(aliases=['tts'])
async def text_to_speech(ctx: commands.Context, lang: Optional[str] = None, *, message: str):
    global number
    if not ctx.voice_client:
        await ctx.author.voice.channel.connect()
    tts_file = tts_folder / tts_file_format.format(ctx.author, number)
    gtts.gTTS(message, lang=lang).save(str(tts_file))
    tts_file_wav = tts_file.with_suffix('.wav')
    pydub.AudioSegment.from_mp3(tts_file).export(tts_file_wav, format='wav')

    if not ctx.voice_client.is_playing():
        ctx.voice_client.play(discord.FFmpegPCMAudio(str(tts_file_wav)))
    number += 1

It's not that great; I'm still figuring out a better way to do this, and any help is welcome. Next is speech-to-text, which uses SpeechRecognition.

@bot.command(aliases=['stt'])
async def speech_to_text(ctx: commands.Context, time: FutureTime, me_only: bool = True):
    global number
    if not ctx.voice_client:
        await ctx.author.voice.channel.connect()
    sr_file = sr_folder / sr_file_format.format(ctx.author, number)
    sr_file.touch()
    fp = sr_file.open('rb')
    if me_only:
        ctx.voice_client.listen(discord.UserFilter(discord.WaveSink(str(sr_file)), ctx.author))
    else:
        ctx.voice_client.listen(discord.WaveSink(str(sr_file)))
    await discord.utils.sleep_until(time.dt)
    ctx.voice_client.stop_listening()
    await ctx.send("Recognizing your voice, please wait!")
    recognizer = speech_recognition.Recognizer()
    with speech_recognition.AudioFile(fp) as source:
        sr_audio_data = recognizer.record(source)
    # print(recognizer.recognize_google(sr_audio_data, language='en-US'))
    await ctx.send("I think this is right, maybe, \n Here's your Speech-To-Text \n > " + recognizer.recognize_google(sr_audio_data, language='en-US'))
    number += 1

Well, recognition is bad. I'll provide some pictures showing the commands' usage. The snippets above are also incomplete, so here is the extra code that makes all 3 commands work correctly without changing a single line of code.

number_txt_file = Path.cwd() / 'number.txt'
number_txt_file.touch(exist_ok=True)
number = int(number_txt_file.open('r').read() or 0)
waves_folder = (Path.cwd() / 'recordings')
waves_file_format = "recording{}.wav"
waves_folder.mkdir(parents=True, exist_ok=True)
tts_folder = (Path.cwd() / 'tts')
tts_folder.mkdir(parents=True, exist_ok=True)
tts_file_format = "tts{}{}.mp3"
sr_folder = (Path.cwd() / 'sr')
sr_folder.mkdir(parents=True, exist_ok=True)
sr_file_format = "sr{}{}.wav"

This creates the directories and files used to save the recordings. The recordings are deleted once there are more than 10 recording files, to save some space and to respect users' privacy.

@tasks.loop(seconds=4)
async def save_number_loop():
    global number
    with number_txt_file.open('w') as fp:
        fp.write(str(number))
    if len(list(waves_folder.iterdir())) > 10:
        print("Deleting recording files as the recording file's count got above 10.")
        for item in waves_folder.iterdir():
            # print(item)
            item.unlink()
        number = 0

This deletes the recordings and also writes the number to disk, so that it is preserved if the bot is restarted. I might write a voice chat bot if I figure out how to check whether the user has stopped speaking and how to get the voice data they spoke.
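
One possible way to approach the "has the user stopped speaking" check is a simple energy threshold on the decoded PCM. The sketch below is only an illustration: `frame_rms` and `SilenceDetector` are hypothetical names, the threshold is arbitrary, and it assumes 16-bit little-endian PCM frames of about 20 ms each (so 50 quiet frames is roughly one second). Whether silence even arrives as quiet frames, rather than as a gap in packets, depends on the receive implementation.

```python
import array
import math
import sys


def frame_rms(pcm):
    """Root-mean-square amplitude of one s16le PCM frame."""
    samples = array.array("h")
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class SilenceDetector:
    """Reports when a speaker has been quiet for a run of frames."""

    def __init__(self, threshold=500.0, max_silent_frames=50):
        self.threshold = threshold
        self.max_silent_frames = max_silent_frames
        self.silent_frames = 0
        self.heard_speech = False

    def feed(self, pcm):
        """Feed one frame; return True once speech was heard and then stopped."""
        if frame_rms(pcm) >= self.threshold:
            self.silent_frames = 0
            self.heard_speech = True
            return False
        self.silent_frames += 1
        return self.heard_speech and self.silent_frames >= self.max_silent_frames
```

A custom sink could call `feed()` from its `write()` method and stop listening once it returns `True`.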

> Excuse me being dumb but this code actually works and can actually listen to users? GG.

Yes, it works without issues and without any changes, as long as your code looks like this:

from pathlib import Path
from typing import Optional

import discord
import gtts
import pydub
import speech_recognition
from discord.ext import commands, tasks

from utils.time import FutureTime

# Build the DLL path with pathlib instead of a backslash inside a normal string.
discord.opus.load_opus(str(Path.cwd() / "waves" / "libopus-0.x64.dll"))
# print(discord.opus.is_loaded())
# print(Path.cwd() / 'waves')
import logging

logger = logging.getLogger('discord')
logger.setLevel(logging.DEBUG)
handler = logging.FileHandler(filename='discord.log', encoding='utf-8', mode='w')
handler.setFormatter(logging.Formatter('%(asctime)s:%(levelname)s:%(name)s: %(message)s'))
logger.addHandler(handler)
bot = commands.Bot('.')
number_txt_file = Path.cwd() / 'number.txt'
number_txt_file.touch(exist_ok=True)
number = int(number_txt_file.open('r').read() or 0)
waves_folder = (Path.cwd() / 'recordings')
waves_file_format = "recording{}.wav"
waves_folder.mkdir(parents=True, exist_ok=True)
tts_folder = (Path.cwd() / 'tts')
tts_folder.mkdir(parents=True, exist_ok=True)
tts_file_format = "tts{}{}.mp3"
sr_folder = (Path.cwd() / 'sr')
sr_folder.mkdir(parents=True, exist_ok=True)
sr_file_format = "sr{}{}.wav"

@bot.event
async def on_ready():
    print('Running bot')
    print(bot.user.id)
    print(bot.user)

async def ensure_voice(ctx):
    if not ctx.author.voice:
        # "First join a Voice Channel, you man!"
        await ctx.send("First join a Voice Channel, you man!")
        raise Exception

@bot.command()
@commands.before_invoke(ensure_voice)
async def record(ctx: commands.Context, time: FutureTime, me_only: bool):
    global number
    if not ctx.voice_client:
        await ctx.author.voice.channel.connect()
    wave_file = waves_folder / waves_file_format.format(number)
    wave_file.touch()
    fp = wave_file.open('rb')
    if me_only:
        ctx.voice_client.listen(discord.UserFilter(discord.WaveSink(str(wave_file)), ctx.author))
    else:
        ctx.voice_client.listen(discord.WaveSink(str(wave_file)))
    await discord.utils.sleep_until(time.dt)
    ctx.voice_client.stop_listening()
    # print(discord.File(fp, filename='record.wav'))
    await ctx.send("Recording being sent. Please wait!")
    await ctx.send('Here\'s, your record file.', file=discord.File(fp, filename=str(wave_file.name)))
    number += 1

# @bot.event
# async def on_command_error(ctx, error):
#     if hasattr(error, 'original'):
#         error = error.original
#     if isinstance(error, NotImplementedError):
#         await ctx.send(error)

@bot.command()
async def test_send_music_api(ctx: commands.Context, wav_file):
    if not ctx.voice_client:
        await ctx.author.voice.channel.connect()
    if not ctx.voice_client.is_playing():
        ctx.voice_client.play(discord.FFmpegPCMAudio('waves/{}'.format(wav_file)))

@bot.command(aliases=['tts'])
async def text_to_speech(ctx: commands.Context, lang: Optional[str] = None, *, message: str):
    global number
    if not ctx.voice_client:
        await ctx.author.voice.channel.connect()
    tts_file = tts_folder / tts_file_format.format(ctx.author, number)
    gtts.gTTS(message, lang=lang).save(str(tts_file))
    tts_file_wav = tts_file.with_suffix('.wav')
    pydub.AudioSegment.from_mp3(tts_file).export(tts_file_wav, format='wav')

    if not ctx.voice_client.is_playing():
        ctx.voice_client.play(discord.FFmpegPCMAudio(str(tts_file_wav)))
    number += 1

@bot.command(aliases=['stt'])
async def speech_to_text(ctx: commands.Context, time: FutureTime, me_only: bool = True):
    global number
    if not ctx.voice_client:
        await ctx.author.voice.channel.connect()
    sr_file = sr_folder / sr_file_format.format(ctx.author, number)
    sr_file.touch()
    fp = sr_file.open('rb')
    if me_only:
        ctx.voice_client.listen(discord.UserFilter(discord.WaveSink(str(sr_file)), ctx.author))
    else:
        ctx.voice_client.listen(discord.WaveSink(str(sr_file)))
    await discord.utils.sleep_until(time.dt)
    ctx.voice_client.stop_listening()
    await ctx.send("Recognizing your voice, please wait!")
    recognizer = speech_recognition.Recognizer()
    with speech_recognition.AudioFile(fp) as source:
        sr_audio_data = recognizer.record(source)
    # print(recognizer.recognize_google(sr_audio_data, language='en-US'))
    await ctx.send("I think this is right, maybe, \n Here's your Speech-To-Text \n > " + recognizer.recognize_google(sr_audio_data, language='en-US'))
    number += 1

@tasks.loop(seconds=4)
async def save_number_loop():
    global number
    with number_txt_file.open('w') as fp:
        fp.write(str(number))
    if len(list(waves_folder.iterdir())) > 10:
        print("Deleting recording files as the recording file's count got above 10.")
        for item in waves_folder.iterdir():
            # print(item)
            item.unlink()
        number = 0

save_number_loop.start()
bot.run('TOKEN_HERE')

I get an error from the opus load. How do I fix that?

Youareyou64 commented 3 years ago

All of the examples that I see here use from utils.time import FutureTime, which I'm assuming is a local import. Does anyone have an example of what that FutureTime module (apologies if module isn't the correct term) should contain?
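
The thread never shows the original `utils.time` module, so the following is only a hypothetical stand-in that would satisfy the way the examples use it: something constructed from the command argument that exposes a `dt` attribute to pass to `discord.utils.sleep_until`. The accepted format ("10s", "5m", "1h30m") is an assumption made for this sketch, not the original behaviour.

```python
import datetime
import re


class FutureTime:
    """Hypothetical stand-in: turns '90s', '5m' or '1h30m' into a UTC datetime."""

    _pattern = re.compile(
        r"(?:(?P<hours>\d+)h)?(?:(?P<minutes>\d+)m)?(?:(?P<seconds>\d+)s)?"
    )

    def __init__(self, argument, *, now=None):
        match = self._pattern.fullmatch(argument.strip().lower())
        if match is None or not match.group(0):
            raise ValueError(f"could not parse a duration from {argument!r}")
        # Groups that did not match default to "0".
        parts = {name: int(value) for name, value in match.groupdict(default="0").items()}
        now = now or datetime.datetime.now(datetime.timezone.utc)
        # `dt` is the attribute the commands in this thread read.
        self.dt = now + datetime.timedelta(**parts)
```

Saved as `utils/time.py` next to the bot, this would make `from utils.time import FutureTime` resolve; a command argument such as `10s` would then mean "ten seconds from now".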

xSavgs commented 3 years ago

Is there a link to download one of these "working" scripts along with opus to test this? I've been keen to try this for a while now.

RedKinda commented 3 years ago

> Is there a link to download one of these "working" scripts along with opus to test this, I've been keen to try this for a while now

https://github.com/Rapptz/discord.py/issues/1094#issuecomment-715657628

Multivalence commented 3 years ago

I've tried this. No errors but it creates a wav file that contains no audio. (All of this code is within a command)

vc = ctx.voice_client
vc.listen(discord.UserFilter(discord.WaveSink('file.wav'), ctx.author))
await asyncio.sleep(10)
vc.stop_listening()

FinThor commented 3 years ago

> I've tried this. No errors but it creates a wav file that contains no audio. (All of this code is within a command)
>
> vc = ctx.voice_client
> vc.listen(discord.UserFilter(discord.WaveSink('file.wav'), ctx.author))
> await asyncio.sleep(10)
> vc.stop_listening()

Exact same for me. I noticed it happens only when running in the background without an interactive connection (python3.8 bot.py & and closing the terminal).

Edit: When running with nohup it works fine; it probably needs somewhere to send stdout/stderr (?)

xSavgs commented 3 years ago

I saw this on r/python. It led to this, some sort of tutorial on how to use it, but the interesting part was this.

It uses PCM audio. Isn't that what the sink for the Discord vc uses? Could we use this Picovoice API to decode the PCM packets into audio? The guy shows live footage of himself speaking into a microphone and it working here. It seems very much like an Alexa, for example, and it's fast.

Edit: Here's the github page

NormHarrison commented 3 years ago

PCM is already a raw, un-encoded representation of audio, so there isn't anything to decode like there would be with Opus for example (what this library is already decoding from, and turning into PCM). You can make a custom sink and take the direct stream of PCM data being given to you upon every call to the write method and place it into a file opened for writing. If you take that file and import it as raw data in Audacity for example (selecting the correct sample rate etc.) it can be played back as normal. This is what the pre-made wav sink does already, with the addition of the wav file header so media players know how to handle the file properly. That does look like a neat project though, and you definitely could use this library as an audio source for it.
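
As a rough illustration of "raw PCM plus a header", the standard library `wave` module can wrap a PCM buffer for you. In this sketch `pcm_to_wav_bytes` is a hypothetical helper, and the defaults (48 kHz, 2 channels, 16-bit samples) are assumptions matching the PCM format discord.py uses for playback; adjust them if the data you receive differs.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, channels=2, sample_width=2, rate=48000):
    """Wrap raw PCM bytes in a WAV container and return the file contents."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample: 2 means 16-bit
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


# One stereo frame of silence (4 bytes) becomes a small but valid WAV file.
wav_bytes = pcm_to_wav_bytes(bytes(4))
```

The only difference between the raw dump and the playable file is the header that `wave` prepends, which is why importing the raw data into Audacity with the right settings works just as well.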

IgnacyFluder commented 3 years ago

Why am I getting this error? `TypeError: __new__() got an unexpected keyword argument 'deny_new'`

DaRealCodeWritten commented 3 years ago

Update your discord.py, that error usually means your dpy is outdated

> Why am I getting this error `TypeError: __new__() got an unexpected keyword argument 'deny_new'`

Also, don't post in this thread; make a new one.

IgnacyFluder commented 3 years ago

No I have the fork. Version: 1.3.0a2187+g0e06168

DaRealCodeWritten commented 3 years ago

Did I not just say stop posting on this thread? Also, which fork are you on?

IgnacyFluder commented 3 years ago

> Did i not just say stop posting on this thread, also which fork are you on

This one: https://github.com/imayhaveborkedit/discord.py

CorentinJ commented 3 years ago

> I've tried this. No errors but it creates a wav file that contains no audio. (All of this code is within a command)
>
> vc = ctx.voice_client
> vc.listen(discord.UserFilter(discord.WaveSink('file.wav'), ctx.author))
> await asyncio.sleep(10)
> vc.stop_listening()
>
> Exact same for me, Noticed it happens only when running in the background without an interactive connection (python3.8 bot.py & and closing the terminal)
>
> Edit: When running with nohup it works fine, probably needs somewhere to throw the stdout\err to (?)

I am getting this as well, but it's not consistent. Either the file is empty (0 bytes), or it is valid (but remains in use by the process until killed). I'm on Windows, however.

If trying to forcibly close the file after the recording, the RIFF header is there but there is no audio content (44 bytes).
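
The 44-byte case is a file that contains only the WAV header and zero audio frames. A quick way to tell the failure modes apart is to ask the standard `wave` module how many frames a file holds. `wav_frame_count` below is a hypothetical diagnostic helper, not something from the fork.

```python
import io
import wave


def wav_frame_count(source):
    """Return (frames, seconds) for a WAV path or file object.

    Returns None when the data is not a readable WAV at all,
    for example a 0-byte file.
    """
    try:
        with wave.open(source, "rb") as wav:
            frames = wav.getnframes()
            return frames, frames / wav.getframerate()
    except (EOFError, wave.Error):
        return None


# Build a header-only WAV in memory to mimic the 44-byte file.
header_only = io.BytesIO()
with wave.open(header_only, "wb") as out:
    out.setnchannels(2)
    out.setsampwidth(2)
    out.setframerate(48000)
header_only.seek(0)
```

With a path instead of a file object, `wav_frame_count("file.wav")` distinguishes "not a WAV" (`None`) from "valid header, no audio" (`(0, 0.0)`).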

LeadFreeCandy commented 3 years ago

I fixed this issue by ensuring that opus was loaded.

discord.opus.load_opus("C:\\Users\\Samir\\AppData\\Local\\Programs\\Python\\Python38\\Lib\\site-packages\\discord\\bin\\libopus-0.x64.dll")
print(discord.opus.is_loaded())
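
A hardcoded absolute DLL path only works on one machine. As a small sketch, assuming the opus library is installed somewhere the system loader can find it, the standard `ctypes.util.find_library` can look it up by name. `find_opus_library` and its candidate names are hypothetical; the returned value is what you would pass to `discord.opus.load_opus`.

```python
import ctypes.util


def find_opus_library(candidates=("opus", "libopus-0")):
    """Return the first opus library name or path the system loader can find."""
    for name in candidates:
        path = ctypes.util.find_library(name)
        if path:
            return path
    return None


opus_path = find_opus_library()
if opus_path is None:
    print("opus not found; pass an explicit path to discord.opus.load_opus instead")
else:
    print(f"found opus at {opus_path}")
    # discord.opus.load_opus(opus_path)  # then confirm with discord.opus.is_loaded()
```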

DjarDjar commented 3 years ago

Does this still get updated? Sorry I am really confused. Could someone show me an example of how to use one of these forks just to join and record the audio from a voice chat. I am new to discord.py so sorry if I am misleading or something.

pikaninja commented 3 years ago

There's #6507 for voice receive

atbuy commented 3 years ago

Where does FutureTime come from? How do I import it if utils is a local file?

xNul commented 3 years ago

> I've tried this. No errors but it creates a wav file that contains no audio. (All of this code is within a command)
>
> vc = ctx.voice_client
> vc.listen(discord.UserFilter(discord.WaveSink('file.wav'), ctx.author))
> await asyncio.sleep(10)
> vc.stop_listening()
>
> Exact same for me, Noticed it happens only when running in the background without an interactive connection (python3.8 bot.py & and closing the terminal) Edit: When running with nohup it works fine, probably needs somewhere to throw the stdout\err to (?)
>
> I am getting this as well, but it's not consistent... Either the file is empty (0 bytes), either it is valid (but remains used by the process until killed). I'm on windows however.
>
> If trying to forcibly close the file after the recording, the RIFF header is there but there is no audio content (44 bytes).

I'm getting this issue too, except that in every case I've seen, the wav file is empty. I'm on Windows too. With

ctx.voice_client.listen(discord.WaveSink(str(sr_file)))

some audio is actually written, however, it's very quiet and it fades out very quickly even though I'm not changing my speaking volume in Discord. It's too quiet for the speech recognition to pick up as well.

xNul commented 3 years ago

Turned out I had an odd microphone issue and audio wasn't being inputted to Discord properly. Restarting my computer fixed it.

WieeRd commented 3 years ago

Is this ever making it to the master branch, or is it an abandoned project?

Coddo-Python commented 3 years ago

> Is this ever making it to the master branch or an abandoned project?

I think it has been abandoned. However, another user who goes by the name Gorialis made a fork of the voice receive fork and updated it to a 2020 version of discord.py. So maybe you could try to use that and make some changes for it to be compatible with the 2021 discord.py. The fork that he posted is here: https://github.com/Gorialis/discord.py/tree/voice-recv-mk3

Nikoscocos commented 3 years ago

`discord.ext.commands.errors.CommandInvokeError: Command raised an exception: AttributeError: 'VoiceClient' object has no attribute 'listen'` What?

WieeRd commented 3 years ago

@Nikoscocos This 'listen' feature is not in the official version of discord.py.
Did you install one of the forks made by users, or are you just using default discord.py?

Nikoscocos commented 3 years ago

> @Nikoscocos This 'listen' feature is not in the official version of discord.py. Did you install forks made by users or just using default discord.py?

How do I do that? I'm new to GitHub and I'm from Russia.

SNOWWORLD-star commented 3 years ago

Hello, I am from Russia too. Here is my code:

`import discord
import asyncio

class Bot(discord.Client):
    async def on_ready(self):
        channel = await client.fetch_channel('channel')
        vc = await channel.connect()
        print(f'bot joined {channel}')
        vc.listen(discord.UserFilter(discord.WaveSink('/home/vladislav/voice.wav'), user='user'))
        await asyncio.sleep(10)
        vc.stop_listening()

client = Bot()
client.run('my_token')`

In the end I get an empty file and this error: malloc(): corrupted top size Aborted (core dumped). I've already tried discord.py-voice-recv-mk2 and discord.py-voice-recv-mk3.zip. Help pls.

Revisto commented 3 years ago

> Hello i am from Russia too Here is my code:
> `import discord
> import asyncio
>
> class Bot(discord.Client):
>     async def on_ready(self):
>         channel = await client.fetch_channel('channel')
>         vc = await channel.connect()
>         print(f'bot joined {channel}')
>         vc.listen(discord.UserFilter(discord.WaveSink('/home/vladislav/voice.wav'), user='user'))
>         await asyncio.sleep(10)
>         vc.stop_listening()
>
> client = Bot()
> client.run('my_token')`
> Finally i have empty file and error: malloc(): corrupted top size Aborted(code dumped) I've already tried discord.py-voice-recv-mk2 and discord.py-voice-recv-mk3.zip Help pls

Hi, I have the same problem. Can anyone help?

blastbeng commented 2 years ago

I've updated a fork of https://github.com/Gorialis/discord.py/tree/voice-recv-mk3 to the latest discord.py changes. The client is working well; you can see the repo here: https://github.com/blastbeng/discord.py/tree/voice-recv-mk3

But there's something wrong with memory allocation. I've debugged a bit, but I'm not really an expert on this type of error:

python: malloc.c:2379: sysmalloc: Assertion (old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.

The problem is inside reader.py (reader.AudioReader._decrypt_rtp_xsalsa20_poly1305_lite)

I'm using this gist to set up my bot:

 https://gist.github.com/dpy-manager-bot/fbf9e5233ac9d80e82f8968dad73b0fa

Anyone got any idea?
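
This does not diagnose the allocator assertion, but for anyone stepping through that function it may help to see the packet layout it has to handle. The sketch below assumes the receive side mirrors how discord.py builds `xsalsa20_poly1305_lite` packets when sending: a 12-byte RTP header, then the ciphertext, then a 4-byte nonce that is zero-padded to 24 bytes before decryption. `parse_rtp_header` and `split_lite_packet` are hypothetical helpers, not the fork's code; CSRC entries and RTP header extensions are ignored, and no decryption is performed.

```python
import struct
from typing import NamedTuple

# Fixed RTP header: flags, payload type, sequence, timestamp, ssrc (12 bytes).
RTP_HEADER = struct.Struct(">BBHII")


class RtpHeader(NamedTuple):
    version: int
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed 12-byte RTP header at the start of a packet."""
    if len(packet) < RTP_HEADER.size:
        raise ValueError("packet is shorter than an RTP header")
    first, second, sequence, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    return RtpHeader(first >> 6, second & 0x7F, sequence, timestamp, ssrc)


def split_lite_packet(packet: bytes):
    """Split a packet into (header, 24-byte nonce, ciphertext).

    Assumed layout: header | ciphertext | 4-byte nonce.
    """
    header = parse_rtp_header(packet)
    if len(packet) < RTP_HEADER.size + 4:
        raise ValueError("packet is too short to carry a lite nonce")
    nonce = packet[-4:] + bytes(20)  # pad the 4-byte nonce to 24 bytes
    ciphertext = packet[RTP_HEADER.size:-4]
    return header, nonce, ciphertext
```

The `ssrc` field is the per-user stream id mentioned in the opening post, and `timestamp` is the value that starts at a random offset for each stream.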

jacksonthall22 commented 1 year ago

> Hello i am from Russia too Here is my code:
> `import discord
> import asyncio
>
> class Bot(discord.Client):
>     async def on_ready(self):
>         channel = await client.fetch_channel('channel')
>         vc = await channel.connect()
>         print(f'bot joined {channel}')
>         vc.listen(discord.UserFilter(discord.WaveSink('/home/vladislav/voice.wav'), user='user'))
>         await asyncio.sleep(10)
>         vc.stop_listening()
>
> client = Bot()
> client.run('my_token')`
> Finally i have empty file and error: malloc(): corrupted top size Aborted(code dumped) I've already tried discord.py-voice-recv-mk2 and discord.py-voice-recv-mk3.zip Help pls

Did you ever get this working?

Jourdelune commented 1 year ago

Here is a link to a fork of discord.py with channel recording: https://github.com/Interaction-Bot/discord.py. I copied the code from an old commit. An example: https://github.com/Interaction-Bot/discord.py/blob/master/examples/voice_recording.py

blastbeng commented 1 year ago

> Here is a link of a fork of discord.py with channel recording: https://github.com/Interaction-Bot/discord.py. I have copy the code of a old commit. A example: https://github.com/Interaction-Bot/discord.py/blob/master/examples/voice_recording.py

Is it working well? Have you already asked for a merge on https://github.com/Rapptz/discord.py ?

Jourdelune commented 1 year ago

There is a pull request to add voice recording: https://github.com/Rapptz/discord.py/pull/9288. You can use it now (not for production).

mikeshardmind commented 1 year ago

> There is a pull request to had voice recording: #9288. You can use it now :p

I'd recommend that people not do so except to help review and improve it. It isn't ready for production use. There's a reason the PR has as many comments on it as it does. The author of that PR has done a lot of really good work that isn't easy to do well (design considerations, plus Discord not following spec or providing docs for what they do out of spec), and people are continually adding feedback to make it more user-friendly and to reduce any potential for issues.

Jourdelune commented 1 year ago

I have updated my comment to be more clear.

imayhaveborkedit commented 9 months ago

I have updated the OP with the current state of the voice receive feature. It will not be added directly to the library but exist externally as an extension module. See: https://github.com/imayhaveborkedit/discord-ext-voice-recv/
