hikari-py / hikari

A Discord API wrapper for Python and asyncio built on good intentions.
https://docs.hikari-py.dev/en/stable
MIT License

Support sending voice messages #1780

Open davfsa opened 10 months ago

davfsa commented 10 months ago

Summary

Add support for setting and sending waveform in the payload when creating a message.

It should also be documented how to calculate the waveform (https://github.com/discord/discord-api-docs/pull/6082#discussion_r1167345398)

Ideal implementation

Add support for a waveform parameter and send it along with message create requests.

Usage example:

await bot.rest.create_message(
    attachment=hikari.File("voice.mp3"),
    waveform=calculated_waveform,  # or just ""
    flags=hikari.MessageFlag.IS_VOICE_MESSAGE
)


hypergonial commented 9 months ago

I'm pretty sure the waveform= kwarg should be added to attachments, unless I'm mistaken.

Also, based on this, should waveform= be added to all hikari resource base types? Or should a new base type be created?

PythonTryHard commented 6 months ago

I vouch for creating a new base type, something like VoiceMessage. Adding waveform= to all hikari resource bases sounds...iffy. I can't describe why, but intuition says it's iffy.
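To make the "new base type" idea a bit more concrete, here is a rough sketch of what such a type could carry. Everything in it is hypothetical: `VoiceMessage` and `to_attachment_payload` do not exist in hikari, and the `duration_secs` and `waveform` field names and the flag value are assumed from Discord's attachment and message flag documentation rather than taken from this thread.

```python
import dataclasses

# Assumption: Discord documents IS_VOICE_MESSAGE as bit 13 of the message flags.
IS_VOICE_MESSAGE = 1 << 13


@dataclasses.dataclass(frozen=True)
class VoiceMessage:
    """Hypothetical resource type bundling a file with voice metadata."""

    filename: str
    duration_secs: float
    waveform: str  # base64-encoded amplitude bytes, may be ""

    def to_attachment_payload(self, attachment_id: int = 0) -> dict:
        # Build the per-attachment metadata that would sit next to the upload.
        return {
            "id": attachment_id,
            "filename": self.filename,
            "duration_secs": self.duration_secs,
            "waveform": self.waveform,
        }
```

Keeping the waveform on a dedicated type like this would leave the existing resource bases untouched, which is the main appeal over adding waveform= everywhere.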

For the implementation, I found librosa, which does the downsampling in Python rather than having to offload to ffmpeg. The only pain is that librosa only handles sample rate, not bit depth, though I assume that with some NumPy array shenanigans the bit depth should be easy enough. It's just a graphical representation, not some audio-nerd stuff.
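For reference, once the audio is decoded, the downsampling itself may not need anything heavy. Below is a minimal stdlib-only sketch, assuming the samples are already available as floats in [-1.0, 1.0] (decoding the file is out of scope here). The limits used (one byte per datapoint, at most one datapoint per 100 ms, at most 256 datapoints, base64-encoded) are my reading of the linked discord-api-docs discussion, and `compute_waveform` is a made-up name, not a hikari function.

```python
import base64


def compute_waveform(samples, sample_rate, max_points=256, min_interval=0.1):
    """Return a base64 string of peak amplitudes, one byte per datapoint.

    Assumes `samples` is a sequence of floats in [-1.0, 1.0].
    """
    if not samples:
        return ""

    duration = len(samples) / sample_rate
    # At most one datapoint per `min_interval` seconds, capped at `max_points`.
    n_points = max(1, min(max_points, int(duration / min_interval)))
    chunk = len(samples) / n_points

    points = bytearray()
    for i in range(n_points):
        start = int(i * chunk)
        end = max(start + 1, int((i + 1) * chunk))
        # Use the peak of each chunk as its amplitude, scaled to 0-255.
        peak = max(abs(s) for s in samples[start:end])
        points.append(min(255, round(peak * 255)))

    return base64.b64encode(bytes(points)).decode("ascii")
```

Taking the peak per chunk is just one choice; RMS would give a smoother preview. Since it's only a graphical representation, either should be good enough.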

Although, considering librosa's dependencies, if we were to go with it as a "batteries included" feature, I'd rather we put it behind another extra like hikari[voicemsg]. How many times do you see a Discord bot library ship with number-crunching dependencies?