discord / discord-api-docs

Official Discord API Documentation
https://discord.com/developers/docs/intro

`waveform` And `duration_sec` Ignored When Sending An Interaction Response With A Voice Message #7153

Closed Icebluewolf closed 1 month ago

Icebluewolf commented 1 month ago

Description

When responding to an interaction with a message that contains an audio file and is flagged as a voice message, the waveform and duration_secs fields are ignored.

This only seems to happen on interaction responses. When Discord returns the message object with attachment information, these fields are missing from the response.

Steps to Reproduce

Create a slash command that sends an audio file. Ensure that you set the message flag for voice messages (8192). Provide a waveform and duration_secs for the audio file.

The raw response should look similar to this:

b'--3f8b...\r\n'
b'Content-Type: text/plain; charset=utf-8\r\nContent-Disposition: form-data; name="payload_json"\r\n\r\n'
b'{"type":4,"data":{"tts":false,"flags":8192,"allowed_mentions":null},"attachments":[{"id":0,"filename":"sound_effect.wav","description":"Desc lol","waveform":"base64 encoded string here","duration_secs":60.0}]}'
b'\r\n'
b'--3f8b...\r\n'
b'Content-Type: application/octet-stream\r\nContent-Disposition: form-data; name="files[0]"; filename="sound_effect.wav"\r\n\r\n'
b'RIFFFd\x1d\x00WAVEfmt \x10\x00\x00\x00\x00\x02\x00\x80\xbb\x00\x00\xee\x02\x00\x04\x00\x10\x00LIST\x1a\x00\x00INFOISFT\r\x00\x00\x00Lavf61.1.100\x00\x00data\x00d\x1d\x00\x00\x00\x00\x00 ......'
b'\r\n'
b'--3f8b...--\r\n'
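The attachments entry in the payload_json part above can be sketched as follows. This is a minimal illustration, not py-cord's implementation: the helper name voice_attachment_metadata and the sample waveform bytes are hypothetical, and the only facts taken from the report are the field names, the base64-encoded waveform, and the 8192 flag.

```python
import base64
import json

# Message flag for voice messages, as given in the steps above (8192 == 1 << 13).
IS_VOICE_MESSAGE = 1 << 13


def voice_attachment_metadata(index, filename, waveform, duration_secs):
    """Build the attachment entry that accompanies the files[index] part.

    `waveform` is an iterable of byte values (0-255); it is base64 encoded
    because the payload carries the waveform as a string.
    """
    return {
        "id": index,
        "filename": filename,
        "waveform": base64.b64encode(bytes(waveform)).decode("ascii"),
        "duration_secs": float(duration_secs),
    }


# Hypothetical sample values, for illustration only.
entry = voice_attachment_metadata(0, "sound_effect.wav", [0, 128, 255], 60)
print(json.dumps({"flags": IS_VOICE_MESSAGE, "attachments": [entry]}))
```

Note that the key is duration_secs (plural), matching the payload shown above.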

Expected Behavior

The audio file is sent as a voice message with the provided waveform and duration_secs.

Current Behavior

The audio file IS a voice message, but it does not use the provided waveform or duration_secs; it behaves as if they were not provided. This is what the bot receives from the gateway when the message is sent. Note that waveform and duration_secs are not included.

{'t': 'MESSAGE_CREATE', 's': 6, 'op': 0, 'd': {'webhook_id': '71495476...', 'type': 20, 'tts': False, 'timestamp': '2024-09-17T18:55:42.854000+00:00', 'position': 0, 'pinned': False, 'nonce': '128567556...', 'mentions': [], 'mention_roles': [], 'mention_everyone': False, 'member': {'roles': ['678366...'], 'premium_since': None, 'pending': False, 'nick': None, 'mute': False, 'joined_at': '2024-09-13T23:22:55.825000+00:00', 'flags': 0, 'deaf': False, 'communication_disabled_until': None, 'banner': None, 'avatar': None}, 'interaction_metadata': {'user': {'username': '...', 'public_flags': 419..., 'id': '4518481...', 'global_name': '...', 'discriminator': '0', 'clan': None, 'avatar_decoration_data': None, 'avatar': 'c3c4...'}, 'type': 2, 'name': 'send_voice', 'id': '12856...', 'command_type': 1, 'authorizing_integration_owners': {'0': '6783599...'}}, 'interaction': {'user': {'username': '...', 'public_flags': 419..., 'id': '45184...', 'global_name': '...', 'discriminator': '0', 'clan': None, 'avatar_decoration_data': None, 'avatar': 'c3c41c...'}, 'type': 2, 'name': 'send_voice', 'member': {'roles': ['678366...', '7638...', '763874845...', '7638...'], 'premium_since': None, 'pending': False, 'nick': None, 'mute': False, 'joined_at': '2020-10-08T20:47:51.601000+00:00', 'flags': 0, 'deaf': False, 'communication_disabled_until': None, 'banner': None, 'avatar': None}, 'id': '1285675...'}, 'id': '1285675...', 'flags': 8192, 'embeds': [], 'edited_timestamp': None, 'content': '', 'components': [], 'channel_id': '977181...', 'author': {'username': '...', 'public_flags': 0, 'id': '7149547653...', 'global_name': None, 'discriminator': '12...', 'clan': None, 'bot': True, 'avatar_decoration_data': None, 'avatar': '844174...'}, 'attachments': [{'url': 'https://cdn.discordapp.com/attachments/977.../128567.../sound_effect.wav?ex=222e&is=66e&hm=2b3c99916c041bed35bf71c7587224a17ae404&', 'size': 1926222, 'proxy_url': 
'https://media.discordapp.net/attachments/9771.../1285.../sound_effect.wav?ex=66e&is=66ee&hm=2b3ce6c17aedde04&', 'id': '12856...', 'filename': 'sound_effect.wav', 'content_type': 'audio/x-wav', 'content_scan_version': 0}], 'application_id': '71...', 'guild_id': '6...'}}
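One way to spot the missing fields in a gateway payload like the one above is to check each attachment for the two keys. This is a small sketch under stated assumptions: the helper name missing_voice_fields is hypothetical, and the sample event is trimmed by hand to the keys that matter here.

```python
VOICE_FIELDS = ("waveform", "duration_secs")


def missing_voice_fields(event):
    """Map each attachment filename to the voice-message fields it lacks.

    `event` is a decoded gateway dispatch (a dict with a "d" key), such as
    the MESSAGE_CREATE payload shown above.
    """
    result = {}
    for attachment in event.get("d", {}).get("attachments", []):
        missing = [field for field in VOICE_FIELDS if field not in attachment]
        if missing:
            result[attachment.get("filename", "?")] = missing
    return result


# Trimmed, hand-written stand-in for the payload above.
event = {
    "t": "MESSAGE_CREATE",
    "d": {
        "flags": 8192,
        "attachments": [
            {"filename": "sound_effect.wav", "content_type": "audio/x-wav"}
        ],
    },
}
print(missing_voice_fields(event))
```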

Screenshots/Videos

The bottom message was sent to the channel directly. [image]

Client and System Information

Edition: Windows 11 Home; Version: 23H2; Installed on: 6/23/2024; OS build: 22631.4169; Experience: Windows Feature Experience Pack 1000.22700.1034.0

Using py-cord library in Python

Discord Info: canary 327592 (452e031) Host 1.0.449 x64 (52522) Build Override: N/A Windows 11 64-bit (10.0.22631)

tpcstld commented 1 month ago

Would you mind providing a channel ID and message ID for the interaction response? Would help us out a little. :)

Icebluewolf commented 1 month ago

Of the message in the screenshot?

If so: Channel: 977181204916867163 Message: 1285682504149242000

DV8FromTheWorld commented 1 month ago

If you are going to provide a content type, it needs to be the correct one. Content-Type: application/octet-stream is a generic octet stream and tells us nothing about what is coming over the wire.

For voice messages, you need to provide an audio/* content type, such as audio/ogg or audio/wav; otherwise duration_secs and waveform will be stripped.

This looks like a gap in the docs. We'd be happy to accept contributions that clarify this.
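The content-type requirement described above can be illustrated by building the file part of the multipart body by hand. This is a sketch only: build_file_part and the boundary value are hypothetical, the header order simply mirrors the dump in the report, and the audio/ prefix check is a client-side guard based on the comment above rather than a statement of the exact server-side rule.

```python
def build_file_part(boundary, index, filename, data, content_type):
    """Return one multipart/form-data file part as bytes.

    Refuses non-audio content types, since a generic type such as
    application/octet-stream reportedly causes waveform and duration_secs
    to be stripped from voice messages.
    """
    if not content_type.startswith("audio/"):
        raise ValueError(
            f"voice message attachments need an audio/* content type, got {content_type!r}"
        )
    header = (
        f"--{boundary}\r\n"
        f"Content-Type: {content_type}\r\n"
        f'Content-Disposition: form-data; name="files[{index}]"; filename="{filename}"\r\n'
        "\r\n"
    ).encode("utf-8")
    return header + data + b"\r\n"


# Hypothetical boundary and placeholder audio bytes, for illustration only.
part = build_file_part("example-boundary", 0, "sound_effect.wav", b"RIFF", "audio/wav")
print(part.decode("latin-1"))
```

With application/octet-stream in place of audio/wav, the helper raises ValueError instead of producing a part that would lose the voice-message metadata.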