discord / discord-api-docs

Official Discord API Documentation
https://discord.com/developers/docs/intro

Incorrect MIME Type in Content_Type for Certain Filename Extensions #6785

Open AWATV opened 3 months ago

AWATV commented 3 months ago

Description

When retrieving message attachments via the Discord API, an issue arises with the determination of the MIME type for certain filename extensions. Specifically, when an attachment is named "somename.mp3.gz" and its content is a gzip-compressed file, the API incorrectly returns the MIME type as audio/mpeg instead of the expected application/x-gzip. This discrepancy between the expected and actual MIME types could lead to misinterpretation of file content by applications or bots relying on accurate MIME type detection.

This bug affects developers and users who utilize the Discord API to handle message attachments, particularly those dealing with compressed files. Proper MIME type identification is essential for ensuring correct processing and handling of attachments within applications or bots integrated with Discord.

Steps to Reproduce

1. Send a message with an attachment named "somename.mp3.gz" to a Discord server or channel.
2. Retrieve the message using the Discord API, specifically focusing on accessing the attachment object.
3. Check the content_type property of the attachment object.
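For anyone who wants to reproduce this without a client library, the steps above can be sketched as a raw REST call. This is a minimal sketch under assumptions: it targets API v10 (the issue does not say which API version was used), it relies on the global `fetch` available in Node.js 18+, and the helper names `buildGetMessageRequest`, `listContentTypes`, and `fetchContentTypes` are hypothetical.

```javascript
// Sketch only: API v10 and all helper names below are assumptions, not from the issue.
const API_BASE = 'https://discord.com/api/v10';

// Build the Get Channel Message request for a bot token.
function buildGetMessageRequest(channelId, messageId, botToken) {
  return {
    url: `${API_BASE}/channels/${channelId}/messages/${messageId}`,
    options: { method: 'GET', headers: { Authorization: `Bot ${botToken}` } },
  };
}

// Reduce a raw message object to the two attachment fields this issue is about.
function listContentTypes(message) {
  return (message.attachments ?? []).map(({ filename, content_type }) => ({
    filename,
    content_type,
  }));
}

// Perform the request (needs network access and a valid bot token).
async function fetchContentTypes(channelId, messageId, botToken) {
  const { url, options } = buildGetMessageRequest(channelId, messageId, botToken);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Discord API returned HTTP ${response.status}`);
  }
  return listContentTypes(await response.json());
}
```

Calling `fetchContentTypes(channelId, messageId, token)` with real IDs should resolve to an array of `{ filename, content_type }` pairs that can be pasted into a report as the exact response.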

Expected Behavior

The content_type property for an attachment named "somename.mp3.gz" should return application/x-gzip, indicating that the file is a gzip-compressed file.

Current Behavior

The content_type property for an attachment named "somename.mp3.gz" returns audio/mpeg, which is incorrect for a gzip-compressed file. This behavior is inconsistent with the expected MIME type.
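Until the reported type can be trusted, a bot can check the file content itself. The following is a minimal sketch, not part of the Discord API: it assumes the bot has already downloaded at least the first two bytes of the attachment, and the helper names `looksLikeGzip` and `effectiveContentType` are hypothetical. It relies only on the fact that every gzip stream begins with the bytes 0x1f 0x8b (RFC 1952).

```javascript
const zlib = require('node:zlib');

// A gzip stream always starts with the magic bytes 0x1f 0x8b.
function looksLikeGzip(buffer) {
  return buffer.length >= 2 && buffer[0] === 0x1f && buffer[1] === 0x8b;
}

// Hypothetical helper: prefer the sniffed type over the reported content_type.
function effectiveContentType(reportedType, firstBytes) {
  return looksLikeGzip(firstBytes) ? 'application/gzip' : reportedType;
}

// Demonstration with locally generated data instead of a real download.
const gzipped = zlib.gzipSync(Buffer.from('Test file.'));
console.log(effectiveContentType('audio/mpeg', gzipped)); // application/gzip
console.log(effectiveContentType('audio/mpeg', Buffer.from('ID3'))); // audio/mpeg
```

The sketch reports application/gzip rather than application/x-gzip; either string can be substituted, since the check itself does not depend on the label.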

Screenshots/Videos

No response

Client and System Information

Node.js modules:

API version: V14

lmle commented 3 months ago

Hey @AWATV! Can you provide us with the exact requests you're sending and responses you're getting? Thanks!

AWATV commented 3 months ago

const { Client, GatewayIntentBits } = require('discord.js');

const client = new Client({
    intents: [
      GatewayIntentBits.Guilds,
      GatewayIntentBits.GuildMessages,
      GatewayIntentBits.MessageContent,
      GatewayIntentBits.GuildMembers,
      GatewayIntentBits.DirectMessages,
    ],
  });

const token = '...';
const guildId = '1195451831149658225';
const channelId = '1196091248201695262';

client.once('ready', async () => {
    console.log('Bot is ready!');

    try {
        const guild = await client.guilds.fetch(guildId);
        if (!guild) throw new Error('Guild not found');

        const channel = guild.channels.cache.get(channelId);
        if (!channel) throw new Error('Channel not found');

        const messages = await channel.messages.fetch({ limit: 1 });
        const lastMessage = messages.first();

        if (lastMessage) {
            console.log('Last Message Attachments:', lastMessage.attachments);
        } else {
            console.log('No messages found in the channel');
        }
    } catch (error) {
        console.error('Error:', error);
    }

    client.destroy();
});

client.login(token);

Output:

Bot is ready!
Last Message Attachments: Collection(2) [Map] {
  '1225338377726070794' => Attachment {
    attachment: 'https://cdn.discordapp.com/attachments/1196091248201695262/1225338377726070794/somename.mp3.gz?ex=6620c44a&is=660e4f4a&hm=399aee161711361af26937cb41bd37d19b487bbb2a6658addc0567c6fe07c422&',
    name: 'somename.mp3.gz',
    id: '1225338377726070794',
    size: 5455565,
    url: 'https://cdn.discordapp.com/attachments/1196091248201695262/1225338377726070794/somename.mp3.gz?ex=6620c44a&is=660e4f4a&hm=399aee161711361af26937cb41bd37d19b487bbb2a6658addc0567c6fe07c422&',
    proxyURL: 'https://media.discordapp.net/attachments/1196091248201695262/1225338377726070794/somename.mp3.gz?ex=6620c44a&is=660e4f4a&hm=399aee161711361af26937cb41bd37d19b487bbb2a6658addc0567c6fe07c422&',
    height: null,
    width: null,
    contentType: 'audio/mpeg',
    description: null,
    ephemeral: false,
    duration: null,
    waveform: null,
    flags: AttachmentFlagsBitField { bitfield: 0 }
  },
  '1225338378195828806' => Attachment {
    attachment: 'https://cdn.discordapp.com/attachments/1196091248201695262/1225338378195828806/somename.mp3?ex=6620c44b&is=660e4f4b&hm=72396422a788b37340a4f599dc2b3aafd4f1c4850b4682975c856a37b7368d4c&',
    name: 'somename.mp3',
    id: '1225338378195828806',
    size: 5469184,
    url: 'https://cdn.discordapp.com/attachments/1196091248201695262/1225338378195828806/somename.mp3?ex=6620c44b&is=660e4f4b&hm=72396422a788b37340a4f599dc2b3aafd4f1c4850b4682975c856a37b7368d4c&',
    proxyURL: 'https://media.discordapp.net/attachments/1196091248201695262/1225338378195828806/somename.mp3?ex=6620c44b&is=660e4f4b&hm=72396422a788b37340a4f599dc2b3aafd4f1c4850b4682975c856a37b7368d4c&',
    height: null,
    width: null,
    contentType: 'audio/mpeg',
    description: null,
    ephemeral: false,
    duration: null,
    waveform: null,
    flags: AttachmentFlagsBitField { bitfield: 0 }
  }
}

MatthewSH commented 2 months ago

I decided to test this myself. I was curious whether it was just pulling the first file extension. Here's my bot code:

const Eris = require("eris");
const dotenv = require("dotenv");
dotenv.config();

const bot = new Eris(process.env.TOKEN);

bot.on("ready", () => {
  console.log("Ready!");
});
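bot.on("messageCreate", (msg) => {
  // Skip messages without attachments so msg.attachments[0] is never undefined.
  if (msg.attachments.length === 0) return;
  // Log the filename and the content type reported for the first attachment.
  console.log(`${msg.id} > `, {
    filename: msg.attachments[0].filename,
    content_type: msg.attachments[0].content_type,
  });
});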
bot.connect();

I created a text file with the content:

Test file.

I duplicated that file 14 times and added or changed the extension on each copy.

Here are the results:

Ready!
1226925635482423388 >  { filename: 'File_1.txt', content_type: 'text/plain; charset=utf-8' }
1226925650833702983 >  { filename: 'File_2.mp3', content_type: 'audio/mpeg' }
1226925691711393864 >  { filename: 'File_3.mp4', content_type: 'video/mp4' }
1226925706382938353 >  { filename: 'File_4.zip', content_type: 'application/zip' }
1226925717909016679 >  { filename: 'File_5.gz', content_type: undefined }
1226925727526420562 >  { filename: 'File_6.tar.gz', content_type: 'application/x-tar' }
1226925739572592660 >  { filename: 'File_7.jpg', content_type: 'image/jpeg' }
1226925760640454758 >  { filename: 'File_8.exe', content_type: 'application/x-msdos-program' }
1226925775018528878 >  { filename: 'File_9.txt.zip', content_type: 'application/zip' }
1226925788776108173 >  {
  filename: 'File_10.txt.gz',
  content_type: 'text/plain; charset=utf-8'
}
1226925801451163770 >  { filename: 'File_11.txt.tar.gz', content_type: 'application/x-tar' }
1226925813946122250 >  { filename: 'File_12.txt.mp3.gz', content_type: 'audio/mpeg' }
1226925825836974130 >  {
  filename: 'File_13.txt.mp3.tar.gz',
  content_type: 'application/x-tar'
}
1226925835236278354 >  { filename: 'File_14.zip.gz', content_type: 'application/zip' }

Based on this line:

 { filename: 'File_5.gz', content_type: undefined }

It looks like it doesn't know what a .gz file is, so it's falling back to the next extension to the left.

According to Mimetype, it should be registering .gz and .tgz as application/gzip. The .tar.gz should probably also be application/gzip.


On a side note, I wanted to test the behavior for other "non-existent" file extensions. Looks like the theory is correct:

1226927697427435601 >  { filename: 'File_15.notarealextensio', content_type: undefined }
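
The fallback theory above can be written down as a small model and checked against the logged results. This is a hypothetical sketch, not Discord's actual implementation: the function name `guessContentType` is made up, the table contains only the types seen in the results in this thread, and walking past more than one unknown extension is an assumption, since the tests only cover a single unknown trailing extension.

```javascript
// Hypothetical model of the observed behavior; NOT Discord's real lookup code.
// The table holds only the extension/type pairs that appear in the results above.
const KNOWN_TYPES = new Map([
  ['txt', 'text/plain; charset=utf-8'],
  ['mp3', 'audio/mpeg'],
  ['mp4', 'video/mp4'],
  ['zip', 'application/zip'],
  ['tar', 'application/x-tar'],
  ['jpg', 'image/jpeg'],
  ['exe', 'application/x-msdos-program'],
]);

// Walk the extensions from right to left and return the first known type.
// Note that 'gz' is deliberately absent from the table, as the results suggest.
function guessContentType(filename) {
  const extensions = filename.split('.').slice(1); // drop the base name
  for (let i = extensions.length - 1; i >= 0; i--) {
    const type = KNOWN_TYPES.get(extensions[i].toLowerCase());
    if (type !== undefined) return type;
  }
  return undefined;
}

console.log(guessContentType('File_12.txt.mp3.gz')); // audio/mpeg
console.log(guessContentType('File_5.gz')); // undefined
```

Under this model, adding a `gz` entry to the table would make every `.gz` name in the test set resolve to the gzip type, since the rightmost extension is checked first.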