sindresorhus / file-type

Detect the file type of a file, stream, or data
MIT License

proposal: expose mime-type <-> extension relations #598

Open xobotyi opened 1 year ago

xobotyi commented 1 year ago

Currently, the sets of supported MIME types and extensions are exposed, but not the relation between them.

I think it would be useful to expose maps that describe the relation between extension and MIME type, and vice versa.


// MIME type -> extensions it can be detected as
export const TypeExtensions = new Map<string, string[]>();

// extension -> MIME types it can be detected as
export const ExtensionTypes = new Map<string, string[]>();

sindresorhus commented 1 year ago

What's the use-case?

xobotyi commented 1 year ago

Basically, we have a two-stage file upload:

1) The user uploads a file to the storage, where its type is detected by file-type, among other things. The file storage holds only the detected MIME type, not the extension, since the extension is useless for some of our custom binary formats.
2) The file then reaches the form field, where restrictions are set and later checked once the form has been submitted.

In the control panel, where restrictions for fields are configured, we have a dropdown with a list of the extensions supported by file-type, plus some extensions for our custom types.

Previously we used the mime package as the source of truth for the extension-to-MIME relation, but there were lots of deviations from what file-type detected, especially for rar archives, since a rar file can have several valid MIME types for its extension (and it is not the only such type).

We could store the extension detected by file-type along with the other file info, but that would be an excessive waste of storage, since all the info is already there, just not exposed 🤷‍♂️

xobotyi commented 1 year ago

Also (I was planning to open another issue for this), it would be great if the minimum sample size were exposed for each MIME type, so developers would have an option to reduce the amount of data transferred to determine the content type.

sindresorhus commented 1 year ago

The mapping cannot work in all cases, though. There may be multiple extensions for a single MIME type.

I think what we can do is to expose all the data and let it be up to you to create a mapping if you need it.

Currently, we expose: https://github.com/sindresorhus/file-type/blob/main/supported.js

We could also expose an array of supported types, where each type is an object with certain properties:

[
    {
        mimeType: 'image/jpeg',
        extension: 'jpg',
        minimumBytes: 4
    },
    ...
]

We could also use this to expose the minimum required bytes.
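
As a rough sketch of how such a list could be turned into the two maps requested above: the `DetectableType` shape follows the example object, but the list itself, the `buildMaps` helper, and the sample entries are assumptions for illustration only (file-type does not expose them, and the `application/x-example` rows are made-up placeholders).

```typescript
// Hypothetical shape of one entry in the proposed list. The field names follow
// the example above; this is not an existing file-type export.
interface DetectableType {
  mimeType: string;
  extension: string;
  minimumBytes: number;
}

// Illustrative data only. The 'application/x-example' rows are invented
// placeholders that show one MIME type with several extensions.
const detectableTypes: DetectableType[] = [
  { mimeType: 'image/jpeg', extension: 'jpg', minimumBytes: 4 },
  { mimeType: 'application/x-example', extension: 'ex1', minimumBytes: 8 },
  { mimeType: 'application/x-example', extension: 'ex2', minimumBytes: 8 },
];

// Append a value to the array stored under key, skipping duplicates.
function addToMultiMap(map: Map<string, string[]>, key: string, value: string): void {
  const values = map.get(key) ?? [];
  if (!values.includes(value)) {
    values.push(value);
  }
  map.set(key, values);
}

// Build both directions of the many-to-many relation from the flat list.
function buildMaps(types: readonly DetectableType[]) {
  const typeExtensions = new Map<string, string[]>();
  const extensionTypes = new Map<string, string[]>();
  for (const { mimeType, extension } of types) {
    addToMultiMap(typeExtensions, mimeType, extension);
    addToMultiMap(extensionTypes, extension, mimeType);
  }
  return { typeExtensions, extensionTypes };
}

const { typeExtensions, extensionTypes } = buildMaps(detectableTypes);
```

Because both maps hold arrays, a MIME type with several extensions (or an extension with several MIME types) keeps every detectable pair instead of having one silently overwrite another.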

xobotyi commented 1 year ago

The second solution, with a list of all detectable MIME types and extensions, is the most versatile one, I suppose.

I understand that MIME <-> ext isn't a one-to-one relation, so the variant that exposes the detectable pairs would be the best, I think.

Thanks for the library, by the way - it has saved me a crapton of effort so far.

Borewit commented 1 year ago

Note that the minimum required bytes, as they appear in the comments, are in many cases only used for the initial identification. A second identification, which may require an undetermined number of bytes, may follow. There is also a single case that uses recursion (the ID3 header).
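
One way a caller could cope with this, sketched under assumptions: start with a small sample and grow it until the detector gives an answer or a cap is reached. `detectWithGrowingSample`, `Reader`, and `Detector` are hypothetical names, not part of file-type, and both callbacks are synchronous here only for brevity (a real detector is typically asynchronous). The sketch also only helps when the detector returns nothing for a sample that is too short; it cannot catch a case where a short sample yields a less specific result.

```typescript
// Hypothetical callback types; neither is a file-type API.
// Reader returns up to `length` bytes from the start of the source.
type Reader = (length: number) => Uint8Array;
// Detector returns a MIME type, or undefined if the sample was not recognized.
type Detector = (sample: Uint8Array) => string | undefined;

function detectWithGrowingSample(
  read: Reader,
  detect: Detector,
  initialBytes: number,
  maxBytes: number,
): string | undefined {
  let size = Math.max(1, initialBytes);
  while (true) {
    const sample = read(size);
    const result = detect(sample);
    if (result !== undefined) {
      return result;
    }
    // Give up when the source has no more data or the cap has been reached.
    if (sample.length < size || size >= maxBytes) {
      return undefined;
    }
    // Otherwise double the sample size, without exceeding the cap.
    size = Math.min(size * 2, maxBytes);
  }
}
```

Doubling keeps the number of reads logarithmic in the final sample size, at the cost of re-reading the first bytes on each attempt.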

Regarding customising the returned file types (e.g. the extension), I think we should ask the question: does this really add value, or are we just hiding some extra code that a user could add outside file-type to achieve the same thing?