golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: embed: embed and expose also media/mime type #66121

Open mitar opened 8 months ago

mitar commented 8 months ago

Proposal Details

I think this is a common use case: we use embed to embed files at compile time and then serve them over HTTP at runtime. But to serve files at runtime, one also has to know the media/MIME type of each file in order to set Content-Type correctly. One way to do that is to call the standard mime.TypeByExtension on the embedded filenames. This works well if filenames have known file extensions, and a developer can generally predict the outcome if they notice that a file extension is missing or obscure.
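A minimal sketch of the pattern described above, assuming the files are served through a hand-written handler. The names contentTypeFor, serveFrom, fetchContentType, and demoFS are hypothetical, and fstest.MapFS stands in for an embed.FS so the example runs without files on disk; an embed.FS value could be passed to serveFrom in the same way, since both implement fs.FS.

```go
package main

import (
	"fmt"
	"io/fs"
	"mime"
	"net/http"
	"net/http/httptest"
	"path"
	"strings"
	"testing/fstest"
)

// contentTypeFor derives a media type from the file extension and falls
// back to application/octet-stream when the extension is unknown.
func contentTypeFor(name string) string {
	if t := mime.TypeByExtension(path.Ext(name)); t != "" {
		return t
	}
	return "application/octet-stream"
}

// serveFrom returns a handler that serves files from fsys and sets
// Content-Type based only on the file name.
func serveFrom(fsys fs.FS) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		name := strings.TrimPrefix(r.URL.Path, "/")
		data, err := fs.ReadFile(fsys, name)
		if err != nil {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", contentTypeFor(name))
		w.Write(data)
	})
}

// demoFS is an in-memory stand-in for an embedded file system.
func demoFS() fs.FS {
	return fstest.MapFS{
		"index.html":      {Data: []byte("<h1>hello</h1>")},
		"blob.zzzunknown": {Data: []byte{0x00, 0x01}},
	}
}

// fetchContentType issues a GET against the handler and reports the
// Content-Type header of the response.
func fetchContentType(fsys fs.FS, urlPath string) string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, urlPath, nil)
	serveFrom(fsys).ServeHTTP(rec, req)
	return rec.Header().Get("Content-Type")
}

func main() {
	for _, p := range []string{"/index.html", "/blob.zzzunknown"} {
		fmt.Printf("%s -> %s\n", p, fetchContentType(demoFS(), p))
	}
}
```

For extensions outside Go's small built-in table, the result of contentTypeFor depends on what mime.TypeByExtension finds on the machine where the binary runs.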

But what is hard for a developer to observe is that mime.TypeByExtension relies on a MIME type database installed on the system. If that database is missing, behavior on their local machine might differ from the production machine. Even worse, behavior might differ if the databases differ (e.g., one uses Ubuntu locally and Alpine in production).

To make production deployment work well, we use static builds. But sadly, because of the dependency on this database, even a static build can behave differently.
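One workaround that exists today, and which is not what this issue proposes, is to pin the types an application cares about with mime.AddExtensionType at startup, so lookups for those extensions no longer depend on the host's database. The sketch below assumes a hypothetical list of extensions; pinned, pinTypes, and typeOf are illustrative names.

```go
package main

import (
	"fmt"
	"mime"
)

// pinned lists the extensions this hypothetical application serves,
// together with the media types it wants regardless of the host system.
var pinned = map[string]string{
	".html":        "text/html; charset=utf-8",
	".wasm":        "application/wasm",
	".webmanifest": "application/manifest+json",
}

// pinTypes registers every entry in pinned, replacing whatever mapping
// the built-in table or the system database provided for that extension.
func pinTypes() error {
	for ext, typ := range pinned {
		if err := mime.AddExtensionType(ext, typ); err != nil {
			return fmt.Errorf("pin %s: %w", ext, err)
		}
	}
	return nil
}

// typeOf is a thin wrapper around mime.TypeByExtension.
func typeOf(ext string) string {
	return mime.TypeByExtension(ext)
}

func main() {
	if err := pinTypes(); err != nil {
		panic(err)
	}
	for ext := range pinned {
		fmt.Printf("%s -> %s\n", ext, typeOf(ext))
	}
}
```

This only covers extensions the developer thought to list, which is the limitation the proposal is trying to remove.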

Proposal

I propose that files embedded with embed.FS have their media/MIME type detected at compile time and exposed in some way to the developer (I am open to how exactly this is implemented). The critical point is that this is done at compile time, where the developer has an easier time controlling the environment, so that the behavior is the same wherever the binary is later deployed and run.

Alternative proposal

Maybe there should be a way to embed the currently available MIME/media type database into the binary at compile time.

Jorropo commented 8 months ago

What happens if I build a program using this feature on two different machines with two different databases? Does that mean these builds are not byte-for-byte reproducible anymore?

It sounds like this could be solved by vendoring a generic enough database in the standard library and using that for builds. However, if we do that, there is no particular reason why this has to be done at build time for the use case you describe to work.

mitar commented 8 months ago

> What happens if I build a program using this feature on two different machines with two different databases? Does that mean these builds are not byte-for-byte reproducible anymore?

Yes, the database then becomes a compile-time dependency. So the builds are reproducible if all dependencies are the same.

Jorropo commented 8 months ago

Pure Go programs don't currently do this; they are clear about what the inputs to the build are (modules and your Go toolchain). And I personally really dislike this about most C toolchains.

That's a huge :-1: for me.

mitar commented 8 months ago

> It sounds like this could be solved by vendoring a generic enough database in the standard library and using that for builds. However, if we do that, there is no particular reason why this has to be done at build time for the use case you describe to work.

That could work, similar to how time/tzdata is done.
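For reference, a short sketch of how the time/tzdata opt-in mentioned above looks in practice: a blank import embeds a copy of the timezone database, and time.LoadLocation falls back to it when no system database is found. The helper loadZone is a hypothetical name used only for illustration.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // opt in: embed a copy of the timezone database in the binary
)

// loadZone resolves an IANA zone name and returns the location's name.
// With time/tzdata imported, this can succeed even on a host that has
// no timezone database installed.
func loadZone(name string) (string, error) {
	loc, err := time.LoadLocation(name)
	if err != nil {
		return "", err
	}
	return loc.String(), nil
}

func main() {
	name, err := loadZone("Europe/Ljubljana")
	if err != nil {
		panic(err)
	}
	fmt.Println("loaded zone:", name)
}
```

The cost is binary size: programs that do not import the package do not carry the data, which is the opt-in trade-off being discussed for a MIME database.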

Jorropo commented 8 months ago

I think this is a better solution. After checking, there already exists a small table: https://github.com/golang/go/blob/e8b5bc63be22e2bebffabcfccaf54d4c19822fe6/src/mime/type.go#L60-L77

I think you should make a new proposal to extend this one (maybe seed it from some existing third-party authority, the way the Unicode tables are).

mitar commented 8 months ago

> I think you should make a new proposal to extend this one (maybe seed it from some existing third-party authority, the way the Unicode tables are).

I do not think we should extend it for everyone, but rather allow opting in by importing something like mime/data.

Jorropo commented 8 months ago

Maybe. I think you need an argument for why one more configuration knob is needed. If it could just be done for everyone, that's better IMO.

mitar commented 8 months ago

/etc/mime.types is 24 KB on my machine. I think some might dislike having that included in all builds.