Open mitar opened 8 months ago
What happens if I build a program using this feature on two different machines with two different databases? Does that mean these builds are no longer byte-for-byte reproducible?
That sounds like it could be solved by vendoring a generic enough database in the standard library and using that for builds. However, if we do that, there is no particular reason why this has to be done at build time for the use case you describe to work.
What happens if I build a program using this feature on two different machines with two different databases? Does that mean these builds are no longer byte-for-byte reproducible?
Yes, the database then becomes a compile-time dependency. So the builds are reproducible as long as all dependencies are the same.
Pure Go programs don't currently do this; they are clear about what the inputs to the build are (modules and your Go toolchain). And I personally really dislike this about most C toolchains.
That's a huge :-1: for me.
That sounds like it could be solved by vendoring a generic enough database in the standard library and using that for builds. However, if we do that, there is no particular reason why this has to be done at build time for the use case you describe to work.
That could work, in a similar way to how time/tzdata is done.
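For readers unfamiliar with the time/tzdata precedent mentioned here, a minimal sketch of how that opt-in works today: a blank import embeds a copy of the timezone database into the binary, and time.LoadLocation falls back to it when no system database is found. The helper name loadZone and the zone name are illustrative choices, not something from this thread.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // opt-in: embeds a copy of the timezone database in the binary
)

// loadZone is a hypothetical helper that resolves an IANA zone name.
// With time/tzdata imported, the lookup can use the embedded copy
// when the system has no timezone database installed.
func loadZone(name string) (*time.Location, error) {
	return time.LoadLocation(name)
}

func main() {
	loc, err := loadZone("Europe/Ljubljana")
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Println("loaded zone:", loc)
}
```

The same result can be had without touching source by building with the timetzdata build tag. A mime/data style package, as floated in this thread, would presumably follow the same pattern, but no such package exists at the time of writing.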
I think this is a better solution. After checking, I see that a small table already exists: https://github.com/golang/go/blob/e8b5bc63be22e2bebffabcfccaf54d4c19822fe6/src/mime/type.go#L60-L77
I think you should make a new proposal to extend this table (maybe seed it from some existing third-party authority, the way the Unicode tables are sourced).
I think you should make a new proposal to extend this table (maybe seed it from some existing third-party authority, the way the Unicode tables are sourced).
I do not think we should extend it for everyone, but rather allow opting in by importing something like mime/data.
Maybe. I think you need an argument for why one more configuration knob is needed. If it could just be done, that's better IMO.
/etc/mime.types is 24 KB on my machine. I think including that in all builds might be disliked by some?
Proposal Details
I think this is a common use case: we use embed to embed some files at compile time and then serve them at runtime over HTTP. But to serve files at runtime, one also has to know the media/MIME type of each file in order to set Content-Type correctly. One way to do that is to call the standard mime.TypeByExtension on the embedded filenames. This works well if the filenames have known file extensions, and a developer can generally predict the outcome if they notice that a file extension is missing or obscure.
But what is hard for a developer to observe is that mime.TypeByExtension depends on a corresponding database being installed on the system. If that database is missing, behavior on their local machine might differ from the production machine. Even worse, behavior might differ if the databases differ (e.g., one uses Ubuntu locally and Alpine in production).
To make production deployment work well, we use static builds. But sadly, because of the dependency on this database, even a static build can behave differently.
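Until something like this proposal exists, one way to make lookups independent of the host is to register the needed types explicitly at startup with mime.AddExtensionType, which overrides any system-provided entry for that extension. The sketch below is only an illustration: the pinnedTypes table, the helper names, and the application/octet-stream fallback are assumptions, not part of this proposal.

```go
package main

import (
	"fmt"
	"mime"
	"path"
)

// pinnedTypes is a hypothetical, application-chosen table. The entries
// are examples only; list whatever extensions your embedded files use.
var pinnedTypes = map[string]string{
	".webp":  "image/webp",
	".wasm":  "application/wasm",
	".woff2": "font/woff2",
}

// registerPinnedTypes registers every pinned entry, replacing whatever
// the system database (if any) says for those extensions.
func registerPinnedTypes() error {
	for ext, typ := range pinnedTypes {
		if err := mime.AddExtensionType(ext, typ); err != nil {
			return fmt.Errorf("register %s: %w", ext, err)
		}
	}
	return nil
}

// contentTypeFor returns the type for a file name, falling back to a
// generic binary type (an assumption of this sketch) when none is known.
func contentTypeFor(name string) string {
	if typ := mime.TypeByExtension(path.Ext(name)); typ != "" {
		return typ
	}
	return "application/octet-stream"
}

func main() {
	if err := registerPinnedTypes(); err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println(contentTypeFor("static/logo.webp"))
}
```

This only pins the extensions that are listed; any other extension still goes through the built-in table and whatever database the host provides.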
Proposal
I propose that files embedded with embed.FS have their media/MIME type detected at compile time and exposed to the developer in some way (I am open to how exactly this is implemented). The critical point here is that this is done at compile time, where the developer has an easier time controlling the environment, so that the behavior is the same wherever the binary is later deployed or run.
Alternative proposal
Maybe there should be a way to embed the currently available MIME/media type database into the binary at compile time.