gruntwork-io / boilerplate

A tool for generating files and folders ("boilerplate") from a set of templates
https://www.gruntwork.io
Mozilla Public License 2.0

Do a better job of disambiguating between text and binary #126

Closed by brikis98 3 months ago

brikis98 commented 1 year ago

Describe the bug: boilerplate has to distinguish between text files, which are passed through the Go templating engine, and binary files, which are copied without any changes. It tries to do this using the file command, if that command is available on the computer:

https://github.com/gruntwork-io/boilerplate/blob/d323184f59f1eb937fb9b6f7fb8ffb8ba1fc9fb5/util/file.go#L41-L51

On computers without file, it falls back to Go's http.DetectContentType, which works well for the most part but sometimes fails. For example, it fails on .hcl files, treating them as binary! As a result, the Go templating in the .hcl files doesn't get processed.
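To make the fallback concrete, here is a minimal sketch of a content-based check that could be compared against http.DetectContentType. The helper looksLikeText is hypothetical and is not part of boilerplate; it assumes that a file with no NUL byte and only valid UTF-8 can be treated as text.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"unicode/utf8"
)

// looksLikeText is a hypothetical helper, not boilerplate's implementation.
// Assumption: content with no NUL byte that is valid UTF-8 is text.
func looksLikeText(data []byte) bool {
	if bytes.IndexByte(data, 0) >= 0 {
		return false
	}
	return utf8.Valid(data)
}

func main() {
	// A small HCL-style snippet containing Go templating.
	sample := []byte("name = \"{{ .Name }}\"\n")

	// http.DetectContentType inspects at most the first 512 bytes.
	fmt.Println("DetectContentType:", http.DetectContentType(sample))
	fmt.Println("looksLikeText:", looksLikeText(sample))
}
```

This heuristic would misclassify text in encodings such as UTF-16, so it is a starting point for comparison rather than a replacement.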

To Reproduce: Run boilerplate on a system that doesn't have file installed and have it process a .hcl file in the --template-folder.

Expected behavior: boilerplate should do one of the following:

  1. Ideal solution: detect all text vs binary files correctly. Is there a better Go library we can use?
  2. Alternative: at the very least, we should improve logging:
    1. Log if we think the file is text (and templating will be processed) or binary (so it's copied unchanged), as that will help users understand what's going on.
    2. Log if file is not installed and we're falling back to http.DetectContentType.
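
The logging described in the alternative could look roughly like the sketch below. The function names and message wording are made up for illustration and are not boilerplate's actual log output; only exec.LookPath and the log package are real Go APIs here.

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

// detectorMessage is a hypothetical helper that describes which
// detection method is in use.
func detectorMessage(fileAvailable bool) string {
	if fileAvailable {
		return "using the file command to detect text vs binary"
	}
	return "file command not found; falling back to http.DetectContentType"
}

// decisionMessage is a hypothetical helper that describes what will
// happen to a single file.
func decisionMessage(path string, isText bool) string {
	if isText {
		return fmt.Sprintf("%s: detected as text, will be rendered as a Go template", path)
	}
	return fmt.Sprintf("%s: detected as binary, will be copied unchanged", path)
}

func main() {
	// exec.LookPath reports whether "file" can be found on the PATH.
	_, err := exec.LookPath("file")
	log.Println(detectorMessage(err == nil))
	log.Println(decisionMessage("main.hcl", true))
}
```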

Additional context: Originally reported in https://gruntwork-io.slack.com/archives/CJ39EV0KW/p1669017123927079.

HoushCE29 commented 5 months ago

Any plans on fixing this? This seems to be an issue especially on Windows, even for the simplest of text files. Perhaps if there's no "better" way, then this could be made overridable via configuration? E.g. a flag to skip the binary/text file check if all we're doing is templating text files, or even something more granular, such as manually categorizing files as binary or text.
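
One way the manual categorization could work is sketched below. The Overrides type and its fields are hypothetical and do not correspond to any existing boilerplate option; the sketch assumes glob patterns matched against the file's base name, with the automatic detection result used only when no pattern matches.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Overrides is a hypothetical config shape: glob patterns that force a
// file to be treated as text or as binary.
type Overrides struct {
	Text   []string
	Binary []string
}

// matchesAny reports whether name matches any of the glob patterns.
func matchesAny(patterns []string, name string) (bool, error) {
	for _, p := range patterns {
		ok, err := filepath.Match(p, name)
		if err != nil {
			return false, fmt.Errorf("bad pattern %q: %w", p, err)
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

// isTextFile applies the overrides first and falls back to detectedText,
// the result of whatever automatic detection ran beforehand.
func isTextFile(path string, o Overrides, detectedText bool) (bool, error) {
	name := filepath.Base(path)
	if ok, err := matchesAny(o.Text, name); err != nil {
		return false, err
	} else if ok {
		return true, nil
	}
	if ok, err := matchesAny(o.Binary, name); err != nil {
		return false, err
	} else if ok {
		return false, nil
	}
	return detectedText, nil
}

func main() {
	o := Overrides{Text: []string{"*.hcl"}, Binary: []string{"*.png"}}
	// Detection said "binary", but the override forces text.
	fmt.Println(isTextFile("templates/main.hcl", o, false))
}
```

Text patterns are checked first here, so a file matching both lists is treated as text; that ordering is an arbitrary choice for the sketch.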

rwittrick commented 3 months ago

Additionally, on some OSes, e.g. Ubuntu 26, running file -b --mime some.js returns a JavaScript MIME type, application/javascript; charset=us-ascii, so js/ts files are currently excluded via https://github.com/gruntwork-io/boilerplate/blob/441f030657d69b7da4113e0b7963b30a6bc7b455/util/file.go#L38
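
A possible way to handle such output is sketched below. It is not the code at the link above; isTextMIME and the textLikeTypes allow-list are hypothetical, and the sketch assumes that file reports charset=binary for binary data.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// textLikeTypes is a hypothetical allow-list of non-text/* media types
// that should still be treated as text.
var textLikeTypes = map[string]bool{
	"application/javascript": true,
	"application/json":       true,
	"application/xml":        true,
}

// isTextMIME classifies one line of `file -b --mime` output.
// Assumption: file reports charset=binary for binary content.
func isTextMIME(fileOutput string) bool {
	mediaType, params, err := mime.ParseMediaType(strings.TrimSpace(fileOutput))
	if err != nil {
		return false
	}
	if params["charset"] == "binary" {
		return false
	}
	return strings.HasPrefix(mediaType, "text/") || textLikeTypes[mediaType]
}

func main() {
	fmt.Println(isTextMIME("application/javascript; charset=us-ascii\n"))
	fmt.Println(isTextMIME("image/png; charset=binary\n"))
}
```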

brikis98 commented 3 months ago

@denis256 Do you think you could look into this? The ideal solution would be some Go library that does a good job of disambiguating between binary & text files.

denis256 commented 3 months ago

Will take a look

denis256 commented 3 months ago

Released an improvement in https://github.com/gruntwork-io/boilerplate/releases/tag/v0.5.13

rwittrick commented 3 months ago

Thanks, this sorted my issue