excubo-ag / WebCompiler

Apache License 2.0

File encoding for existing files is not being retained #33

Closed: karan-kang closed this issue 1 year ago

karan-kang commented 3 years ago

WebCompiler changes the file encoding from UTF-8 to UTF-8 with BOM even when no other changes are made to the file. As a result, the files show up as modified in source control even though the diff shows no content changes.
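One way to confirm that only the byte order mark differs is to inspect the first three bytes of a file before and after a compile run. Below is a minimal Python sketch of that check. Python is used purely for illustration (WebCompiler itself is a .NET tool), and the helper name `has_utf8_bom` is made up for this example.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 byte order mark.

    Hypothetical helper for illustration; not part of WebCompiler.
    """
    with open(path, "rb") as f:
        # The UTF-8 BOM is the three bytes EF BB BF at the very start of the file.
        return f.read(3) == codecs.BOM_UTF8
```

Running this on an output file before and after compilation should show whether the BOM was added while the rest of the content stayed the same.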

Here are some recommendations:

  1. Preserve the encoding of existing files by default
  2. Use UTF-8 with BOM for newly generated files, with an option to override this default via the JSON configuration

**webcompilerconfiguration.json**

{
  "Minifiers": {
    "GZip": false,
    "Enabled": true,
    "Css": {
      "CommentMode": "Important",
      "ColorNames": "Hex",
      "TermSemicolons": true,
      "OutputMode": "SingleLine",
      "IndentSize": 2
    },
    "Javascript": {
      "RenameLocals": false,
      "PreserveImportantComments": true,
      "EvalTreatment": "Ignore",
      "TermSemicolons": true,
      "OutputMode": "SingleLine",
      "IndentSize": 2
    }
  },
  "Autoprefix": {
    "Enabled": false,
    "ProcessingOptions": {
      "Browsers": [
        "last 4 versions"
      ],
      "Cascade": true,
      "Add": true,
      "Remove": true,
      "Supports": true,
      "Flexbox": "None",
      "Grid": "None",
      "IgnoreUnknownVersion": false,
      "Stats": "",
      "SourceMap": false,
      "InlineSourceMap": false,
      "SourceMapIncludeContents": false,
      "OmitSourceMapUrl": false
    }
  },
  "CompilerSettings": {
    "Sass": {
      "IndentType": "Space",
      "IndentWidth": 2,
      "OutputStyle": "Nested",
      "Precision": 5,
      "RelativeUrls": true,
      "LineFeed": "Lf",
      "SourceMap": false
    }
  },
  "Output": {
    "Preserve": true
  }
}

I see that in the current source, you are writing files with UTF-8 BOM even when dealing with existing files: https://github.com/excubo-ag/WebCompiler/blob/4d8a7234ceeca19fc8dd386f7d3ef43061849041/WebCompiler/Compile/Compiler.cs#L10

I think we should preserve the encoding of existing files to avoid this change: https://github.com/excubo-ag/WebCompiler/blob/4d8a7234ceeca19fc8dd386f7d3ef43061849041/WebCompiler/Compile/Compiler.cs#L19

stefanloerwald commented 3 years ago

Hi @karan-kang,

I share your frustration with "changes" to files in version control that are just encoding differences. That really is annoying. However, I think this should be avoided in a different way entirely: I would remove the generated files from git. Only source files should be contained in the repo. This also helps with merging between branches: generated files, especially minified ones, behave badly under merge, as diffing algorithms have a harder time finding the semantic difference.

Should this not be possible for your scenario, I'd be happy to review a PR on this topic. It shouldn't be too hard to change the behavior to "if file exists, keep encoding, otherwise use UTF8 with BOM". I don't think it needs to be configurable.

BR Stefan
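The rule suggested in this comment ("if file exists, keep encoding, otherwise use UTF8 with BOM") could be sketched roughly as follows. This is an illustrative Python sketch, not WebCompiler's actual code (which is C#). The function name `write_preserving_bom` is hypothetical, and the sketch assumes the only distinction that matters is UTF-8 with or without a BOM. It does not attempt to detect other encodings such as UTF-16.

```python
import codecs
import os


def write_preserving_bom(path, text):
    """Write `text` to `path` as UTF-8, keeping the target's existing BOM state.

    Hypothetical helper for illustration only:
    - existing file: emit a BOM only if the file already starts with one
    - new file: emit a BOM (the default discussed in this thread)
    Returns True if a BOM was written.
    """
    if os.path.exists(path):
        with open(path, "rb") as f:
            use_bom = f.read(3) == codecs.BOM_UTF8
    else:
        use_bom = True

    data = text.encode("utf-8")
    if use_bom:
        data = codecs.BOM_UTF8 + data

    with open(path, "wb") as f:
        f.write(data)
    return use_bom
```

With this approach, recompiling an unchanged source file should produce byte-identical output, so source control has nothing to report.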

karan-kang commented 3 years ago

@stefanloerwald Thanks for the feedback! Unfortunately, for our internal use case it's not possible to exclude these files from git source control. I will submit a PR for this in the coming weeks.