zzzprojects / html-agility-pack

Html Agility Pack (HAP) is a free and open-source HTML parser written in C# to read/write DOM and supports plain XPATH or XSLT. It is a .NET code library that allows you to parse "out of the web" HTML files.
https://html-agility-pack.net
MIT License

Fix copyright symbol encoding; force UTF-8-BOM encoding for C# files #526

Closed n-ski closed 11 months ago

n-ski commented 11 months ago

Issue

When opening some files in IDEs like Visual Studio or Rider, the developer is notified that the file was opened with UTF-8 encoding. Visual Studio reports that some characters were replaced with the substitution character (�) and that saving the file will not preserve the original contents. Rider also performs the character replacement, but preserves the original contents when the file is saved.

Visual Studio 2022: ![](https://github.com/zzzprojects/html-agility-pack/assets/57114830/498744ea-d9f9-4872-b435-7684370b5660) Shows up as a modal window that forces you to interact with it.
Rider: ![](https://github.com/zzzprojects/html-agility-pack/assets/57114830/6b9afde2-3fd7-47ec-ac35-7bac68980783) Shows up as a notification in the notification bar above the code editor and can be ignored.

This happens because some files are encoded in ISO-8859-1, where the copyright symbol is a single byte (0xA9). On its own, that byte is not a valid UTF-8 sequence, so an IDE or text editor that reads the file as UTF-8 has to perform the character substitution.
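
For illustration, here is a minimal Python sketch of the byte-level problem and of a re-encode to UTF-8 with a BOM. It is not the script used for this PR. The header line and the helper names `decodes_as_utf8` and `to_utf8_bom` are hypothetical, and it assumes the affected files really are ISO-8859-1.

```python
import codecs

# Hypothetical file header as it would be stored in ISO-8859-1:
# the copyright symbol is the single byte 0xA9.
latin1_bytes = b"// Copyright \xa9 Example\n"


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8_bom(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode bytes from source_encoding to UTF-8 with a leading BOM."""
    text = data.decode(source_encoding)
    return codecs.BOM_UTF8 + text.encode("utf-8")


# Roughly what an editor assuming UTF-8 ends up showing:
# the lone 0xA9 byte is replaced with U+FFFD.
substituted = latin1_bytes.decode("utf-8", errors="replace")

# After conversion the symbol is the two-byte sequence 0xC2 0xA9,
# preceded by the BOM bytes 0xEF 0xBB 0xBF.
converted = to_utf8_bom(latin1_bytes)
```

With the BOM in place, an editor no longer has to guess the encoding, which is the point of forcing UTF-8-BOM for the C# files.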

Changes

Result

Developers are no longer interrupted by notifications or popups, and in Visual Studio there is no longer a risk of accidentally committing the substitution character.

JonathanMagnan commented 11 months ago

Hello @n-ski ,

Thanks a lot for your contribution.

We will review and merge it this week.

Best Regards,

Jon

JonathanMagnan commented 11 months ago

Hello @n-ski ,

Thank you again for your contribution. Your fix is now merged.

The BOM issue might come back on a few files later today, as our most current source is not actually on GitHub. I will move all our latest updates to GitHub and keep only our personal projects in our private repository instead, which makes a lot more sense.

Best Regards,

Jon