gitblit-org / gitblit

pure java git solution
http://gitblit.com
Apache License 2.0

Incorrect encoding for raw pages #1397

Open flaix opened 2 years ago

flaix commented 2 years ago

When viewing a raw text file that contains non-ASCII characters such as À or é, the text is not displayed correctly in the browser.

Steps to reproduce:

  1. Create a text file with e.g. content À la mode.
  2. Commit text file
  3. View the commit in Gitblit
  4. View the file of the commit -> The file content is displayed as it is in the file
  5. View the file via the raw link.

I would expect to see the text À la mode in the browser. Instead I get Ã€ la mode.

It seems that the file is rendered with the windows-1252 code page instead of the correct UTF-8 encoding. When viewing the file on a Gitblit page rather than as a raw file, the correct encoding is used.
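
To illustrate the suspected mix-up, here is a minimal Java sketch that encodes the sample text as UTF-8 and then decodes those bytes as windows-1252, which is what the browser appears to do. The class and method names (`MojibakeDemo`, `misdecode`) are made up for this illustration and are not part of Gitblit.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not Gitblit code.
class MojibakeDemo {

    // Encode the text as UTF-8, then decode the same bytes as windows-1252.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // "\u00C0" is À; unicode escapes keep the source file pure ASCII.
        String garbled = misdecode("\u00C0 la mode");
        // À is the two bytes C3 80 in UTF-8; windows-1252 maps them to
        // U+00C3 (Ã) and U+20AC (€).
        System.out.println(garbled.equals("\u00C3\u20AC la mode"));
    }
}
```

The two UTF-8 bytes of À turning into two separate windows-1252 characters matches the symptom described above.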

The web.blobEncodings setting is set to UTF-8 ISO-8859-1. This seems to be used for the web version, but the Content-Type header returned with the raw file is just text/plain. It should probably also use the encoding setting and return text/plain; charset=utf-8.
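
A rough sketch of what the suggested behaviour could look like, assuming the raw servlet has the blob bytes and the space-separated web.blobEncodings value at hand. It tries each configured charset in order and uses the first one that decodes the content without errors. The class `RawContentType` and method `contentTypeFor` are hypothetical names for this illustration; this is not how Gitblit actually implements raw serving.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.util.Locale;

// Hypothetical helper, not Gitblit code.
class RawContentType {

    // blobEncodings is assumed to be a space-separated list such as
    // "UTF-8 ISO-8859-1", mirroring the web.blobEncodings setting.
    static String contentTypeFor(byte[] content, String blobEncodings) {
        for (String name : blobEncodings.trim().split("\\s+")) {
            if (name.isEmpty()) {
                continue;
            }
            // Throws for charset names the JVM does not know.
            Charset cs = Charset.forName(name);
            try {
                // Strict decoding: fail on malformed or unmappable bytes
                // instead of silently substituting replacement characters.
                cs.newDecoder()
                  .onMalformedInput(CodingErrorAction.REPORT)
                  .onUnmappableCharacter(CodingErrorAction.REPORT)
                  .decode(ByteBuffer.wrap(content));
                return "text/plain; charset=" + cs.name().toLowerCase(Locale.ROOT);
            } catch (CharacterCodingException e) {
                // Not valid in this charset; try the next configured one.
            }
        }
        // Nothing matched: fall back to the header that is sent today.
        return "text/plain";
    }
}
```

In a servlet, the returned string could then be passed to `HttpServletResponse.setContentType`. Note that ISO-8859-1 accepts any byte sequence, so with the setting above it acts as a catch-all after UTF-8.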

Environment: Gitblit 1.9.1 on macOS. Browser Firefox and Safari.