danielpalme / ReportGenerator

ReportGenerator converts coverage reports generated by coverlet, OpenCover, dotCover, Visual Studio, NCover, Cobertura, JaCoCo, Clover, gcov or lcov into human readable reports in various formats.
https://reportgenerator.io
Apache License 2.0
2.56k stars 279 forks

Code Coverage report response causes garbled characters #652

Closed vinayakmsft closed 5 months ago

vinayakmsft commented 6 months ago

There is an API (https://dev.azure.com/shkinosh/6f274600-4e62-414a-b9dc-e51cfa02f0ec/_apis/test/CodeCoverage/browse/30505214/Code%20Coverage%20Report_995/._py.html) which responds without specifying a charset in the Content-Type header of the response.

As a result, the encoding used in this iframe is determined by the browser's guess. In some cases an inappropriate encoding is chosen, and source code that contains non-ASCII characters appears garbled in the Code Coverage screen.
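To illustrate the kind of garbling described here, the following minimal sketch decodes UTF-8 bytes with a wrong encoding. Latin-1 is only an assumed stand-in for whatever encoding a browser might guess; the sample string is hypothetical and not taken from the repro pipeline.

```python
# Hypothetical sample text; any non-ASCII string behaves the same way.
original = "日本語のコメント"

# The file on disk is correctly encoded as UTF-8.
utf8_bytes = original.encode("utf-8")

# Assumption: the browser guesses a single-byte encoding (Latin-1 here)
# because the response carries no charset. Every byte becomes one character.
garbled = utf8_bytes.decode("latin-1")

# The bytes themselves are intact: reversing the wrong decode restores the text.
recovered = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))
print(recovered == original)
```

The point of the sketch is that the data is not corrupted; only the interpretation of the bytes is wrong, which is why declaring the charset is enough to fix the display.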

Garbled characters in code coverage tab:

image

I implemented a simple Python application to publish a code coverage report in Cobertura format. With this repro pipeline, I confirmed that Japanese characters are garbled even though the source is encoded in UTF-8.

image

This issue can be resolved by overriding the response header so that the Content-Type header includes "charset=utf-8".
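As a rough sketch of that header change, the hypothetical helper below appends `charset=utf-8` to a Content-Type value when no charset parameter is present. The function name is an assumption made for illustration; it is not part of Azure DevOps or ReportGenerator, and it only shows the string transformation, not how the service builds its responses.

```python
def with_utf8_charset(content_type: str) -> str:
    """Return the Content-Type value with charset=utf-8 added if none is set."""
    # Split "type/subtype; param=value; ..." into trimmed, non-empty parts.
    parts = [p.strip() for p in content_type.split(";") if p.strip()]
    # Leave the header untouched when a charset parameter already exists.
    if any(p.lower().startswith("charset=") for p in parts[1:]):
        return content_type
    return "; ".join(parts + ["charset=utf-8"])


print(with_utf8_charset("text/html"))
print(with_utf8_charset("text/html; charset=Shift_JIS"))
```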

image image

I have found the following two similar reports made so far, but in both cases the cause of the garbling seems to have been misunderstood as "use of multibyte characters in an encoding other than UTF-8".

In fact, this issue occurs even if the delivered HTML contains strings encoded in UTF-8. The problem occurs because the charset is not specified in the header. Azure DevOps is fully compatible with UTF-8, so I think it is reasonable to expect UTF-8 strings to be displayed without any problem in this Code Coverage screen.

Is it possible to make a change to specify a charset in the Content-Type header of the response of the above API? Or is there some reason that I am not aware of why the charset is not specified?

vinayakmsft commented 6 months ago

Hi @danielpalme, please let me know your thoughts on this issue.

danielpalme commented 6 months ago

@vinayakmsft I will get back to you within the next few days.

danielpalme commented 6 months ago

@vinayakmsft ReportGenerator generates the HTML files (UTF-8 encoded).

Then these files are probably published via the PublishCodeCoverageResults task within Azure DevOps.

The report can be accessed via the "Code Coverage" tab. Or you can download the corresponding artifact and view the report locally:

image

If you download the artifact and view the HTML files locally, is the report displayed correctly?

I can't change the behavior of Azure DevOps. If it does not send "charset=utf-8" in the Content-Type header, I can't change that. I can only change the generated HTML files.
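One thing that can be checked on the generated files themselves is whether they declare their encoding in a `<meta>` tag, which browsers consult when the response header carries no charset. The sketch below is an assumption-laden illustration: `declared_charset` is a hypothetical helper, the regular expression is a simplification of the browser's real detection logic, and the sample bytes are not taken from an actual ReportGenerator report.

```python
import re

# Simplified pattern: matches both <meta charset="..."> and the
# http-equiv form with "charset=" inside the content attribute.
META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_-]+)", re.IGNORECASE
)


def declared_charset(html_bytes: bytes):
    """Return the charset declared in the first 1024 bytes, or None."""
    # Browsers only look for the declaration near the start of the document.
    match = META_CHARSET.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None


# Hypothetical page header used only for demonstration.
sample = b'<!DOCTYPE html><html><head><meta charset="utf-8"></head>'
print(declared_charset(sample))
```

Running this against a downloaded report file (read in binary mode) would show whether the HTML already carries its own encoding declaration, independent of what the server sends.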