integrated-application-development / sonar-delphi

Delphi language plugin for SonarQube
GNU Lesser General Public License v3.0

File encoding mismatch between SonarQube and sonar-delphi APIs #136

Closed: zaneduffield closed this issue 11 months ago

zaneduffield commented 11 months ago

SonarDelphi version

1.0.0

SonarQube version

No response

Issue description

Currently, sonar-delphi decodes every file with the filesystem-level encoding provided by the sonar-scanner engine. However, the sonar-scanner engine actually determines the encoding of each file individually, using heuristics:

https://github.com/SonarSource/sonarqube/blob/f50873318397d4bf7ba6a5c2b194dfa02492bdae/sonar-scanner-engine/src/main/java/org/sonar/scanner/scan/filesystem/MetadataGenerator.java#L53
https://github.com/SonarSource/sonarqube/blob/f50873318397d4bf7ba6a5c2b194dfa02492bdae/sonar-scanner-engine/src/main/java/org/sonar/scanner/scan/filesystem/ByteCharsetDetector.java#L48

The detected encoding is then saved on the InputFile, where it can be accessed via InputFile::charset.

This can lead to incorrect file offsets when sonar-delphi decodes a file using a different charset than the one the sonar-scanner engine used for that file.
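To make the offset drift concrete, here is a minimal sketch in plain Python. It is only an illustration of the mechanism, not code from sonar-delphi or sonar-scanner-engine, and it uses the comment line from the reproduction in this report. The helper name `column_of` is hypothetical.

```python
# Illustration only: plain Python, not sonar-delphi or sonar-scanner-engine code.
line = (
    "// LEcranPrincipal doit être la 1ère vraie form créée "
    "pour être la main form fenêtre principale"
)

raw = line.encode("utf-8")        # the bytes as stored on disk
as_utf8 = raw.decode("utf-8")     # decoded with the file's real encoding
as_cp1252 = raw.decode("cp1252")  # decoded with the configured windows-1252


def column_of(text: str, needle: str) -> int:
    """Return the 0-based character offset of needle in text (hypothetical helper)."""
    return text.index(needle)


# Each accented letter is two bytes in UTF-8, and windows-1252 maps every
# byte to its own character, so each accent adds one extra character.
extra_chars = len(as_cp1252) - len(as_utf8)
utf8_col = column_of(as_utf8, "principale")
cp1252_col = column_of(as_cp1252, "principale")

print(f"extra characters when decoded as windows-1252: {extra_chars}")
print(f"column of 'principale': utf-8={utf8_col}, windows-1252={cp1252_col}")
```

Any column computed from the windows-1252 view is shifted right by the number of accented letters that precede it on the line, so an offset produced by one decoder may point at the wrong character, or past the end of the line, under the other.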

Steps to reproduce

Scan the following (UTF-8 encoded) source file:

program LineOffsetBug;

begin
                           // LEcranPrincipal doit être la 1ère vraie form créée pour être la main form fenêtre principale
end.

with sonar.sourceEncoding set to windows-1252.

Minimal Delphi code exhibiting the issue

No response

Cirras commented 11 months ago

In addition, SonarDelphi never discards the user-provided encoding, even when a BOM clearly indicates the encoding of the file. This is another detail that is inconsistent with sonar-scanner-engine and should be changed.
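As a rough sketch of the behaviour being asked for, a BOM check could take precedence over the configured encoding along these lines. This is plain Python for illustration and makes no claim to match the actual logic in sonar-scanner-engine or sonar-delphi. The function name `resolve_encoding` and the returned encoding labels are assumptions.

```python
import codecs

# Longer BOMs are listed before shorter ones because the UTF-32 LE BOM
# starts with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def resolve_encoding(data: bytes, configured: str) -> str:
    """Prefer the encoding indicated by a BOM; otherwise keep the configured one.

    Hypothetical helper: `configured` stands in for the user-provided
    sonar.sourceEncoding value.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return configured
```

With this shape, a file that begins with a UTF-8 BOM resolves to "utf-8" even when the configured value is "windows-1252", while a file without a BOM keeps the configured value.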