ebourg / jsign

Java implementation of Microsoft Authenticode for signing Windows executables, installers & scripts
https://ebourg.github.io/jsign
Apache License 2.0

Invalid signature for scripts with accented characters #123

Closed VinnyVynce closed 10 months ago

VinnyVynce commented 2 years ago

Hi! I started using jsign recently in our workflow and first I want to say thanks for all the work that went into the project! About half of our PowerShell scripts fail with an invalid digital signature error, and I think I found the cause. It seems that with the current version (4.0) we cannot sign any PowerShell script that includes accented characters. Here's a sample script:

# Comment with french accents: àéèù
Write-Output "Working?"

EDIT: The problem also happens in VBScript:

Set objShell = CreateObject("WScript.Shell")

' Comment with french accents: àéèù
objShell.Popup "Working?"

Removing the French accented characters resolves the issue and gives us a correctly signed script, but removing comments is not really a solution. I've forced UTF-8, but it doesn't fix the issue either. Opening the file again in PowerShell ISE also shows some data corruption: the accents become àéèù, which looks like bad encoding at some point in the process.
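
(For illustration, a standalone Java sketch, not anything from jsign: this kind of garbling is what you typically get when UTF-8 bytes are re-read with a single-byte charset.)

import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        byte[] utf8 = "àéèù".getBytes(StandardCharsets.UTF_8);
        // Re-decoding the UTF-8 bytes with a single-byte charset garbles the accents:
        // prints "Ã Ã©Ã¨Ã¹" (the second byte of 'à' becomes a non-breaking space)
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1));
    }
}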
Thanks!

VinnyVynce commented 2 years ago

Okay, I found a workaround: my files were saved as UTF-8, and saving them as UTF-8 with BOM before signing fixed the issue. The files show the accented characters correctly in UTF-8, but once signed they get mangled, and I don't think they should. Edit: while a UTF-8-with-BOM VBScript produces a valid signature, the BOM doesn't seem to be supported and the script crashes on its first character.

ebourg commented 2 years ago

Did you set the --encoding parameter? If the parameter isn't specified, jsign honors the BOM or assumes it's UTF-8 encoded.
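
For reference, BOM sniffing along these lines is enough to tell the cases apart (a rough standalone illustration, not necessarily jsign's exact logic):

import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomSniffer {
    /** Rough illustration of BOM-based charset detection (not necessarily jsign's exact logic). */
    public static Charset detect(byte[] content, Charset fallback) {
        if (content.length >= 3
                && content[0] == (byte) 0xEF && content[1] == (byte) 0xBB && content[2] == (byte) 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (content.length >= 2 && content[0] == (byte) 0xFF && content[1] == (byte) 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        if (content.length >= 2 && content[0] == (byte) 0xFE && content[1] == (byte) 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        return fallback; // no BOM: fall back to the configured or default encoding
    }
}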

VinnyVynce commented 2 years ago

Hi @ebourg, yes, I tried with --encoding UTF-8 and it still leads to an invalid digital signature:

java -jar ~/jsign-4.0.jar --keystore certificates.pfx --storepass ******** --replace -d SHA-256 --encoding UTF-8 myScript.ps1

Running the exact same command, but with my script encoded as UTF-8 with BOM, leads to a valid digital signature.

ebourg commented 2 years ago

Did you try with --encoding ISO-8859-1 ?

VinnyVynce commented 2 years ago

No, I hadn't tried that, so I gave it a shot. With my file saved as UTF-8, adding --encoding ISO-8859-1 leads to an invalid digital signature, with the accented characters unaffected. Changing the file encoding to ISO-8859-1 and then signing with --encoding ISO-8859-1 leads to a valid signature, but the accented characters are affected.

ebourg commented 2 years ago

So if the file is encoded in UTF-8 without a BOM and signed with --encoding UTF-8 (or nothing), the verification fails, correct? I'll check that.

VinnyVynce commented 2 years ago

Yes, but the file does need accented characters in it (they can be in comments or variable data). The original message has a working example that should lead to an invalid signature. When I save the file as UTF-8 with BOM and sign it, it works with accented characters. However, like I said, VBS doesn't seem to support a BOM at all.

ebourg commented 1 year ago

I've tried signing a UTF-8 encoded .vbs file with no BOM:

MsgBox "Halló heimur!"

If --encoding isn't specified the signature is invalid:

jsign --keystore keystore.jks --keypass password hello-world-utf8.vbs

If --encoding UTF-8 is specified the signature is valid:

jsign --keystore keystore.jks --keypass password --encoding UTF-8 hello-world-utf8.vbs

This case is covered by the testSignUTF8 unit test in ScriptSignerTest, but I wanted to double check from the command line.

If you could send to ebourg@apache.org a sample file that fails to sign I'll investigate further.

ghost commented 1 year ago

Hi,

I think I have a similar problem when signing JavaScripts!

Example file for which I get "This digital signature is not valid" when signing: download the attached sv.txt and rename the file extension to .js.

ebourg commented 1 year ago

@Tomas-DevOps Did you set the --encoding parameter?

ebourg commented 1 year ago

@Tomas-DevOps I've been able to reproduce the issue, even with --encoding UTF-8 the signature is invalid.

ebourg commented 10 months ago

@VinnyVynce @Tomas-DevOps Could you try with the parameter --encoding windows-1252?

signtool assumes .js and .vbs files (without byte order marks) to be encoded in Windows-1252, even if the file is actually in UTF-8. Jsign, on the other hand, assumes the encoding to be ISO-8859-1, but that's not equivalent to Windows-1252. The 0x7F-0x9F range is undefined in ISO-8859-1, and if a UTF-8 character is encoded with one of these bytes, Jsign sees a different character than signtool does.

For example, the sv.txt file contains the Unicode character '‰' (U+2030), which is encoded in UTF-8 as 0xE2 0x80 0xB0. The middle byte 0x80 turns into a control character in ISO-8859-1 (then converted to 0x8000 in UTF-16LE when hashing) and into the € symbol in Windows-1252 (converted to 0xAC20 in UTF-16LE), so the two tools compute different hashes.
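
To make the byte-level difference concrete, here is a small standalone Java sketch (illustration only, not Jsign code) that decodes that middle byte with both charsets and re-encodes the result as UTF-16LE, the form that gets hashed according to the explanation above:

import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    public static void main(String[] args) {
        byte[] middleByte = { (byte) 0x80 };  // second byte of the UTF-8 sequence for '‰' (U+2030)

        String latin1 = new String(middleByte, StandardCharsets.ISO_8859_1);     // U+0080, a control character
        String cp1252 = new String(middleByte, Charset.forName("windows-1252")); // U+20AC, the euro sign '€'

        // Bytes that end up being hashed, after conversion to UTF-16LE
        printHex(latin1.getBytes(StandardCharsets.UTF_16LE)); // 80 00
        printHex(cp1252.getBytes(StandardCharsets.UTF_16LE)); // AC 20
    }

    private static void printHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        System.out.println(sb.toString().trim());
    }
}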

Jsign could be changed to use Windows-1252 instead of ISO-8859-1, but an issue remains with the byte values undefined in Windows-1252: 0x81, 0x8D, 0x8F, 0x90 and 0x9D.

For example, the character 'ẍ' (U+1E8D) is encoded in UTF-8 as 0xE1 0xBA 0x8D. With the encoding set to Windows-1252, Jsign turns the last byte into a replacement character (U+FFFD), while signtool keeps it as is and hashes 0x8D00 instead. Ironically, when the encoding parameter is set to ISO-8859-1, Jsign handles this character properly.

So the right solution seems to be a custom charset based on Windows-1252 that preserves the undefined values like ISO-8859-1 does, instead of turning them into replacement characters.
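
A minimal sketch of that idea (illustration only, skipping the full java.nio.charset.Charset plumbing, and not the actual Jsign change): decode with Windows-1252, but pass the five undefined byte values through unchanged, the way ISO-8859-1 does.

import java.nio.charset.Charset;

public class Cp1252LenientDecoder {
    private static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    /** Decodes like Windows-1252, but keeps the undefined bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D)
        as their raw code points instead of turning them into U+FFFD. */
    public static String decode(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            int value = b & 0xFF;
            if (value == 0x81 || value == 0x8D || value == 0x8F || value == 0x90 || value == 0x9D) {
                sb.append((char) value);  // same behaviour as ISO-8859-1
            } else {
                sb.append(new String(new byte[] { b }, WINDOWS_1252));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] xDiaeresis = { (byte) 0xE1, (byte) 0xBA, (byte) 0x8D }; // UTF-8 bytes of 'ẍ' (U+1E8D)
        // The last byte stays 0x8D instead of becoming U+FFFD, matching what signtool hashes.
        System.out.printf("%04X%n", (int) decode(xDiaeresis).charAt(2)); // 008D
    }
}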

ebourg commented 10 months ago

Another point: the default charset used by signtool depends on the system locale and isn't guaranteed to be Windows-1252, so the validity of a signature is locale-dependent. A script with non-ASCII characters signed on a Western European system will be invalid on a Cyrillic system. The only way to prevent this is to have a byte order mark at the beginning of the script.
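
For file types that accept a BOM (PowerShell does; VBScript apparently doesn't, as reported above), prepending the UTF-8 BOM before signing is straightforward. A hedged sketch with a hypothetical helper, not part of Jsign:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class AddUtf8Bom {
    private static final byte[] UTF8_BOM = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };

    /** Prepends a UTF-8 BOM to the script if it doesn't already start with one. */
    public static void addBom(Path script) throws IOException {
        byte[] content = Files.readAllBytes(script);
        if (content.length >= 3
                && content[0] == UTF8_BOM[0] && content[1] == UTF8_BOM[1] && content[2] == UTF8_BOM[2]) {
            return; // already has a BOM
        }
        byte[] withBom = new byte[UTF8_BOM.length + content.length];
        System.arraycopy(UTF8_BOM, 0, withBom, 0, UTF8_BOM.length);
        System.arraycopy(content, 0, withBom, UTF8_BOM.length, content.length);
        Files.write(script, withBom);
    }
}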