digitalbazaar / forge

A native implementation of TLS in Javascript and tools to write crypto-based and network-heavy webapps
https://digitalbazaar.com/

bad utf-8 encoding when reading a pem certificate #847

Open iamtxena opened 3 years ago

iamtxena commented 3 years ago

Hello, I am trying to generate a CMS signature. My code works with an x509 certificate that has no special characters, but with a certificate whose issuer attributes contain accents, verification fails because the issuer name does not match.

This happens because the issuer text comes out with the wrong encoding.

I have attached the certificate that fails (certificate.txt). Just rename it to .pem.

When using openssl to read this certificate:

$ openssl x509 -noout -text -in tests/data/certificate.pem

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            2e:0c:c0:ec:99:62:a3:a4:38:b5:92:88:fa:87:da:a7:21:ea:24:88
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = ES, ST = CAT, L = Barcelona, O = Internet Widgits Pty Ltd, OU = ValidatedId, CN = C\C3\A0nary
...

As you can see, the CN contains two UTF-8 bytes, \C3\A0, which together represent the à character.
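For reference, here is a minimal Node.js sketch of how à maps to those two bytes. The `toOpensslEscapes` helper is a made-up name for illustration, and it only imitates the escaping of non-ASCII bytes seen in the output above, not every escaping rule openssl applies.

```javascript
// Hypothetical helper (not from openssl or node-forge): renders the UTF-8
// bytes of a string, escaping every non-ASCII byte as \XX in uppercase hex.
function toOpensslEscapes(text) {
  return Array.from(Buffer.from(text, 'utf8'))
    .map((byte) =>
      byte < 0x80
        ? String.fromCharCode(byte)
        : '\\' + byte.toString(16).toUpperCase().padStart(2, '0'))
    .join('');
}

// '\u00e0' is the precomposed à character.
const escaped = toOpensslEscapes('C\u00e0nary');
console.log(escaped); // C\C3\A0nary
```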

When I read the PEM certificate with this library ("node-forge": "^0.10.0") and print the issuer, the value comes out wrong:

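const { pki } = require("node-forge");
// pemCert holds the PEM text of the attached certificate
const x509ForgeCert = pki.certificateFromPem(pemCert);
console.log(x509ForgeCert.issuer.getField("CN"));
// Output:
{
      type: '2.5.4.3',
      value: 'CÃ nary',
      valueTagClass: 12,
      name: 'commonName',
      shortName: 'CN'
 }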
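The garbled value looks like what you get when each UTF-8 byte is read as its own Latin-1 character. This sketch reproduces and reverses that effect with Node's `Buffer` only. It does not call node-forge, so treat it as an illustration of the symptom, not a statement about forge's internals.

```javascript
const expected = 'C\u00e0nary'; // "Cànary", 6 characters

// Read every UTF-8 byte as one Latin-1 character: 0xC3 -> Ã, 0xA0 -> U+00A0.
const garbled = Buffer.from(expected, 'utf8').toString('latin1');
console.log(garbled.length); // 7, one character per byte

// Going back: turn the characters into bytes again and decode them as UTF-8.
const repaired = Buffer.from(garbled, 'latin1').toString('utf8');
console.log(repaired === expected); // true
```

Note that the "space" in the printed forge value would then be a no-break space (U+00A0), not a plain space.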

When I use the Node.js crypto module (v15.10) instead, it prints the expected, correctly decoded string:

const { X509Certificate } = require("crypto");
const x509cert = new X509Certificate(pemCert);
const { issuer } = x509cert;
console.log(issuer);
// Output:
    C=ES
    ST=CAT
    L=Barcelona
    O=Internet Widgits Pty Ltd
    OU=ValidatedId
    CN=Cànary

To pass the validation, I have to use the issuer parsed by Node's crypto module.
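A rough sketch of pulling individual fields out of an issuer string in the newline-separated form printed above. `parseIssuerText` is a hypothetical helper, and it assumes one `KEY=value` pair per line with no repeated keys and no escaped newlines, which may not hold for every certificate.

```javascript
// Hypothetical helper: splits "KEY=value" lines into a Map.
// Assumption: keys are unique; a repeated key would overwrite the earlier one.
function parseIssuerText(issuerText) {
  const fields = new Map();
  for (const line of issuerText.split('\n')) {
    const separator = line.indexOf('=');
    if (separator === -1) continue; // skip lines without a key
    fields.set(line.slice(0, separator).trim(), line.slice(separator + 1));
  }
  return fields;
}

// Sample text copied from the output above ('\u00e0' is à).
const issuerText = [
  'C=ES',
  'ST=CAT',
  'L=Barcelona',
  'O=Internet Widgits Pty Ltd',
  'OU=ValidatedId',
  'CN=C\u00e0nary',
].join('\n');

const issuerFields = parseIssuerText(issuerText);
console.log(issuerFields.get('CN'));
```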

Is there a way to make this library decode the string correctly as UTF-8?

deltazero-cz commented 2 years ago

I have a similar issue when reading PKCS#12.

Bag values (Subject and Issuer CommonName, for example) come back encoded as latin1, which is easy to translate to UTF-8.

However, the cert's FriendlyName is already in UTF-8, but incorrectly, e.g. "Èeská Republika" instead of "Česká republika".

mrprokes commented 7 months ago

I had the same problem while trying to do PKCS#7 encryption.

My solution was, after loading the certificate, to loop through the issuer fields, read each value as ASCII, and write it back as UTF-8, like this:

certificate.issuer.attributes.forEach(element => {
  // When encoding a string into a Buffer, Node treats 'ASCII' the same as 'latin1'
  element.value = Buffer.from(element.value, 'ASCII').toString('utf-8');
});
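A slightly more defensive variant of that loop, as a sketch only: it re-decodes a value only when its characters form valid UTF-8 bytes, so values that are already correct are left alone. `redecodeLatin1AsUtf8` is a hypothetical helper, and the `attributes` array is a stand-in for `certificate.issuer.attributes`, since the sketch does not load a real certificate.

```javascript
// Hypothetical helper, not part of node-forge.
function redecodeLatin1AsUtf8(value) {
  // Characters above U+00FF cannot be single bytes, so this is not a byte string.
  if (/[^\u0000-\u00ff]/.test(value)) return value;
  const bytes = Buffer.from(value, 'latin1');
  try {
    // fatal: true makes decode() throw on invalid UTF-8 instead of substituting U+FFFD.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch (err) {
    return value; // not valid UTF-8, keep the original text
  }
}

// Stand-in for certificate.issuer.attributes (assumed shape: objects with a `value`).
const attributes = [
  { shortName: 'CN', value: 'C\u00c3\u00a0nary' }, // UTF-8 bytes read as Latin-1
  { shortName: 'L', value: 'Barcelona' },          // plain ASCII
  { shortName: 'O', value: 'Caf\u00e9' },          // already correct, must not change
];

for (const attribute of attributes) {
  attribute.value = redecodeLatin1AsUtf8(attribute.value);
}
console.log(attributes.map((attribute) => attribute.value));
```

This is still a heuristic: a value that is genuinely meant to contain a sequence such as Ã followed by a no-break space would also be converted.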

The problem with the certificateFromPem() function is that it uses an obsolete method of reading base64; see:

var msg = {
  type: type,
  procType: null,
  contentDomain: null,
  dekInfo: null,
  headers: [],
  body: forge.util.decode64(match[3])
};

I hope it helps.