trailofbits / uthenticode

A cross-platform library for verifying Authenticode signatures
https://trailofbits.github.io/uthenticode/
MIT License
133 stars 33 forks source link

Zeros embedded in subject and issuer fields when using certificates generated with makecert (e.g. self-signed for test-signing during driver development) #95

Open hugmyndakassi opened 5 months ago

hugmyndakassi commented 5 months ago

It appears as if makecert writes Windows-style wchar_t* into these fields verbatim instead of encoding them as, say, UTF-8. This is an issue as it will lead to the strings being interpreted as single-character string, given every other character is a zero byte.

I worked around this locally by creating a function that filters these:

static inline std::string filter_escaped_zero_bytes(char const* instr)
{
    auto input = std::string(instr);
    static std::string const needle = "\\x00";
    std::ostringstream os;
    std::size_t currpos = 0, oldpos{};
    do
    {
        oldpos = currpos;
        currpos = input.find(needle, currpos);
        if (std::string::npos == currpos)
        {
            break;
        }
        os << input.substr(oldpos, currpos - oldpos);
        currpos += needle.size();
    } while (true);
    os << input.substr(oldpos); // remainder
    return os.str();
}

... and using that in the ctor (Certificate::Certificate):

subject_ = filter_escaped_zero_bytes(subject.get());
issuer_ = filter_escaped_zero_bytes(issuer.get());

Not sure this is an issue you'll want to address at all, since it appears that MS and its makecert tool deviate from the standard here. Just wanted to bring this to your attention as well as to the attention of those watching the project.

Feel free to close this pretty much immediately.

PS: this isn't particularly optimized, because I only ever encountered this with those certificates created by makecert and we only use these in testing scenarios. But I'm sure there is still some room for improvement.

woodruffw commented 5 months ago

Thanks for the report! Yeah, I'm not sure if there's a great option for us here:

So, two ideas I have:

Thoughts?

woodruffw commented 5 months ago
  • The subject and issuer are both exposed as std::string intentionally, since consumers should treat them as a (ptr, len) pair that may have all kinds of weird bytes internally (such as embedded NULLs).

Ah, never mind -- I forgot that we call X509_NAME_oneline which boils it down to char *, causing all this pain. I think we need to replace that call with X509_NAME_print_ex and pass in ASN1_STRFLGS_UTF8_CONVERT, which will perform the UTF-8 conversion.