[Feature]: Support hardware-based client certificate authentication via public cert and signature callback

🚀 Feature Request

Enhance Playwright's client certificate authentication so that users can build their own integration with smart card middleware (such as OpenSC). Playwright would accept a public certificate (the actual certificate contents, not a file path) and a callback that it calls to request signatures, instead of always requiring direct access to private keys or passphrases.

Example

This would allow secure, hardware-based authentication while maintaining the security of private keys on smart cards and other hardware tokens.

Motivation

This is a follow-up to my request from last week, #32200, and I really appreciate the work by @mxschmitt on it.

It's possible that what I have said may be interpreted as a duplicate request to #32003 but please note that the thing we're really after that is relevant to Playwright is nowhere even on the same continent as blockchain, let alone ballpark. So I think the trigger may have been pulled to axe that request a little too hastily. As I understand it, smart cards using this type of authentication via signatures instead of exposing private keys date back to the 1990s at least. This is very old, very traditional, well-established mainstream technology, not newfangled blockchain chicanery.

But it's also technology that I'm very new to personally so I may be stating some things wrong. Please bear with me on that.

I think that what is being asked here really isn't specific to smart cards: it applies any time authentication via client certificate is needed without directly exposing the private key, whether the thing concealing the private key is a smart card, another remote system or anything else. So the Playwright API doesn't need to know about smart cards, but I think it does need to know about signatures (because it doesn't always need to know the private key, or at least not directly).

I think the project I'm needing this for would have been way simpler if there was just some way to automate making the mouse click on that "Select certificate" dialog box and letting the browser handle the actual TLS authentication rather than having Playwright handle it. If the process worked that way then I wouldn't have been requesting this stuff.

I don't mean to complain: I just was trying to think of alternative approaches to doing Playwright automated browser operations on a site with smart card authentication.

I don't know if I made this clear enough in my request, but Playwright should not be building the integrations with specific smart card middleware. Whether it's OpenSC or Microsoft's own Windows smart card support or any other middleware, Playwright should just provide the minimum necessary API to use any one of them and let the users implement their own integrations based on the specific needs of their apps.

I wrote the following high level summary to explain the problem I'm working on and how this issue affects it. Even though I didn't originally write this to be specific to Playwright, I'm posting it here because I believe it provides valuable context as to the goals/motivations behind this feature request and how the feature would be expected to work, if implemented in Playwright:

I'm trying to get browser automation software to do hardware-based authentication for a protected site using OpenSC.

To design the script for this, a clear understanding is required of exactly which operations need to be handled by the browser, the browser automation software, the smart card middleware and most important of all, which operations need to be handled by the actual physical chip on the actual physical smart card which can't be done anywhere else without defeating the security which is provided by the smart card. We can't hand-wave any of that and still get a coherent design.

We could swap out Windows for Linux for our OS or swap out Chromium for Firefox for our browser or swap out Playwright for Puppeteer for our browser automation software or swap out Typescript for Python for our scripting language or swap out OpenSC for Windows native smart card support for our smart card middleware, but the same operations would need to be done using the same information at the same levels of whichever tech stack we choose. I believe all of those would be the components involved in any tech stack to script this.

Assuming we're using OpenSC, this is my current understanding of the process and the OpenSC commands necessary as of 2024-08-21:

First, we list available smart card readers with: opensc-tool --list-readers

Second, once we've selected a smart card to use, we would list the certificates on the card with pkcs15-tool --reader <reader number> --list-certificates and retrieve the public certificate for PIV with pkcs15-tool --reader <reader number> --read-certificate <cert number>. We can optionally add a parameter --output <file path> to write the certificate to a .PEM file, or we can just take the certificate straight from the console output of that command. Claude (the LLM) says that the PowerShell for finding just the number of the PIV cert using regex replacement would be: $pivCertId = (pkcs15-tool --reader <reader number> --list-certificates | Select-String "PIV Authentication" -Context 1,0) -replace '.*ID\s*:\s*(\w+).*', '$1'

Third, we would need to identify the number of the key associated with the PIV public cert on the smart card. We would do this with the OpenSC command pkcs15-tool --reader <reader number> --list-keys. Claude says that the PowerShell to identify the number of the key associated with the previously identified PIV cert using regex replacement would be: $pivKeyId = (pkcs15-tool --reader <reader number> --list-keys | Select-String -Context 0,5 "ID\s*:\s*$pivCertId") -replace '.*ID\s*:\s*(\w+).*', '$1'

Fourth, during the TLS handshake, the server is going to request a signature, which is some data that should be signed using the private key on the smart card to prove that we do have the private key (within the smart card) even though we're not going to directly expose or transmit the private key. Signing is a conceptually distinct / separate operation from encrypting because its purpose is not to conceal the information being signed but is instead just to provide proof that we have the private key without actually sharing the private key.

The chip on the smart card is going to sign the data and give us the signature without actually telling us the private key it is going to use. That's why we're referencing a number to identify which key instead of directly referencing the actual key: because only the chip on the smart card has the actual private key. We don't have direct access to the private key and neither should the browser automation software nor even the smart card middleware. We will need to provide the PIN to the smart card (but not to the server) in order to prove to the card that we're authorized to sign data. This indirect use of the private key to sign data through the smart card without direct exposure of the private key is how smart cards provide security. Any direct exposure of the private key would defeat the security provided by the smart card.

The data to be signed is usually expected to consist of all the previous messages in the handshake process. It can't be calculated in advance of making the TLS handshake request and would need to be implemented as a callback from the browser automation software because we can't know in advance what the data to be signed will be, since that is decided by the server.

The relevant OpenSC command to request the signature from the smart card would be: pkcs15-crypt --reader <reader number> --sign --pin <PIN> --key <key_id> --input <hash_in_hex>
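The four steps above can be sketched as a set of small helpers that only assemble the argument lists for the OpenSC commands quoted in this post. This is a rough sketch, not a tested integration: the function names and the Command type are made up for illustration, the flags are copied from the commands as written above, and the --input value is passed through exactly as described there (it would be worth checking the pkcs15-crypt man page, since that flag may expect a file path rather than an inline hex string).

```typescript
// Hypothetical helpers: they only build argument lists for the OpenSC
// commands quoted in the text. Nothing here talks to a card or reader.
type Command = { program: string; args: string[] };

// Step 1: list the available smart card readers.
function listReaders(): Command {
  return { program: "opensc-tool", args: ["--list-readers"] };
}

// Step 2a: list the certificates on the card in the chosen reader.
function listCertificates(reader: number): Command {
  return {
    program: "pkcs15-tool",
    args: ["--reader", String(reader), "--list-certificates"],
  };
}

// Step 2b: read one certificate, optionally writing it to a file.
function readCertificate(reader: number, certId: string, outputPath?: string): Command {
  const args = ["--reader", String(reader), "--read-certificate", certId];
  if (outputPath !== undefined) {
    args.push("--output", outputPath);
  }
  return { program: "pkcs15-tool", args };
}

// Step 3: list the keys so the one matching the certificate can be found.
function listKeys(reader: number): Command {
  return { program: "pkcs15-tool", args: ["--reader", String(reader), "--list-keys"] };
}

// Step 4: ask the card to sign. The private key is referenced by ID only.
// `input` is forwarded as-is; see the caveat about --input in the lead-in.
function signWithCard(reader: number, pin: string, keyId: string, input: string): Command {
  return {
    program: "pkcs15-crypt",
    args: ["--reader", String(reader), "--sign", "--pin", pin, "--key", keyId, "--input", input],
  };
}
```

Each Command could then be handed to something like Node's child_process.execFile(cmd.program, cmd.args), which passes the arguments without going through a shell, so the PIN and IDs need no quoting.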

These two inputs of the public certificate and the signature would be what I would expect the API of the browser automation software (probably Playwright) to accept for doing this kind of hardware-based TLS authentication. The public cert could be accepted either as a string or as a data stream/buffer that I could generate from a string. The signature would need to be provided by a callback function because the server gets to decide what the data to be signed will be during the TLS handshake process, preventing us from calculating it in advance.
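To make the "public part plus signature callback" idea concrete, here is a minimal sketch using Node's built-in crypto module. None of this is Playwright API: SignCallback, createFakeCard and proveKeyPossession are hypothetical names, an in-memory RSA key pair stands in for the smart card, and a bare public key stands in for the public certificate. The point is only that the verifying side never touches the private key; it receives the public part and a function it can call with whatever data needs signing.

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";

// Hypothetical callback shape: given the data chosen during the handshake,
// return a signature over it. Not part of any existing Playwright API.
type SignCallback = (dataToSign: Buffer) => Promise<Buffer>;

// Stand-in for a smart card. The private key exists only inside this
// closure; callers get the public key and a way to request signatures.
function createFakeCard(): { publicKeyPem: string; sign: SignCallback } {
  const { publicKey, privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });
  return {
    publicKeyPem: publicKey.export({ type: "spki", format: "pem" }).toString(),
    sign: async (data) => createSign("sha256").update(data).sign(privateKey),
  };
}

// Stand-in for the side that needs proof of key possession. It only sees
// the public part and the callback, and the data is not known in advance.
async function proveKeyPossession(
  publicKeyPem: string,
  sign: SignCallback,
  handshakeData: Buffer,
): Promise<boolean> {
  const signature = await sign(handshakeData);
  return createVerify("sha256").update(handshakeData).verify(publicKeyPem, signature);
}
```

With a real card, the body of sign would call out to the middleware (for example the pkcs15-crypt command above) instead of using a local key, and the rest of the flow would stay the same.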

While Playwright has recently implemented client certificates for TLS authentication, their current API for this feature requires directly exposing the private key, making it fundamentally incompatible with hardware-based authentication methods.

I thought of a good metaphor:

Exposing the private key stored on a smart card would be like exposing the core of a nuclear reactor. It's just fundamentally not how the device is supposed to work. Not ever. Not even once.

We get electricity from the nuclear power plant like we get a signature from the smart card, but we don't want what's actually inside these devices to get out.

Thanks for taking time to pinpoint these issues.

It's possible that what I have said may be interpreted as a duplicate request to #32003 but please note that the thing we're really after that is relevant to Playwright is nowhere even on the same continent as blockchain, let alone ballpark. So I think the trigger may have been pulled to axe that request a little too hastily. As I understand it, smart cards using this type of authentication via signatures instead of exposing private keys date back to the 1990s at least. This is very old, very traditional, well-established mainstream technology, not newfangled blockchain chicanery.

In fact, my issue was just misunderstood. I made the mistake of talking about blockchain when I was far from using it. I was just trying to add arguments to make the issue more valuable, but it backfired.

I just need to run tests where I log in with smart cards, and I too think the issue should be taken into consideration again. Would you please post a comment on it to give it some credit?

Exposing the private key stored on a smart card would be like exposing the core of a nuclear reactor. It's just fundamentally not how the device is supposed to work. Not ever. Not even once. We get electricity from the nuclear power plant like we get a signature from the smart card, but we don't want what's actually inside these devices to get out.

Plus, I would highlight that extracting the private key from a smart card is not even possible for a PIN-protected one.

Still, you mention pkcs15, which is a different protocol from pkcs11. Both are used with hardware certificates and they do not conflict; some cards even support both. Differences are explained here.

It appears that this problem goes way beyond just Playwright. Most TLS implementations just do not support making a request using signatures without exposing private keys such as would be required to use a PIN-protected smart card. It's looking like to do this at all would require ripping into the Chromium source code to see how it does TLS authentication and trying to extract that part to create and maintain my own TLS implementation library and I doubt that is ever going to happen.

Why can't Playwright just automate clicking the mouse on the "Select certificate" modal in the browser? If this can't be solved then I'm going to need to get rid of Playwright altogether, replacing it with a GUI macro.

(later edit) It turns out that the "Select certificate" modal is actually from the operating system, not the browser, and that's why Playwright can't touch it. I had been thinking that it could be possible to work around this by doing the TLS handshake outside Playwright and then passing in the session already established but I can't do it outside Playwright either because other TLS implementations don't support hardware-based authentication.

The TLS request requires the private key, but smart cards are designed to never expose the private key, but the TLS request requires the private key. I have yet to find any exceptions to that vicious circle.

It appears that all automated browser testing software, including Playwright and Puppeteer, is fundamentally incompatible with smart card TLS authentication at the operating system level on both Windows and Linux. Fixing this would not be a simple issue of changing a few lines of code in Playwright or extending it. The whole infrastructure of the TLS implementation, the browser and/or the operating system would need to be changed for this. So the only way to automate controlling the browser for a site like this is going to be in a GUI macro.

This is a conclusion I've reached after researching this problem for several weeks now. Please show where I'm wrong if anyone has any additional information I haven't found. Please don't use LLMs for this topic because they only hallucinate solutions which don't and can't exist.

I did manage to find a solution, which was to write my browser test using PuppeteerSharp. When it commands the browser to navigate to the secure site, I simply set it to timeout: 0 for no timeout on that and then let the user put in their info manually. The browser automation then resumes when the page finishes loading. PuppeteerSharp is the only browser automation software for which I've managed to get such a "pause for manual intervention then resume" approach to work, despite trying a number of other approaches in Python and TypeScript.
