Open kayavila opened 9 months ago
It's likely not server side. Possible it's a bug with PuTTY, PuTTY CAC, or OpenSC. What sort of hardware token do you have?
It's a Feitan token.
Okay, I don't have that type of token but I can try reproducing the issue with the OpenSC library on another token (might be a few days). Is it possible for you to try using CAPI subsystem with the minidriver instead of the PKCS library? (https://fido.ftsafe.com/wp-content/uploads/2023/08/minidriver-2023-06-09.zip)
I need to do some reading but I believe so. It will be a few days on my side though too. Thanks for testing.
I can't find any reference on Feitan's site to those drivers, though they're clearly hosted by them. Can I ask what's the trail for finding the download? Thanks.
I got the download link from here: https://fido.ftsafe.com/piv-user-instruction/
Thanks!
@kayavila Did you have luck testing CAPI?
Would you say it's timing as anything to do with it? For example, is it more likely to occur if there are no transactions with the card for a certain period of time? We attempt to maintain a perpetual handle to the library in order to avoid unnecessary PIN prompts. It's possible that the connection is timing out.
Unforuately, I don't have a lot of logging setup in the PKCS/CAPI processing due to how the logging works in PuTTY. However, if we can narrow in on the likely culprit, I provide a popup message in a private build if an error occurs in a suspect area.
I haven't been able to get it to work with CAPI. I get prompted by the Windows UI to enter the smart card PIN, and then receive the same message from SSH (sign_and_send_pubkey: signing failed for RSA "CAPI:..." from agent: incomplete message
).
Yes, my experience definitely correlates with the failed state happening if the connection has been idle for a while.
I've noticed that Pageant does crash on fairly regular basis as well, but this is mostly associated with the computer hibernating and then coming back awake.
If I provide a private build with some debug popups, would you be able to run that for me?
Yes, I'd be happy to do that.
@kayavila Please try the attached binaries. They should make popup messages when an error occurs. I'd be most interested in the first messages when an error occurs. If you get them when things, are working "normally", let me know and I may have to remove some just so you can operate.
puttycac-0.80u1.zip
puttycac-64bit-0.80u1.zip
Thanks for working on this. It didn't take too long for me to get a failed condition today. The error was cert_pkcs_sign:225:Login Error.
Did you get prompted for a PIN right before, were you using PIN caching, or did you just get the error (without PIN caching)?
Also, please try the attached. This should add an additional error code to the message that will help me decide how I think we should best handle the error.
I believe I just got the error, but I will have to pay attention more closely next time. Just as a heads up, unfortunately I'm tied up for the next few days and may not be able to test until next week.
Alright. I'll actually be on travel all next week without access to my development system so no rush from my end. Based on where it's erroring out, either the cached PIN is getting corrupted (if you've enabled caching), the PIN isn't prompting properly (if you have caching disabled), or there's something internal to how OpenSC handles that token such that we might need to reestablish the connection to the library. I'm still waiting for Feitan to call me back and send me a free token but they are taking their time; I couldn't reproduce the behavior using a DoD CAC with the OpenSC PKCS #11 library.
Very interesting. I'm not sure about the PIN caching. I checked through the settings in Putty and Pageant and am not coming up with anything, except for the "Force PIN caching" setting in Pageant that's being referenced in another open ticket here. Is there somewhere else I should be looking?
Yes, I was referring to whether you have the "Force PIN caching" enabled or not.
Gotcha. No, I don't.
Thanks. So, when the error occurs, are you prompted for a PIN before it errors or does it just error without asking for a PIN. The error code would be helpful so please pass it along whenever you get it.
I had a failed state today. The error was cert_pkcs_sign:228:Login error (5)
. I did not get prompted for a PIN, and I don't have "Force PIN caching" enabled.
Alright, I think we are getting closer. Apparently the PKCS code was errantly caching PINs regardless of that "Force PIN Caching" setting. That doesn't quite explain your issue (since the PIN caching ought to have worked), but I made it so it properly honors that setting now.
Thanks, I'll give it a try and get back to you with results.
Got an error with cert_pkcs_sign:228:Login Error (160)
and immediately afterward, cert_pkcs_sign:229:Pin Length(x)
where x was the correct length of my PIN.
@kayavila Alright error 160 is basically the "Incorrect PIN" so either it's getting corrupted or the library is rejecting it for some reason. This version adds some informational popups when a) adding something to the cache, b) retrieving something from the cache, and c) immediately before asking you for the PIN when not found in the cache. It also adds a popup that will show you the PIN it's about to pass into the PKCS library; the PIN shown in cleartext within single quotes. I would be interested in seeing if that value is accurate. Obviously you don't need to send me that screen shot, but please let me know if it's accurate and obviously make sure nobody is shoulder surfing when you authenticate.
Thanks for help debugging this through these multiple iterations; I am not the best remote debugger.
I'm happy to help and appreciate you working on it. The value in the popup for my PIN was correct. Here's the pop-up order -
Prompting User For PIN!
cert_pkcs_sign:226:PIN Data: '...'
(correct value)
cert_pkcs_sign:231:Login Error (5)
cert_pkcs_sign:232:Pin Length (x)
(correct length)
I get the same error and order if I enter an invalid PIN.
When it's working properly - and this may be by design in how you have the pop-ups set up - I don't get the message with the pin length, just the first two messages. Thought I'd mention it just in case it's relevant.
Yeah, the PIN length messages are only set to popup when the login process fails. I'm about 90% sure this is some sort of bug or nuanced behavior in the OpenSC PKCS library. Before we resort to using PKCS spy, give this version a try. I definitely found an odd behavior when using my USAccess token and added some code to workaround it. I also removed the PIN displays since it seems we've established it's not a problem acquiring or passing the PIN.
@kayavila Any luck with that version?
I didn't use it enough last week to give it a good test. Got the usual cert_pkcs_sign:228:Login Error (5)
today though, unfortunately.
Error 5 is the PKCS "General Failure" error message. Can you try using "C:\Program Files\OpenSC Project\PKCS11-Spy\pkcs11-spy.dll" as your PKCS library. It should create a file in %TEMP% called pkcs11-spy.log. This may profile some valuable insight.
Sterilized, the last entry looks like this -
P:15324; T:19560 2024-03-05 09:57:47.734
[in] hSession = 0x36e2a6f0
[in] userType = CKU_USER
[in] pPin[ulPinLen] ...
Returned: 5 CKR_GENERAL_ERROR
Is the rest of the log of use to you? Is there a way I can provide it securely?
It might be. You can email it to Bryan.Berns@unnpp.gov or berns@uwalumni.com.
I may have the same problem here, with the same error message, although slightly different config and use case:
With Pageant loaded, and a CAPI key added, I have setup WSL2 to use Pageant's socket. In WSL2's ssh client, I get the same "incomplete message" message. In plink and putty, I get "Server refused public-key signature despite accepting key!" with no sign of error on the server.
I'm prompted several times with the "select a smart card device" pop-up, showing my Yubikey and sayng "smart card is ready for use" with OK/Cancel options. On pressing OK, I would normally be prompted for a PIN, but I'm not. Instead, the same pop-up repeats one or two times, and then plink/pageant/openssh fails.
The problem is intermittent. It's all good for many ssh sessions, and even unload/reload pageant cycles. It might work for days (with 5-50 ssh sessions a day) and then it stops working. The only remedy is a reboot.
Same issue with 3 different clients. Same issue with several different target hosts. Same issue with different Yubikeys.
I downloaded the builds with extra logging, but I don't seem to get any additional logs from them. Should I see a pop-up with the extra logs? Are the logs supposed to show with pageant, or with the putty/plink client?
Interestingly, if I load a key using PKCS11 (Yubico's libykcs11.dll) then it all works. I've never tried the PKCS11 library before. I can see now that the extra logging pop-ups are in the PKCS handling code, so when I use CAPI, there's no debug logging. So PKCS11 seems to be an OK work-around for me. But I have to say this seems to be really slow to add a key (like 2 or more seconds) and a bit of a pain to have to tell Pageant where the DLL is every time I add a new key. For these reasons, I'd really rather get CAPI working.
@jezzaaa I think these are probably two different but potentially related issues. As for the DLL annoyance, have you tried the option to remember certs/keys? I believe that should retain them in the list.
@jezzaaa I think these are probably two different but potentially related issues.
OK, so should I open a separate issue?
As for the DLL annoyance, have you tried the option to remember certs/keys? I believe that should retain them in the list.
Yes, I have that enabled. I suppose once I can get my hardware keys all working, I won't have to be removing and re-adding keys for troubleshooting/work-arounds and this will no longer be a problem for me.
@jezzaaa No need to open separate issue. It's interestingly though that you have to reboot to fix it. If it were a problem with PuTTY CAC itself, then just closing and reopening PuTTY/Pageant would fix it. It almost points to an issue with the Yubikey minidriver (basically the opposite issue that spawned this thread). This may be some sort of PuTTY CAC handling issue, but it's also possible that it could be an issue with something on the OEM front. When your system gets into this state, do other CAPI related things with the token work properly such as Remote Desktop (if you're using it as a PIV?).
@jezzaaa ... When your system gets into this state, do other CAPI related things with the token work properly such as Remote Desktop (if you're using it as a PIV?).
I only use my Yubikey for ssh. I don't have access to any RDP endpoints that I can test PIV auth against. If you can suggest something relatively simple I can setup for PIV/CAPI testing, I'd be keen to give it a try.
There might be some debug/logging options available in the Yubikey drivers.
Although my guess is that I'm not actually using the Yubikey minidriver. The reasons I think this is the case are: 1) I never installed the minidriver, because it all initially worked without me having to do anything[*] 2) When I follow the Yubico instructions to uninstall the minidriver, the application is not found, nor is the INF file found in the driver store 3) When I follow the Yubico instructions to turn on debugging for the minidriver, the registry folder/key doesn't exist
[*]I installed the YubiKey Manager software, and the Yubico PIV Tool, which may or may not have a minidriver included; I don't know, but I'll try to find out.
So perhaps my fix is to actively install the minidriver. I might give this a try and see if the problem goes away.
@kayavila Can you try this? I basically made it so it'll try three times if the library returns a generic failure. There's a more elaborate version of this I can try but let's try this first.
I'll give it a try and let you know.
@NoMoreFood It failed with the usual behavior.
@kayavila Alright, I changed the loop so it tries a fresh lookup on the card for up to three attempts. There's only one more level of reinitialization we can do after this.
Sorry for the delay in responding. I've been traveling but am testing it out today and will let you know.
On a related note, it looks like I can use SecureCRT with the same OpenSC dll and not have issues, if that helps tell you anything.
Okay, what I'm seeing now is a prompt for the PIN, a popup with "General error returned", another prompt for the PIN, and then Pageant crashes. However, I can restart Pageant, add the PKCS cert, and have an existing session connect back to the agent, so that's a big improvement for me.
@kayavila Can you see if this version behaves any differently?
You bet, will let you know. Thanks!
I had a failed state today. The pop window was titled, "PuTTY PKCS Signing Problem" with contents "Unexpected Error Returned: 00000005". It retried two more times (for a total of three tries) before completely failing out. Interestingly after I reopened Pageant after the failure, it did not crash. Adding and removing the key from Pageant didn't solve the issue, though. Closing Pageant, reopening it, and adding the key again got it back to a working state.
Interestingly, I just rebooted my machine and started Pageant and Putty, but hadn't added my token to Pageant yet. Having Putty directly access it, it went through the same loop of unexpected errors with the 0...5 message and failed three times. Opening a new session solved the issue.
I'm not sure if this is an issue with Pageant/Putty or OpenSC. I'm on Windows 10 and using putty-cac 0.80 with OpenSC 0.24.0.
The idea is to use a hard token with agent-forwarding to SSH from the Windows machine to server A into server B. Most of the time, it works fine, but intermittently, it reaches a state where trying to connect to server B fails with this error message (usually after prompting for the PIN) -
sign_and_send_pubkey: signing failed for RSA "PKCS:...=C:\\Program Files\\OpenSC Project\\OpenSC\\pkcs11\\opensc-pkcs11.dll" from agent: incomplete message
Output from
ssh-add -L
on server A is the same regardless of whether it's working or not. Neither disconnecting from A and reconnecting, nor removing the PKCS cert from Pageant and solves the issue. Pageant has to be closed entirely on the Windows machine and then the cert re-added.I don't believe this is a server-side issue, because once it reaches this failed state, I've tried multiple servers in the server A or server B role, and they all fail continuously until Pageant is restarted.
Are there any logs or any way of enabling debugging that I could use to provide more information?
ssh -vvv
doesn't give any more useful information aside from showing it working until it fails: