kchristensen / udm-le

Let's Encrypt support for Ubiquiti UniFi OS
MIT License
572 stars 79 forks source link

Wifiman does not work on UDMP if signer/intermediate cert is imported #30

Closed lcollins closed 1 year ago

lcollins commented 3 years ago

Incase this is asked by someone else, I have noted that when using udm-le to import create/import certs you may face issues with Wifiman in Unifi Network 6.1.x

I upgraded to Unifi Network 6.1.69 (beta) which gave me some valuable debug info when I toggle on/off Wifiman integration

server.log:[2021-03-15T00:51:08,542] <inform-13> ERROR system - Unable to read certificate from the unifi chain. There are 2 certificates, but exactly 1 is expected

server.log:[2021-03-15T00:51:09,241] <inform-13> ERROR dev - WiFiman enabled but could not find certificate, skipping config

Looking at the keystores (default vs LE keystore after ace.jar import) I observe the LE keystore includes the intermediate in the hierarchy. I know ace.jar wont allow importing without the signer cert.

default_unifi_keystore

letsencrypt_withsigner

working_letsencrypt

After I created a new JKS keystore with only the server certificate + key (PKCS12) I bounced Unifi Network and toggled on Wifiman and confirmed success.

I have reached out to Unifi Support, whom initially advised they do not support custom SSL certs and do not support CLI based changes on UDMP. I have advised them on my findings since this initial interaction and awaiting their response.

Blog404DE commented 3 years ago

Hi,

did you tested the Captive Portal with https activated? Because I'm not sure if the certificate without the chain is valid for the (most) browser.

lcollins commented 3 years ago

Hi,

did you tested the Captive Portal with https activated? Because I'm not sure if the certificate without the chain is valid for the (most) browser.

Works for me on iOS and macOS (Captive portal uses Safari in both cases). I have also used the UDM Dashboards in Chrome without any Certificate warnings.

But I can confirm Wifiman will not work at all if you have more than one certificate in the chain. UI Support have told me it's not a bug as they do not support using custom certificates on UDM (I would imagine they feel the same with the UXGs also)

As UI does not support this we need to support these use cases within the community. I have a custom version of these scripts locally which I am refining to support the Wifiman use case.

SamErde commented 3 years ago

That's a good find, and an....interesting excuse from UI. There's no good reason to only be able to handle one certificate in the chain.

kwschnei commented 3 years ago

Hi, did you tested the Captive Portal with https activated? Because I'm not sure if the certificate without the chain is valid for the (most) browser.

Works for me on iOS and macOS (Captive portal uses Safari in both cases). I have also used the UDM Dashboards in Chrome without any Certificate warnings.

But I can confirm Wifiman will not work at all if you have more than one certificate in the chain. UI Support have told me it's not a bug as they do not support using custom certificates on UDM (I would imagine they feel the same with the UXGs also)

As UI does not support this we need to support these use cases within the community. I have a custom version of these scripts locally which I am refining to support the Wifiman use case.

I'm very interested in how you modified the scripts to support Wifiman. Are you willing to share?

rloomans commented 3 years ago

@lcollins Are you able to share your modified versions of the scripts?

kchristensen commented 3 years ago

So I don't personally use Wifiman so forgive me if this is a dumb question -- if just importing the server certificate and not the immediate certificate fixes this, wouldn't this just be a matter of someone whipping up some sed to print the last certificate in the bundle to a file for use wherever Wifiman wants it?

kwschnei commented 3 years ago

I did try that at one point based upon some hints I found in UI's forum but didn't get anywhere. I'll dig into my error logs again and see if I can find the references I used before... maybe they'll make more sense to you. Certificates aren't really an area of expertise for me.

kwschnei commented 3 years ago

I found this post on UI's community, which includes comments from @lcollins - basically, it's saying the same thing - remove the intermediate certificate and create a new keystore (removing the intermediate certificate should be easy, but I have no idea how to create a new keystore on the UDMP).

If you've got suggestions, I'm more than happy to try things and test them.

kwschnei commented 3 years ago

It looks like the Unifi Easy Encrypt guy has addressed this in v 1.8.9 of his script. Post here and script here.

Unfortunately, I can't locate the prior version to run a diff to see how he did it. I'll see if I can get any details.

kwschnei commented 3 years ago

Well, I'm afraid the answer I got wasn't helpful:

"You can use my script instead, they are using different methods."

grahamhayes commented 2 years ago

I think adding --no-bundle to the lego command will give a single cert as the output from looking at the lego codebase. I will test and do a PR if it works.

lucacri commented 2 years ago

@grahamhayes did you end up digging deeper? I would love to use WifiMan but the certificate might be the reason why it's not working

grahamhayes commented 2 years ago

I did - Unfortunately https://github.com/go-acme/lego/issues/963 seems to still be affecting lego :/ I haven't had a chance to see if I can fix it or not yet

mtalexan commented 2 years ago

I notice udm-le is importing the new certificate as an entire second chain into the Java keystore, are you sure the WifiMan issue because there is more than just the leaf certificate in the chain and not the fact that there are now two distinct chains? Everywhere else I've seen, and how I've been doing it for years, imports the new chain as a replacement for the existing built in chain in the Java keystore rather than adding an additional chain. That also seems like a much more likely cause of a complaint from WifiMan about there being too many certificates, assuming they wrote it to do something like getChains().useChain() and the error is the thrown exception from useChain() about receiving a list of chains rather than a single one.

martintoreilly commented 2 years ago

@mtalexan I am seeing this issue with a single chain in the Java keystore.

From within a unifi-os shell:

root@ubnt:/# keytool -list  -keystore /data/unifi/data/keystore -v
Enter keystore password:  
Keystore type: jks
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: unifi
Creation date: Jan 11, 2022
Entry type: PrivateKeyEntry
Certificate chain length: 3
Certificate[1]:
...

SSH'd into the UDM directly as root outside the unifi-os shell:

# grep /data/unifi/logs/server.log -e "WiFiman" -B 1
...
[2022-01-11T03:55:06,884] <inform-4> ERROR system - Unable to read certificate from the unifi chain. There are 3 certificates, but exactly 1 is expected
[2022-01-11T03:55:06,893] <inform-4> ERROR dev    - WiFiman enabled but could not find certificate, skipping config
martintoreilly commented 2 years ago

@kwschnei I've reviewed the Unifi Easy Encrypt script (accessed on 11 Jan 2022 at version 2.0.0) and I can't see what he's doing that solves the multiple certificate issue.

I've manually recreated the keystore certificate chain replacement steps in the unifi_network_application() function (PEM -> PKCS12 conversion using openssl pkcs12 -export ... + keystore chain replacement using keytool -delete ... and keytool -importkeystore), and I get the full chain imported to the keystore and the same WiFiman "too many certs" error. I can't see anywhere where prior to this step where the full chain received from Let's Encrypt is being truncated to just the server cert.

It's possible I'm missing something, so when I get time I'll spin up a docker with the Unifi controller and confirm whether the full chain or just the server certificate is imported to the keystore when I run the full script.

martintoreilly commented 2 years ago

The following set of commands worked for me to load just the server certificate into the keystore (no chain).

  1. Export only the server certificate from the full chain bundle: openssl x509 -in /data/unifi-core/config/unifi-core.crt > /data/unifi-core/config/unifi-core-server-only.crt
  2. Bundle the private key and server-only cert into a PKCS12 format file: openssl pkcs12 -export -inkey /data/unifi-core/config/unifi-core.key -in /data/unifi-core/config/unifi-core-server-only.crt -out ~/unifi-core-key-plus-server-only-cert.p12 -name unifi -password pass:aircontrolenterprise
  3. Backup the keystore before messing with it: cp /usr/lib/unifi/data/keystore ~/keystore.backup
  4. Delete the existing full chain from the keystore: keytool -delete -alias unifi -keystore /usr/lib/unifi/data/keystore -deststorepass aircontrolenterprise
  5. Import the server-only cert + private key from the PKCS12 file: keytool -importkeystore -deststorepass aircontrolenterprise -destkeypass aircontrolenterprise -destkeystore /usr/lib/unifi/data/keystore -srckeystore ~/unifi-core-key-plus-server-only-cert.p12 -srcstoretype PKCS12 -srcstorepass aircontrolenterprise -alias unifi -noprompt
  6. Restart unifi service: service unifi restart

You can verify there is only a single entry in the keystore containing only the server certificate using keytool -list -keystore /data/unifi/data/keystore -v.

root@ubnt:/# keytool -list  -keystore /data/unifi/data/keystore -v
Enter keystore password:  
Keystore type: jks
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: unifi
Creation date: Jan 11, 2022
Entry type: PrivateKeyEntry
Certificate chain length: 1
Certificate[1]:
...
martintoreilly commented 2 years ago

After replacing the full cert chain in the keystore with just the server cert, I can confirm the WiFiman speed test started working again.

However, I was unable to connect to a network running a hotspot via either my Android mobile (Android 11 - OnePlus Oxygen OS 11.0.3.1 GMS57BA) or my macOS laptop (High Sierra - 10.13.6). On both devices I got stuck in a loop connecting to the Wifi network, with no captive portal / OS intercept dialog appearing (on my Android phone the connection status gets stuck on "Obtaining IP address").

Note that I only set up the hotspot a few weeks ago for testing, and reverting to using the full chain in the keystore and restarting the unifi service (service unifi restart from within the unifi-os shell) has not resulted in the hotspot captive portal working again, so it may be that I've broken some other aspect of the hotspot configuration. I'll investigate further when I have time.

rloomans commented 2 years ago

I too can confirm that @martintoreilly's commands make WiFiman work for me.

I tested this by replacing this line in udm-le.sh:

podman exec -it unifi-os ${CERT_IMPORT_CMD} ${UNIFIOS_CERT_PATH}/unifi-core.key ${UNIFIOS_CERT_PATH}/unifi-core.crt

with

podman exec -it unifi-os openssl x509 -in ${UNIFIOS_CERT_PATH}/unifi-core.crt -out ${UNIFIOS_CERT_PATH}/unifi-core-server-only.crt
podman exec -it unifi-os openssl pkcs12 -export -inkey ${UNIFIOS_CERT_PATH}/unifi-core.key -in ${UNIFIOS_CERT_PATH}/unifi-core-server-only.crt -out ${UNIFIOS_CERT_PATH}/unifi-core-key-plus-server-only-cert.p12 -name unifi -password pass:aircontrolenterprise
podman exec -it unifi-os cp /usr/lib/unifi/data/keystore /usr/lib/unifi/data/keystore.backup
podman exec -it unifi-os keytool -delete -alias unifi -keystore /usr/lib/unifi/data/keystore -deststorepass aircontrolenterprise
podman exec -it unifi-os keytool -importkeystore -deststorepass aircontrolenterprise -destkeypass aircontrolenterprise -destkeystore /usr/lib/unifi/data/keystore -srckeystore ${UNIFIOS_CERT_PATH}/unifi-core-key-plus-server-only-cert.p12 -srcstoretype PKCS12 -srcstorepass aircontrolenterprise -alias unifi -noprompt
martintoreilly commented 2 years ago

I too can confirm that @martintoreilly's commands make WiFiman work for me.

@rloomans Does the guest wifi hotspot with portal work for you after the change?

rloomans commented 2 years ago

Sorry, I don’t use the guest portal, so I don’t know.

-- Robert Loomans

On 11 Jan 2022, at 17:27, martintoreilly @.***> wrote:

 I too can confirm that @martintoreilly's commands make WiFiman work for me.

@rloomans Does the guest wifi hotspot work for you after the change?

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.

kwschnei commented 2 years ago

I can confirm this has solved this issue for me as well.

@kchristensen - do you want to make this change, or do you want me to attempt my first pull request?

martintoreilly commented 2 years ago

@kwschnei Are you able to confirm that you can successfully connect to a wifi network running a hotspot portal? This isn't working for me but I can't say whether it's due to the changing the certificate from a bundle to a server-only cert because I haven't got it working again after swapping back to a cert bundle. If everything works with the server-only certificate then I think we should make this the default behaviour. However, if the hotspot stops working then it might be better to add an option to select between wifiman or hotspot support.

martintoreilly commented 2 years ago

@kwschnei @kchristensen what do you think of the following suggestion? This preserves the current behaviour as default but allows individual users to decide whether to use the full chain bundle or a server-only cert for the hotspot captive portal / wifiman? If we then see that the server-only cert causes no issues with the hotspot captive portal across a wide enough range of devices, we could later decide to set the server-only certificate option as default behaviour.

  1. Add aNO_BUNDLE boolean flag in udm-le.env. This would mirror the currently inoperative --no-bundle flag for the lego command. I'm not wedded to the name and we might want to consider adding a suffix indicating that this only applies to the hotspot + wifiman functionality (is this the API endpoint for both of these or do they use separate endpoints)?
  2. Replace the call to the ace.jar import_key_cert program in udm-le.sh with a function that uses openssl and keytool to import either the full chain (default) or the server-only certificate (if NO_BUNDLE is set in udm-le.env).

_[Edit: I think it would also be worth adding a --force-deploy option to udm-le.sh to allow people to manually trigger a redeployment of their existing certificate after making a change to the NO_BUNDLE flag. The deployment code currently only runs if the certificate has been renewed.]_

kchristensen commented 2 years ago

@martintoreilly this seems like a reasonable approach I think, as long as the documentation is clear that:

  1. Exactly what setting NO_BUNDLE is supposed to do (I think captive portal/wifiman aren't super commonly used, so I don't want to confuse people unnecessarily)
  2. Setting NO_BUNDLE at least for now is experimental so YMMV

If you want to PR something up, I can merge it into main, and we'll hold off on a "release" for some period of time until we get some more feedback about how it's working?

Re: --force-deploy I've always just runudm-le.sh initial when I need to force a new cert or deploy since that will forcibly re-issue a new certificate and deploy it, is there something else your idea would do that isn't covered by that?

martintoreilly commented 2 years ago

If you want to PR something up, I can merge it into main, and we'll hold off on a "release" for some period of time until we get some more feedback about how it's working?

Sounds good. Happy to make a PR.

Re: --force-deploy I've always just runudm-le.sh initial when I need to force a new cert or deploy since that will forcibly re-issue a new certificate and deploy it, is there something else your idea would do that isn't covered by that?

I worry that forcing a re-issue runs the risk of people experimenting with / troubleshooting the server-only certificate hitting the 5 duplicate certificates per week limit (see extract below from the Let's Encrypt rate limit page).

Renewals are treated specially: they don’t count against your Certificates per Registered Domain limit, but they are subject to a Duplicate Certificate limit of 5 per week. Exceeding the Duplicate Certificate limit is reported with the error message too many certificates already issued for exact set of domains.

As we're talking about an initial experimental deployment we could go down another route. I could make the openssl / keytool based server-only cert import code a separate script (e.g. import_server_cert_only.sh), and those who want to try it out could just set the value of CERT_IMPORT_CMD in udm-le.env to ./import_server_cert_only.sh?

_[Edit: If we went with the option of a separate cert import script, we could add a flag to toggle server-only or full chain bundle. That way, those actively experimenting could manually run just the cert import script to toggle between these options, with amending CERT_IMPORT_CMD used to set the default option for automatic renewals to server-cert only]_

kchristensen commented 2 years ago

I think if the default behavior were to stay the same (deploying the bundle), we could get away with leaving this new behavior in the main script behind a boolean, and maybe just add a deploy step to the main case statement at the bottom to handle just running the deployment of the certificate, to avoid the certificate limit.

kwschnei commented 2 years ago

@martintoreilly I haven't had a guest portal setup previously, but I set one up quickly and tested this out - it is working for me using the password and voucher authentication types

IMG_0439 .

martintoreilly commented 2 years ago

@martintoreilly I haven't had a guest portal setup previously, but I set one up quickly and tested this out - it is working for me using the password and voucher authentication types

Brilliant. Thanks for testing this out 👍

martintoreilly commented 2 years ago

I think if the default behavior were to stay the same (deploying the bundle), we could get away with leaving this new behavior in the main script behind a boolean, and maybe just add a deploy step to the main case statement at the bottom to handle just running the deployment of the certificate, to avoid the certificate limit.

@kchristensen I've made the changes in PR #53.

martintoreilly commented 2 years ago

With the merge of PR #53 into main there is now a NO_BUNDLE setting in udm-le.env. Setting NO_BUNDLE='yes' when ENABLE_CAPTIVE='yes' will import only the server certificate to the keystore used by the hotspot Captive Portal and WiFiman, enabling WiFiman to work again.

After changing the value of the NO_BUNDLE setting, you can update the certificate in the Captive Portal / WiFiman keystore to match by running udm-le.sh update_keystore --restart-services.

Note that, although several people have reported that the hotspot Captive Portal works fine when the keystore only contains the server certificate, some clients might potentially find they either cannot connect to the Captive Portal or get an untrusted certificate warning if they do not already have a cached copy of the Let's Encrypt CA intermediate certificate(s) and are unable to download them when attempting to connect to the Captive Portal.

kchristensen commented 2 years ago

FYI this auto closed when I merged that PR, I'm happy to re-open for discussion if it is warranted!

martintoreilly commented 2 years ago

FYI this auto closed when I merged that PR, I'm happy to re-open for discussion if it is warranted!

Sorry, that was me. I'm so used to adding the "fixes X" to work PRs that I didn't think about the fact it would auto close the issue.

martintoreilly commented 2 years ago

Could those of you that try out the new code post here to confirm it works for you (including whether the hotspot Captive Portal works if you use it)? If we get a few people confirming it works and no-one having issues with the Captive Portal then I think we're ok to close this issue. Thanks in advance.

To update your UDM, you'll need to copy new versions of udm-le.sh, udm-le.env (updating with your LE email, server FQDN and DNS provider details) to /mnt/data/udm-le/ plus copying the new onboot.d/99-udm-le.sh to both /mnt/data/udm-le/onboot.d/ and /mnt/data/onboot.d/.

llajas commented 2 years ago

Thanks, gentlemen. I’ve been following along with this issue/request for several months now. Unfortunately I’m not as well versed as some of the people in this thread to make any sort of fix/recommendation.

While I haven’t tried the latest merge, I did swap out the podman commands at the end of ‘udm-le.sh’ as mentioned above by @rloomans which worked like a charm - I’ve got a wildcard cert provisioned to the hotspot guest portal as well as the controller portal and it’s working without issue, as well as WiFiMan. I’ve only tested iOS devices, but no issue that I’ve observed so far; perhaps it’s already got the root/sub CA’s items already, in my case.

Cheers to you all for the work here! I think the only thing I’ve yet to figure out is how to get this to work with the onboard unifi RADIUS server, but that’s beyond the scope of this thread.

Happy to provide any further information where I can, if it helps.

martintoreilly commented 2 years ago

@llajas Thanks for letting us know.

Regarding changing the radius cert, I've not used it myself, but I believe the radius server certificate + key will be updated to the Let's Encrypt issued ones by udm-le.sh if you set ENABLE_RADIUS='yes' in udm-le.env (though from the discussion in issue #44, it may require a reboot for the certs to be copied over and they may not survive an update).

[Edit: Maybe worth moving the conversation about RADIUS server certs to issue #44?]

erikankrom commented 2 years ago

How do I revert to default certificate configuration?

timrettop commented 2 years ago

@erikankrom looks like if you move (or remove) these two files (which were overwritten by the certs generated in the script) and reboot, the device generates new ones. mv /mnt/data/udapi-config/raddb/certs/server* /mnt/data/

erikankrom commented 2 years ago

@erikankrom looks like if you move (or remove) these two files (which were overwritten by the certs generated in the script) and reboot, the device generates new ones. mv /mnt/data/udapi-config/raddb/certs/server* /mnt/data/

@timrettop Is that just for radius, or also for wifiman and captive portal? Perhaps an enhancement to add a --uninstall argument to the script could easily remove the on_boot, and remove all the generated certificates prior to a reboot?

bigjohns97 commented 1 year ago

Just a heads up but this also fixed an issue with the latest 7.3 EA release, here is the error I was getting in server.log

[2022-09-23T23:09:47,429] <ble-load-keystore> ERROR system - Unable to read certificate from the unifi chain. There are 3 certificates, but exactly 1 is expected [2022-09-23T23:09:47,430] <ble-load-keystore> ERROR blebridge - unable to load local keystore for BLE bridge java.security.KeyStoreException: no local certificate found at com.ubnt.service.blebridge.G.Ooo000(Unknown Source) ~[ace.jar:?] at java.util.Optional.orElseThrow(Optional.java:408) ~[?:?] at com.ubnt.service.blebridge.G.publicsupersuper(Unknown Source) ~[ace.jar:?] at com.ubnt.ace.C$4.run(Unknown Source) ~[ace.jar:?] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?] at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] at java.lang.Thread.run(Thread.java:829) ~[?:?]

Pretty crazy cause you would have never known it was unhappy if you didn't check the log, but it was spitting this out every 10 seconds.

kchristensen commented 1 year ago

Hi all, we have had some generous users work on getting 2.x support finalized, please check out this PR and see how things are working for your use case: https://github.com/kchristensen/udm-le/pull/70

kchristensen commented 1 year ago

The latest release is now out and I have reports that WiFiman is working as intended.