owncloud / ocis

ownCloud Infinite Scale Stack
https://doc.owncloud.com/ocis/next/
Apache License 2.0

Error: ldap identifier backend logon connect error: LDAP Result Code 200 "Network Error": tls: failed to verify certificate #8552

Open nikslor opened 8 months ago

nikslor commented 8 months ago

Describe the bug

ocis 5.0rc4 (and earlier versions too) starts but doesn't work and shows the following message in the logs when I try to log in:

{"level":"error","service":"idp","error":"ldap identifier backend logon connect error: LDAP Result Code 200 \"Network Error\": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-02-28T17:31:56+01:00 is after 2024-01-29T20:04:27Z","time":"2024-02-28T17:31:56+01:00","message":"identifier failed to logon with backend"}

I don't know exactly how I got into this situation. After a hardware failure, this instance was down for a few weeks. It probably went down before 2024-01-29 and came back up after that date.

Deleting the following files and restarting ocis fixed the problem:

Expected behavior

As far as I can see, the certs in /var/lib/ocis/idm and /var/lib/ocis/idp are automatically copied / generated, so if they are outdated, ocis should probably do one of the following things:

a) refuse to start and give a proper error message
b) copy/generate new versions of the files (the same way it was done originally)
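For illustration, option (a) could look roughly like the following Go sketch. It is not taken from the ocis code base: the names checkCertPEM and checkCertFile are hypothetical, and the certificate path is passed on the command line rather than assumed. It only shows how a startup check could turn an expired certificate into a clear error instead of a failed login later on.

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
	"os"
	"time"
)

// checkCertPEM parses the first PEM block in pemBytes and returns an error
// if the certificate is not valid at the instant now.
// Hypothetical helper, not part of ocis.
func checkCertPEM(pemBytes []byte, now time.Time) error {
	block, _ := pem.Decode(pemBytes)
	if block == nil || block.Type != "CERTIFICATE" {
		return errors.New("no PEM certificate found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return fmt.Errorf("parse certificate: %w", err)
	}
	if now.Before(cert.NotBefore) {
		return fmt.Errorf("certificate is not valid before %s", cert.NotBefore.Format(time.RFC3339))
	}
	if now.After(cert.NotAfter) {
		return fmt.Errorf("certificate expired on %s", cert.NotAfter.Format(time.RFC3339))
	}
	return nil
}

// checkCertFile reads a PEM file from disk and checks it against the current time.
func checkCertFile(path string) error {
	pemBytes, err := os.ReadFile(path)
	if err != nil {
		return fmt.Errorf("read %s: %w", path, err)
	}
	return checkCertPEM(pemBytes, time.Now())
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: certcheck <path-to-cert.pem>")
		return
	}
	// Refuse to continue with a proper error message, as suggested in option (a).
	if err := checkCertFile(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, "refusing to start:", err)
		os.Exit(1)
	}
	fmt.Println("certificate is currently valid")
}
```

Run it with the path of the certificate to inspect as the only argument; a non-zero exit code means the certificate is missing, unreadable, not yet valid, or expired.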

Actual behavior

ocis starts but doesn't work properly; the administrator has to debug and find the solution on their own.

Setup

systemd based instance, with the following config:

OCIS_BASE_DATA_PATH=/var/lib/ocis
ACCOUNTS_DEMO_USERS_AND_GROUPS=false
PROXY_HTTP_ADDR=0.0.0.0:443
OCIS_URL=https://foo.bar.com
PROXY_TRANSPORT_TLS_KEY=/etc/letsencrypt/live/foo.bar.com/privkey.pem
PROXY_TRANSPORT_TLS_CERT=/etc/letsencrypt/live/foo.bar.com/fullchain.pem
OCIS_INSECURE=false
PROXY_ENABLE_BASIC_AUTH=true

micbar commented 8 months ago

@rhafer @dragonchaser @butonic what is your opinion on that? Do you agree with the expected behavior?

iFrozenPhoenix commented 7 months ago

Today I had the same issue. Simply renaming / removing the ldap.key and ldap.crt forces regeneration and everything is working fine again.

tkintscher commented 7 months ago

Had this issue today on 5.0.1. Followed the steps from the first post, now it works again. (running in Docker, behind Traefik, and using Keycloak as IdP)

dragonchaser commented 6 months ago

@micbar I agree, but I would not automatically update the certs. I'd opt for option (a) (refuse to start).

rhafer commented 6 months ago

Refusing to start with the expired certificate is ok I guess. One could also argue that it is ok to just accept the expired certificate (it is insecure anyway since it is self-signed, and probably issued for the wrong subject).

It also raises the question of what we should do at runtime when the certificate expires. Exit with an error? Or just continue to run and log an error?

Somehow I think the real issue here is that we're enforcing SSL for LDAP even when the server (libregraph/idm) is just listening on the loopback interface, which is the case in the single binary setup. I guess it would be ok to allow unencrypted LDAP in that case.

iFrozenPhoenix commented 6 months ago

@dragonchaser why would you not regenerate the certs if they are expired and self-signed? Not doing so raises the question of why the certificates are generated at all on the first run if the app is not capable of renewing them. Generating them just to make the first run easy doesn't seem fair. I think ocis should either not generate any certificate at all and require the user to provide the certs, ideally from a public CA, or generate the certificates and manage them, i.e. regenerate them. I suspect 99% of the users (admins) don't even know that there are certs in the AIO deployment until they run into this error. I would assume that this error will come up more often in the near future because the first deployments have now been running for a while.
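To make the "generate and manage" option concrete, here is a minimal Go sketch of regenerating a self-signed certificate once the old one has expired. It is an illustration under assumptions, not how ocis does it: generateSelfSigned and isExpired are hypothetical names, the key type (ECDSA P-256), the host name "localhost", and the 365-day lifetime are arbitrary choices, and writing the result to disk is left out.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// generateSelfSigned creates a fresh ECDSA key and a self-signed server
// certificate for host, valid for validDays days from now.
// It returns the certificate and the private key, both PEM encoded.
func generateSelfSigned(host string, validDays int) ([]byte, []byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, fmt.Errorf("generate key: %w", err)
	}
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 127))
	if err != nil {
		return nil, nil, fmt.Errorf("generate serial: %w", err)
	}
	now := time.Now()
	tmpl := x509.Certificate{
		SerialNumber: serial,
		Subject:      pkix.Name{CommonName: host},
		DNSNames:     []string{host},
		NotBefore:    now.Add(-time.Minute), // small allowance for clock skew
		NotAfter:     now.AddDate(0, 0, validDays),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	// Self-signed: the template is used as both subject and issuer.
	der, err := x509.CreateCertificate(rand.Reader, &tmpl, &tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, fmt.Errorf("create certificate: %w", err)
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, fmt.Errorf("marshal key: %w", err)
	}
	certPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM := pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

// isExpired reports whether the NotAfter date of the PEM certificate is in the past.
func isExpired(certPEM []byte) (bool, error) {
	block, _ := pem.Decode(certPEM)
	if block == nil {
		return false, errors.New("no PEM data found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return false, fmt.Errorf("parse certificate: %w", err)
	}
	return time.Now().After(cert.NotAfter), nil
}

func main() {
	certPEM, _, err := generateSelfSigned("localhost", 365)
	if err != nil {
		fmt.Println("generate:", err)
		return
	}
	expired, err := isExpired(certPEM)
	fmt.Println("expired:", expired, "error:", err)
}
```

A service following this approach would call isExpired on the stored certificate at startup and replace both files with the output of generateSelfSigned when it returns true.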

7ritn commented 5 months ago

How can this bug still be present? Claiming OCIS is production ready, but expiring self-signed internal tls certificates require us to search through GitHub issues can not be acceptable.

iFrozenPhoenix commented 5 months ago

Well I guess because it's currently classified as an expected behavior. See comments above. It's also described how to renew it. If you want an automated way I guess owncloud is happy to support you with a subscription...

micbar commented 5 months ago

If you want an automated way I guess owncloud is happy to support you with a subscription...

Nice try... But joking aside, cert management has been a PITA for as long as we have had software.

Self-signed certificates are a bit of "snake oil" anyway.

I agree that the admin experience is not nice. The broader point is that oCIS is not a monolith. If you think about a LAMP stack with a DB, in the "old days" we didn't connect to the DB via TLS, which was not secure. If an attacker got access to the internal network, they could read the unencrypted data stream.

We decided that ocis should be "secure by default" and tried to make the initial setup as easy as possible.

I would agree that we can follow the advice from @rhafer and use no LDAPs if the ldap server is running on the loopback interface.

dragonchaser commented 5 months ago

How can this bug still be present? Claiming OCIS is production ready, but expiring self-signed internal tls certificates require us to search through GitHub issues can not be acceptable.

This is an opensource project, feel free to contribute.

kaivol commented 5 months ago

I think it would make contributing on this issue a lot easier if the team could state the expected behavior.

I would agree that we can follow the advice from @rhafer and use no LDAPs if the ldap server is running on the loopback interface.

Is that the desired solution?

micbar commented 4 months ago

I think it would make contributing on this issue a lot easier if the team could state the expected behavior.

I would agree that we can follow the advice from @rhafer and use no LDAPs if the ldap server is running on the loopback interface.

Is that the desired solution?

Yes. That is the desired solution. Do not use TLS when LDAP is on localhost.
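For anyone who wants to pick this up, the decision could be sketched in Go roughly as follows. This is only an illustration of the idea, not the ocis implementation: isLoopbackURI is a hypothetical helper, the example URIs and ports are made up, and treating the name "localhost" as loopback is an assumption that holds on typical systems but is not guaranteed.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// isLoopbackURI reports whether the host part of an LDAP URI refers to the
// loopback interface (127.0.0.0/8, ::1, or the name "localhost").
// Hypothetical helper, not part of ocis.
func isLoopbackURI(ldapURI string) (bool, error) {
	u, err := url.Parse(ldapURI)
	if err != nil {
		return false, fmt.Errorf("parse LDAP URI: %w", err)
	}
	host := u.Hostname()
	// Assumption: "localhost" resolves to a loopback address.
	if host == "localhost" {
		return true, nil
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback(), nil
}

func main() {
	// Example URIs for illustration only.
	uris := []string{
		"ldaps://localhost:9235",
		"ldap://127.0.0.1:389",
		"ldap://[::1]:389",
		"ldaps://ldap.example.com:636",
	}
	for _, uri := range uris {
		local, err := isLoopbackURI(uri)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// TLS would only be enforced when the server is not on loopback.
		fmt.Printf("%s: loopback=%t, TLS required=%t\n", uri, local, !local)
	}
}
```

Host names other than "localhost" are deliberately not resolved here, so a name that merely points at 127.0.0.1 would still be treated as remote and keep TLS enabled.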