Closed allan2 closed 1 year ago
Server log:
ocis[86700]: {"level":"error","service":"idp","error":"ldap identifier backend logon connect error: LDAP Result Code 200 \"Network Error\": dial tcp [::1]:9235: connect: connection refused","time":"2023-08-13T12:34:56.123456789Z","message":"identifier failed to logon with backend"}
Hm, connection refuse usually means that nothing is listening on that port or that there is some reject rule in place on the firewall.
I am not familiar with ufw
, but might there be other rules rejecting traffic to port 9235?
Can you check whether the ocis
process is actually listenting on port 9235 using (ss
or netstat
) ? To rule out that the idm
service not starting properly. Startup logs with OCIS_LOG_LEVEL
set to debug
might also help here.
netstat -tuln | grep 9235
tcp 0 0 127.0.0.1:9235 0.0.0.0:* LISTEN
I didn't see anything further related to this issue when the log level is set to debug
.
ldapsearch -x -H ldaps://127.0.0.1:9235 -x \ -D uid=libregraph,ou=sysusers,o=libregraph-idm \ -W -b o=libregraph-idm objectclass=inetorgperson
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
LDAPTLS_REQCERT=never \ ldapsearch -x -H ldaps://127.0.0.1:9235 -x \ -D uid=libregraph,ou=sysusers,o=libregraph-idm \ -W -b o=libregraph-idm objectclass=inetorgperson
After inputting idm_password
from /etc/ocis/ocis.yaml:
ldap_bind: Invalid credentials (49)
Is it possible that the included IDM service isn't starting properly?
LDAPTLS_REQCERT=never \ ldapsearch -x -H ldaps://127.0.0.1:9235 -x \ -D uid=libregraph,ou=sysusers,o=libregraph-idm \ -W -b o=libregraph-idm objectclass=inetorgperson
After inputting
idm_password
from /etc/ocis/ocis.yaml:ldap_bind: Invalid credentials (49)
That means that the IDM service is at least running. ldap_bind: Invalid credentials (49)
means that the authentication failed. Either the password is wrong or something went wrong when initializing the IDM database.
As this is still a fresh install you could try to re-initialize the IDM database. You can do this by stopping ocis, removing the idm/ocis.boltdb
file from the ocis data directory (/var/lb/ocis/
or ~/.ocis
) and than starting ocis again.
If it still does not work after that. Please attach the full log with OCIS_LOG_LEVEL=debug
here.
Log attached.
This is what I did:
Clean:
systemctl stop ocis
sudo rm -rf /var/lib/ocis/*
sudo rm /etc/ocis/ocis.yaml
Initialize (answer yes
for config with certificate checking disabled):
sudo -u ocis ocis init --config-path /etc/ocis
Write log. This includes a single login request through the web portal.
start_time=$(date --date="-5 seconds" '+%Y-%m-%d %H:%M:%S') && systemctl start ocis && sleep 60 && sudo journalctl -u ocis --since="$start_time" > /home/al/ocis_log.txt
Aug 16 04:51:36 myhostname ocis[113236]: {"level":"info","service":"idm","time":"2023-08-16T04:51:36.734675714Z","line":"github.com/owncloud/ocis/v2/ocis-pkg/crypto/gencert.go:90","message":"written certificate to /var/lib/ocis/idm/ldap.crt"}
Aug 16 04:51:36 myhostname ocis[113236]: {"level":"info","service":"idm","time":"2023-08-16T04:51:36.735468718Z","line":"github.com/owncloud/ocis/v2/ocis-pkg/crypto/gencert.go:127","message":"written key to /var/lib/ocis/idm/ldap.key"}
Aug 16 04:51:36 myhostname ocis[113236]: {"level":"debug","service":"idm","time":"2023-08-16T04:51:36.736016473Z","line":"github.com/owncloud/ocis/v2/services/idm/pkg/command/server.go:73","message":"Bootstrapping IDM database"}
Aug 16 04:51:36 myhostname ocis[113236]: {"level":"debug","service":"idm","db":"/var/lib/ocis/idm/ocis.boltdb","time":"2023-08-16T04:51:36.736708809Z","line":"github.com/owncloud/ocis/v2/ocis-pkg/log/logrus_wrapper.go:50","message":"Open boltdb"}
Aug 16 04:51:36 myhostname ocis[113236]: {"level":"debug","service":"idm","db":"/var/lib/ocis/idm/ocis.boltdb","time":"2023-08-16T04:51:36.740057565Z","line":"github.com/owncloud/ocis/v2/ocis-pkg/log/logrus_wrapper.go:50","message":"Adding default buckets"}
Aug 16 04:51:36 myhostname ocis[113236]: {"level":"debug","service":"nats","time":"2023-08-16T04:51:36.882131416Z","line":"github.com/owncloud/ocis/v2/services/nats/pkg/logging/nats.go:45","message":"127.0.0.1:58868 - cid:14 - Client connection created"}
Aug 16 04:51:36 myhostname ocis[113236]: {"level":"debug","service":"nats","time":"2023-08-16T04:51:36.919894265Z","line":"github.com/owncloud/ocis/v2/services/nats/pkg/logging/nats.go:45","message":"127.0.0.1:58878 - cid:15 - Client connection created"}
Aug 16 04:51:37 myhostname systemd[1]: ocis.service: A process of this unit has been killed by the OOM killer.
Aug 16 04:51:37 myhostname systemd[1]: ocis.service: Main process exited, code=killed, status=9/KILL
Aug 16 04:51:37 myhostname systemd[1]: ocis.service: Failed with result 'oom-kill'.
Aug 16 04:51:37 myhostname systemd[1]: ocis.service: Consumed 1.175s CPU time.
Aug 16 04:51:37 myhostname systemd[1]: ocis.service: Scheduled restart job, restart counter is at 1.
Aug 16 04:51:37 myhostname systemd[1]: Stopped OCIS server.
Hm, ocis
was killed by the OOM killer upon first startup of the idm
service. This might have caused followup issues. (I.e. the initialization didn't complete).
I am not 100% sure that his really is causing your problem, but we should get that out of the way first. How much RAM does that system have? It might make sense to play with the GOGC or GOMEMLIMIT setting to limit the golang runtime's memory usage.
I had 512 MB RAM previously. I bumped to 4 GB and did a new fresh install. The log is attached.
The OOM killer message is now gone. However, the error still persists.
Lines 534–535:
Aug 17 06:33:04 myhostname ocis[2099]: {"level":"debug","service":"proxy","policy":"ocis","method":"POST","prefix":"/signin/","path":"/signin/v1/identifier/_/logon","routeType":"prefix","time":"2023-08-17T06:33:04.206185735Z","line":"github.com/owncloud/ocis/v2/services/proxy/pkg/router/router.go:222","message":"director found"}
Aug 17 06:33:04 myhostname ocis[2099]: {"level":"error","service":"idp","error":"ldap identifier backend logon connect error: LDAP Result Code 200 \"Network Error\": dial tcp [::1]:9235: connect: connection refused","time":"2023-08-17T06:33:04.206740666Z","line":"github.com/owncloud/ocis/v2/ocis-pkg/log/logrus_wrapper.go:50","message":"identifier failed to logon with backend"}
Aug 17 06:33:04 myhostname ocis[2099]: {"level":"debug","service":"idp","request-id":"","proto":"HTTP/1.1","method":"POST","status":500,"path":"/signin/v1/identifier/_/logon","duration":0.359714,"bytes":44,"time":"2023-08-17T06:33:04.206800416Z","line":"github.com/owncloud/ocis/v2/ocis-pkg/middleware/logger.go:27"}
Aug 17 06:33:04 myhostname ocis[2099]: {"level":"error","service":"idp","error":"ldap identifier backend logon connect error: LDAP Result Code 200 \"Network Error\": dial tcp [::1]:9235: connect: connection refused","time":"2023-08-17T06:33:04.206740666Z","line":"github.com/owncloud/ocis/v2/ocis-pkg/log/logrus_wrapper.go:50","message":"identifier failed to logon with backend"}
Argh, now I understand what's going wrong here. The idp
service is trying to contact idm
on the IPv6 address, but idm
is only listening on the IPv4 address. :astonished:
As a workaround, could you please try to set OCIS_LDAP_URI=ldaps://127.0.0.1:9235
to force it to use the IPv4 address for now.
It would still be interesting to know why your system behaves that way (while other's don't). Could you share a bit more infor about the system? What distribution are you running?
Thanks for the help! Login and sync is working now. I had to set two additional env vars.
Added to ocis.env:
# necessary for login
OCIS_LDAP_URI=ldaps://127.0.0.1:9235
# necessary for login (error initializing metadata client)
STORAGE_SYSTEM_DATA_SERVER_URL=http://127.0.0.1:9216/data
# necessary for sync (error doing GET request to data service, error doing PATCH request to data service)
STORAGE_USERS_DATA_SERVER_URL=http://127.0.0.1:9158/data
This instance is running Ubuntu 22.04.2 LTS. /etc/gai.conf is unmodified (fully commented, default values give IPv6 higher precedence than IPv4).
There could be other services still erroring because they are trying to connect to [::1]:PORT
. Any idea why this is happening?
Thanks for the help! Login and sync is working now. I had to set two additional env vars.
Glad that we found the root cause.
Added to ocis.env:
# necessary for login OCIS_LDAP_URI=ldaps://127.0.0.1:9235 # necessary for login (error initializing metadata client) STORAGE_SYSTEM_DATA_SERVER_URL=http://127.0.0.1:9216/data # necessary for sync (error doing GET request to data service, error doing PATCH request to data service) STORAGE_USERS_DATA_SERVER_URL=http://127.0.0.1:9158/data
This instance is running Ubuntu 22.04.2 LTS. /etc/gai.conf is unmodified (fully commented, default values give IPv6 higher precedence than IPv4).
There could be other services still erroring because they are trying to connect to
[::1]:PORT
. Any idea why this is happening?
No, not yet. I was hoping you had some local modifications to you gai.conf :smile: . But I guess it would already help if we'd teach ocis to list on all (v4/v6) loopback interfaces. Still need to dig a bit deeper into this.
@allan2 Could you please attach the /etc/host
file of your system? AFAICS Ubuntu does not map the localhost
name to the ::1
IPv6 address by default.
/etc/hosts
#127.0.0.1 localhost
::1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
so my system was mapping IPv6 loopack to localhost
but not IPv4.
I apologize for the trouble; I must have changed it when troubleshooting a separate issue a while back and forgotten about it.
After toggling the comments:
127.0.0.1 localhost
#::1 localhost
oCIS now working as expected without explicitly setting OCIS_LDAP_URI
and the other vars. Thanks again!
so my system was mapping IPv6 loopack to
localhost
but not IPv4.
Ok, that explains a lot. I was already questioning my sanity :laughing:
Meanwhile, after reading through some golang documentation and issues, I believe what we're doing (explicitly listening on 127.0.0.1 instead of localhost) is mostly. Golang's net.Listen()
will anyway only listen on the first IPv4 address returned after resolving the hostname.
What we don't support yet is configuring a list of IP addresses (IPv4 and IPv6) to listen on. (i.e. for a dual-stack setup). Though I am not sure if that is really needed. The proxy
service, which is the only one that actually needs to be acccessible from external already listens on 0.0.0.0 (INADDR_ANY) so that should already work in a dual-stack scenario IIUC.
Describe the bug
I have a fresh bare metal deployment on 3.1.0-rc.1 arm64. I'm unable to login using the default admin account.
Web portal response:
Server log:
Steps to reproduce
My setup is identical to the bare metal deployment (systemd + Nginx without the SSL hardening definitions). During init, I configured oCIS with certificate checking disabled (answered
yes
)./etc/ocis/ocis.env
More info
Ports should be open.
ufw status
I'm using the admin password generated during
ocis init
. I've verified that it's the same asadmin_password
underidm
in /etc/ocis/ocis.yaml.\\
Any help would be greatly appreciated!