tiredofit / docker-openldap-fusiondirectory

Dockerized OpenLDAP server with FusionDirectory Schema Support

Fatal error when upgrading from 6.7.0 to 7.0.1 #20

Open Manuki-San opened 3 years ago

Manuki-San commented 3 years ago

Hello,

The docker-compose.yml file that I use is pretty simple and is available here: docker-compose.yml

The interesting part is the URI used: ldap://OpenLDAPServer:389. The string "OpenLDAPServer" only appears in the environment variable LDAP1_HOST of the FusionDirectory service. My understanding is that this string refers to the OpenLDAPServer service, which itself defines a HOSTNAME. This works fine with 6.7.0. Now, with 7.0.1, it looks like the string defined in LDAP1_HOST is used directly as the LDAP URI and therefore generates ldap://OpenLDAPServer:389 instead of using the HOSTNAME.

Could you please clarify the situation ?

tiredofit commented 3 years ago

I'm working with someone privately on a similar issue - I will report back with what I find out, you have given me great detail to work with.

tiredofit commented 3 years ago

I just tried a bunch of scenarios and unfortunately couldn't recreate it. I also noted, I didn't publicly release a 7.0.1 so if you have it you must have caught it before I deleted it from Docker Hub. The most current latest is based on 7.0.3 just with newer dependencies.

You can perhaps try fiddling around with /etc/fusiondirectory.conf in the web frontend to see what's happening with your BIND_DN?

What you are seeing with the URI is correct, that seems to be normal activity with building the configuration.

Manuki-San commented 3 years ago

I have the same issue with 7.0.3. For the moment, I can continue using 6.7.0. What confuses me is that changing the tag of the OpenLDAP server in docker-compose.yml disrupts FusionDirectory's ability to locate the server. I will continue investigating on my side as well.

Manuki-San commented 3 years ago

I have realized that the LDAP server was not even starting with 7.0.3. The command ldapsearch -x -h 192.168.1.33:389 -b ... ends with ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1). Of course, this command works perfectly with 6.7.0.

I have noticed that, in the documentation, the data volume for TLS certificates seems to have changed from /assets/slapd/certs to /certs/.

I have set the environment variable PLUGIN_SSH=FALSE and changed the data volume to /certs/, and now I have the following errors in the log:

OpenLDAP_Server    | chown: cannot access '/assets/slapd/certs/cert.pem': No such file or directory
OpenLDAP_Server    | chown: cannot access '/assets/slapd/certs/key.pem': No such file or directory
OpenLDAP_Server    | chown: cannot access '/assets/slapd/certs/ca.pem': No such file or directory
OpenLDAP_Server    | chown: cannot access '/assets/slapd/certs/dhparam.pem': No such file or directory
OpenLDAP_Server    | chmod: cannot access '/assets/slapd/certs/dhparam.pem': No such file or directory

and the OpenLDAP server does not start.

tiredofit commented 3 years ago

OK. To be honest, you could have stayed with the old folders by changing the TLS_CRT_PATH, TLS_KEY_PATH, and TLS_CA_CRT_PATH variables. Moving from 6 to 7 was ugly, I will give you that, and I apologize: I didn't catch on to the magnitude of the change until now, as the majority of the chaos happened in the tiredofit/openldap repository.

Here's what I'd recommend if you still have access to the 6.7.0 image. Boot it up and run backups on your config (and data just for the heck of it).

slapcat -n 0 > /data/backup/config
slapcat -n 1 > /data/backup/data

Make sure you get those files out of the container and somewhere safe.
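
One way to pull the two dumps out of the container and sanity-check them before touching anything else is sketched below. The container name OpenLDAP_Server is the one that appears in the logs in this thread; the destination directory ./ldap-backup and the helper name backup_looks_valid are assumptions made for illustration only.

```shell
# Hypothetical helper (not part of the image): succeeds only if the file
# is non-empty and contains at least one LDIF entry line starting with "dn:".
backup_looks_valid() {
    [ -s "$1" ] && grep -q '^dn:' "$1"
}

# Example usage on the Docker host (./ldap-backup is an assumed location):
#   mkdir -p ./ldap-backup
#   docker cp OpenLDAP_Server:/data/backup/config ./ldap-backup/config
#   docker cp OpenLDAP_Server:/data/backup/data   ./ldap-backup/data
#   backup_looks_valid ./ldap-backup/config && echo "config backup looks OK"
#   backup_looks_valid ./ldap-backup/data   && echo "data backup looks OK"
```

This only checks that each file looks like LDIF; it does not prove the dump is complete.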

Then, open up the config file and replace any references to /assets/slapd/certs with /certs
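
If you would rather not edit the file by hand, the replacement can be sketched with sed. The function name rewrite_cert_paths and the file names in the usage lines are hypothetical; the two paths are the ones discussed above.

```shell
# Hypothetical helper: copy a slapcat config dump to a new file, replacing
# the old certificate directory with the new one. The original backup is
# left untouched.
rewrite_cert_paths() {
    sed 's|/assets/slapd/certs|/certs|g' "$1" > "$2"
}

# Example usage (file names are assumptions):
#   rewrite_cert_paths config config.new
#   grep -n 'certs' config.new    # review every changed line before importing
```

LDIF dumps can wrap long lines, so a path split across a continuation line would be missed by this substitution; reviewing the grep output is a cheap safeguard.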

Then let's get you on 7.0.3 again and start up the container. Of course it will fail, but that's OK.

# stop the openldap service from starting
s6-svc -d /var/run/s6/services/10-openldap
s6-svc -d /var/run/s6/services/20-openldap-backup

## kill any slapd processes
pkill slapd

## delete configuration directory
mkdir /tmp/tiredofit-openldap/
cp -R /etc/openldap/slapd.d/docker* /tmp/tiredofit-openldap/
rm -rf /etc/openldap/slapd.d/*

### Re add your modified config
slapadd -F /etc/openldap/slapd.d -n 0 -l <yourconfigfile>
cp -R /tmp/tiredofit-openldap/* /etc/openldap/slapd.d/
s6-svc -u /var/run/s6/services/10-openldap

At this point you should have manually migrated your config DIT to the 7.x series, and all should be well. There are tonnes of other ways to do it, but this is what I would do in a flash. Let me know how you make out. I cannot stress enough the importance of backups.

Manuki-San commented 3 years ago

Obviously, after many trials and errors changing parameters, I broke the configuration on image 6.7.0 ... So, I did something radical (and I can, because my instance of OpenLDAP is just a test bed). Considering the following mapping:

I performed the following actions:

  1. Save data files
    cd /media/HDD/openldap.home.lan/openldap/data
    sudo cp * ../backup
  2. Clean configuration files and certs
    cd /media/HDD/openldap.home.lan/openldap/config
    sudo rm -r *
    cd /media/HDD/openldap.home.lan/openldap/certs
    sudo rm -r *
  3. Change docker-compose.yml to use 7.0.3, start docker-compose, and verify that FusionDirectory works correctly
    sudo docker-compose up
  4. Stop docker-compose
    sudo docker-compose down
  5. Delete the new data files
    cd /media/HDD/openldap.home.lan/openldap/data
    sudo rm -r *
  6. Copy the old data files
    cd /media/HDD/openldap.home.lan/openldap/data
    sudo cp ../backup/*.mdb .
  7. And finally, start docker-compose again
    sudo docker-compose up

In FusionDirectory, the Groups and Roles are visible, but the Users have been lost. However, the command ldapsearch -x -h 192.168.1.33:389 -b ... is successful with the "lost" users. So the Users are not lost; they are simply not displayed by FusionDirectory.

tiredofit commented 3 years ago

Ya - makes sense to me. You lost your fusiondirectory schemas. You can reapply them manually by going inside the container and typing fusiondirectory-insert-schema in /etc/openldap/schema/fusiondirectory and then fusiondirectory-insert-schema -i (whatever plugin you have installed), e.g. ssh.schema

Been there a couple dozen times when building the images out. Now that you are on 7.xx, make sure you take advantage of the improved backup functionality. As long as you have backups, even if you have to import data line by line (again, been there), at least you are saved.

tiredofit commented 3 years ago

Also, you may want to look at the objectClasses of your users - that could be the reason why they are not appearing in the FD web interface. As mentioned above, a good schema refresh should do it.
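
A quick way to see which objectClass values the user entries actually carry is to ask ldapsearch for that attribute only and tally the result. This is a rough sketch: the helper name count_objectclasses is hypothetical, the base DN dc=home,dc=lan is inferred from the bind DN quoted in this thread, and the filter (uid=*) is an assumption about how the users are stored.

```shell
# Hypothetical helper: read LDIF on stdin and print each objectClass value
# with the number of entries that carry it, most frequent first.
count_objectclasses() {
    grep -i '^objectClass:' | awk '{print $2}' | sort | uniq -c | sort -rn
}

# Example usage (host and port as used earlier in this thread; base DN and
# filter are assumptions):
#   ldapsearch -x -H ldap://192.168.1.33:389 -b "dc=home,dc=lan" \
#       "(uid=*)" objectClass | count_objectclasses
```

Comparing the tally before and after reinserting the schemas may help show whether the entries themselves changed or only the server's schema did.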

Manuki-San commented 3 years ago

Yes, it was a schema issue. The following commands brought back the display of users in FusionDirectory:

sudo docker exec -it OpenLDAP_Server /bin/bash
cd /etc/openldap/schema/fusiondirectory
fusiondirectory-insert-schema -i mail*.schema

Except that I still encounter the error FATAL: Error when connecting the LDAP. Server said 'Could not bind to cn=admin,dc=home,dc=lan (while operating on LDAP server ldap://OpenLDAPServer:389)'. Let's be precise:

I do not want to abuse your time, and you have already helped me a lot, but do you have an idea of what it could be? (I did not have this issue with 6.7.0.)

tiredofit commented 3 years ago

Not sure there. Is docker-compose start trying to restart the container without bringing it down and up? That's sure to give problems, as I create a whole bunch of files specific to the boot. I'd only recommend docker-compose down and docker-compose up -d.

Manuki-San commented 3 years ago

docker-compose start (https://docs.docker.com/compose/reference/start/)

Starts existing containers for a service.

docker-compose up (https://docs.docker.com/compose/reference/up/)

Builds, (re)creates, starts, and attaches to containers for a service.

So usually, the first time(s) I start new containers, I use docker-compose up, check the logs, then CTRL+C, then I use docker-compose start (because the container(s) already exist). I agree docker-compose up -d works, I just wanted to point out that docker-compose start does not work anymore with 7.0.3

Thank you for your precious help. It might be worth mentioning on github and/or dockerhub that the transition from 6.7.0 to 7.0.3 could be bumpy.