snipe / snipe-it

A free open source IT asset/license management system
https://snipeitapp.com
GNU Affero General Public License v3.0

Docker LDAP Sync Works Sporadically and Fails Frequently #12632

Closed ashireman closed 1 year ago

ashireman commented 1 year ago

Debug mode

Describe the bug

While the LDAP setup seems to work fine, the test sync works only sporadically. All credentials/values are correct for syncing with the AD/LDAP server. SnipeIT will occasionally be able to reach the server and show a list of users that would be synced. However, it is inconsistent and I cannot figure out the cause of the issue. Running the artisan ldap:troubleshoot command returns ERROR: DNS lookup of host: 10.16.1.1 has failed. ABORTING. I did not run the command during the periods when the sync worked; if it works again, I will run it then to compare. There is nothing at the network level blocking the connection.

While there is nothing on the network blocking the connection, I feel like there may be a Docker networking issue. It seems that I am missing some kind of NAT between the container and the host networking interface. When the command is run within the container, can it only see the 172.17.0.0 subnet? But, when I run $ docker exec -it {container_id} curl -k -u "username" ldaps://10.16.1.1/OU=OU_Users,DC=int,DC=domain,DC=com?(&(sAMAccountType=805306368)(!(userAccountControl:1.2.840.113556.1.4.803:=2))), I can reach the server. The bind is rejected because I don't have the correct privileges, but it shows the server is reachable from the container. Why would curl be able to reach the server but php can't? More importantly, why does it work sometimes but not others?

Reproduction steps

  1. Enter all required LDAP configuration values
  2. Save the config
  3. Click Test LDAP Synchronization
  4. View error message Could not bind to LDAP: Can't contact LDAP server ...

Expected behavior

The connection should be made to the LDAP server. If there is an error, some additional troubleshooting should occur to see if there is an issue with the network connection (can Google be reached?) or the server itself is unreachable.

Screenshots

No response

Snipe-IT Version

6014

Operating System

Ubuntu

Web Server

Apache

PHP Version

7.4.3

Operating System

Windows

Browser

Edge

Version

110

Error messages

Could not bind to LDAP: Can't contact LDAP server

Additional context

Fresh install from SnipeIT Docker image. Application has been working well for adding users, assets, locations, etc... without issue. The DB has not been manually touched in any way.

The few times it has worked it has seemed to only work for a few hours and then gone back to unable to connect. The steps taken to get it to work feel random. Have tried rebooting the server, restarting the containers, stopping and restarting containers individually. The one time it worked the best (as far as we can tell), we stopped the mysql container, rebooted the server, and restarted the containers. Then it worked. But the next day it was back to the same error.

ashireman commented 1 year ago

I forgot to include the output of php -m to show that the LDAP module is enabled:

[PHP Modules]
bcmath
calendar
Core
ctype
curl
date
dom
exif
FFI
fileinfo
filter
ftp
gd
gettext
hash
iconv
igbinary
json
ldap
libxml
mbstring
mcrypt
memcached
msgpack
mysqli
mysqlnd
openssl
pcntl
pcre
PDO
pdo_mysql
Phar
posix
readline
redis
Reflection
session
shmop
SimpleXML
sockets
sodium
SPL
standard
sysvmsg
sysvsem
sysvshm
tokenizer
xml
xmlreader
xmlwriter
xsl
Zend OPcache
zip
zlib

[Zend Modules]
Zend OPcache

aquadiode commented 1 year ago

This might not be too helpful depending on your situation but this was happening to me once when a single domain controller in the group was failing, it meant LDAP services would fail intermittently depending on which DC in the pool it was using (which is obviously not supposed to happen!). I think we tracked down the issue by connecting over LDAP to individual controllers until we found the problem server in the group.
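The per-controller check described above could be sketched roughly as follows. This is only an illustration: the function names (`probe`, `count_failures`, `check_pool`) and the controller hostnames in the usage note are hypothetical, and it assumes `bash` and GNU `timeout` are available on the machine running the test. Each controller is probed several times because a single successful connection says little about an intermittent fault.

```shell
# probe HOST PORT: succeeds if a TCP connection opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
probe() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# count_failures HOST PORT ATTEMPTS: prints how many attempts failed.
count_failures() {
  failures=0
  i=0
  while [ "$i" -lt "$3" ]; do
    probe "$1" "$2" || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures"
}

# check_pool PORT ATTEMPTS HOST...: reports the failure count per controller.
check_pool() {
  port=$1
  attempts=$2
  shift 2
  for dc in "$@"; do
    echo "$dc: $(count_failures "$dc" "$port" "$attempts") of $attempts attempts failed"
  done
}
```

For example, `check_pool 636 5 dc1.int.domain.com dc2.int.domain.com` (placeholder names) would probe the LDAPS port on each controller five times. A controller that fails noticeably more often than the others is the one to look at first.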

ashireman commented 1 year ago

Thanks for the tip. We'll see if we can run some tests on all the DCs. No other problems have been reported but it can't hurt to give it a shot. What feels strange is that even though the test sync is failing and the ldap:troubleshoot command keeps returning the same error, syncing users seems to still work.

alvarorc959 commented 1 year ago

Same issue here. Unfortunately I can't reproduce it on purpose. I managed to install it and sync once; after that I can't sync again. Currently using Google LDAP.

TShiremanLCOG commented 1 year ago

TL;DR: Make sure Netplan is using the correct DHCP identifier on Ubuntu.

For users running this on Ubuntu 18.04 and up and targeting Windows Server DHCP/DNS, this may be relevant. Through a ton of troubleshooting we determined that the issue was not Docker-related but a broader networking problem on the host machine itself. We eventually traced it when we noticed the application becoming unavailable and requiring a reboot of the host machine. Although a DHCP reservation had been made for the host, it would occasionally receive an incorrect IP address. Since this was not consistent, it was difficult to find more areas to investigate. The issue was traced to Netplan, an "improved" network management tool for Ubuntu. Its newer default of using the machine ID rather than the MAC address as the DHCP identifier was the cause of the issue.

Note: In the code examples below, replace {container_id} with your container ID and {target_ip} with the IP address of your LDAP server.

Troubleshooting

We started with the ldap:troubleshoot command for SnipeIT:

$ docker exec -it {container_id} php artisan ldap:troubleshoot

Output

STAGE 1: Checking settings
Determined LDAP hostname to be: {target_ip}
Performing DNS lookup of: {target_ip}
ERROR: DNS lookup of host: {target_ip} has failed. ABORTING.

In our case we use an IP address for the LDAP server rather than a hostname. There may be a SnipeIT issue where attempting a hostname lookup of an IP address causes the error.
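To illustrate the suspected problem, here is a minimal sketch of the logic we would expect: only attempt a DNS lookup when the configured value is not already an IP address. This is not SnipeIT's code. The function names `is_ipv4` and `resolve_ldap_host` are made up for this example, it handles IPv4 literals only, and it assumes `getent` is available for the hostname case.

```shell
# is_ipv4 VALUE: succeeds only for a dotted-quad IPv4 literal such as 10.16.1.1.
is_ipv4() {
  case $1 in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, non-numeric, or empty octet
  esac
  old_ifs=$IFS
  IFS=.
  set -- $1                                # split on dots into $1..$N
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# resolve_ldap_host VALUE: prints the address to connect to.
# An IP literal is passed through untouched; only hostnames are looked up.
resolve_ldap_host() {
  if is_ipv4 "$1"; then
    printf '%s\n' "$1"
  else
    getent hosts "$1" | awk '{ print $1; exit }'
  fi
}

resolve_ldap_host 10.16.1.1   # prints 10.16.1.1 without any DNS query
```

With a check like this, a server saved as an IP address would skip the lookup stage and go straight to the connectivity tests.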

Networking

We wanted to confirm that the Docker container could indeed reach the LDAP server. The following command was used to prove that the target IP was reachable from within Docker:

$ docker exec {container_id} php -r 'exec("curl -c 1 {target_ip}");'

In our case we received, as expected, a "Connection refused" error. LDAP does not operate on port 80, but the refusal confirmed that the host was reachable. There may be a better test to perform but this was suitable for our needs.
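One possibly more direct test is to open a TCP connection to the LDAP port itself (389, or 636 for LDAPS) and distinguish an immediate failure from a timeout. The sketch below is an assumption-laden example rather than what we actually ran: `classify_tcp` is a hypothetical helper, and it relies on `bash` and GNU `timeout` (which exits with status 124 when the time limit is hit) being present wherever it runs.

```shell
# classify_tcp HOST PORT [SECONDS]: prints one of
#   open     the TCP connection was established
#   timeout  no answer within the time limit (packets are likely being dropped)
#   failed   an immediate error, e.g. connection refused or name not resolved
classify_tcp() {
  timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
  case $? in
    0)   echo "open" ;;
    124) echo "timeout" ;;
    *)   echo "failed" ;;
  esac
}
```

Running `classify_tcp {target_ip} 636` from inside the container and again from the host would show whether the two see the LDAP server differently. An "open" result says nothing about whether the bind will succeed, only that the port is reachable.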

With this in mind we looked deeper into the networking outside Docker. The issue where the host would seemingly go offline started to make more sense. The host could not be pinged by hostname or IP. While looking at DNS and DHCP didn't give us much, we happened to notice the wrong IP assignment. Expecting some kind of issue between Ubuntu and Windows DHCP, a few searches revealed the new defaults for Netplan. With that, we identified the changes required to the Netplan config.

Netplan Config

Ensure that Netplan is using the correct DHCP identifier. You may edit the default config file or create your own config file to be merged. The default appears to be the machine ID rather than the MAC address. Add the following line to the relevant interface definition:

dhcp-identifier: mac

After that, run $ netplan generate and then $ netplan apply.

Once complete, the correct IP address was assigned and LDAP appeared to sync without issue.
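For anyone unsure where that line belongs, the sketch below writes a small example config to a scratch location so it can be reviewed before anything under /etc/netplan is touched. The helper name `write_netplan_dropin`, the interface name `eth0`, and the file name `99-dhcp-identifier.yaml` are placeholders, not values from our setup; check your existing Netplan file for the real interface name.

```shell
# write_netplan_dropin INTERFACE OUTPUT_FILE
# Writes a minimal Netplan config that makes the DHCP client identify
# itself by MAC address instead of the machine ID.
write_netplan_dropin() {
  cat > "$2" <<EOF
network:
  version: 2
  ethernets:
    $1:
      dhcp4: true
      dhcp-identifier: mac
EOF
}

# Write to a scratch directory first and inspect the result.
scratch_dir=$(mktemp -d)
write_netplan_dropin eth0 "$scratch_dir/99-dhcp-identifier.yaml"
cat "$scratch_dir/99-dhcp-identifier.yaml"
```

If the output looks right for your host, the same content can be merged into your existing file or copied into /etc/netplan/ before running the generate and apply commands.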

Two final recommendations before closing the ticket:

  1. Add a small note to the documentation about sync issues for those running Ubuntu in a Windows environment. Notes about the Netplan checks above may help users get LDAP set up.
  2. Add a check to the LDAP Troubleshooter to see if the saved LDAP server is an IP address rather than a hostname (relevant line). Rather than perform a dns_get_record() lookup, it may be beneficial to continue connectivity tests with the saved IP.

Ticket can be closed at your convenience.