aws / efs-utils

Utilities for Amazon Elastic File System (EFS)
MIT License

Occasional failed mount on Debian 11 #225

Closed: jgard closed this issue 2 months ago

jgard commented 2 months ago

Opening and closing this just as a tip in case anyone else runs into this.

Symptoms:

On roughly 10-20% of reboots, our EFS mount does not come up. /var/log/amazon/efs/mount.log showed: ERROR - Failed to mount fs-0123456789abcdef because the network was not yet available, add "_netdev" to your mount options. We are already using _netdev. I traced that error message to a check that network.target is up. Whenever the mount failed, network.target was not up.
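To confirm you are hitting the same symptom, something like the following could pull the affected file system IDs out of the mount log. This is only a sketch: the regular expression is derived from the single error line quoted above, and `find_network_mount_failures` is a hypothetical helper name, not part of efs-utils.

```python
import re

# Pattern built from the error line quoted in this issue. The wording of the
# message may differ between efs-utils versions (assumption).
NETWORK_ERROR = re.compile(
    r"ERROR - Failed to mount (fs-[0-9a-f]+) "
    r"because the network was not yet available"
)


def find_network_mount_failures(log_text):
    """Return the file system ID from every matching error line, in order."""
    return NETWORK_ERROR.findall(log_text)


# Typical use (not run here): read /var/log/amazon/efs/mount.log and pass
# its contents to find_network_mount_failures().
```

An empty result means the log contains no line of this form, which would point to a different failure.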

Cause:

This is on Debian 11. For ages, across many distros, we have installed a standard list of packages that includes fcoe-utils. As discussed elsewhere and resolved in v1.0.34-3, some versions of fcoe-utils create a circular dependency among fcoe-utils, lldpad, and network.target. Sure enough, that is exactly what we saw: systemd chose a job, seemingly at random, to delete in order to break the cycle.
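To illustrate why such a loop is a problem, here is a minimal sketch of detecting a cycle in an ordering graph with a depth-first search. The unit names and edges in `ORDERING` are assumptions made up for illustration, not the real contents of the fcoe-utils or lldpad unit files, and this is not how systemd itself implements the check.

```python
def find_cycle(after):
    """Return one ordering cycle as a list of unit names, or None.

    `after` maps each unit to the units it must be started after.
    """
    visiting = []   # units on the current DFS path, in order
    done = set()    # units fully explored with no cycle found

    def visit(unit):
        if unit in done:
            return None
        if unit in visiting:
            # Back on a unit already on the path: that slice is the cycle.
            return visiting[visiting.index(unit):] + [unit]
        visiting.append(unit)
        for dep in after.get(unit, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(unit)
        return None

    for unit in after:
        cycle = visit(unit)
        if cycle:
            return cycle
    return None


# Hypothetical ordering edges (assumed names, for illustration only).
ORDERING = {
    "fcoe.service": ["lldpad.service"],
    "lldpad.service": ["network.target"],
    "network.target": ["fcoe.service"],
    "efs.mount": ["network.target"],
}
```

With a loop like this, no start order satisfies every edge, so one job has to be dropped. If the dropped job is network.target, anything that checks for it at mount time will fail.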

Workaround:

For us, uninstalling fcoe-utils, or disabling or masking its service, is sufficient. We don't need it.
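If you provision hosts from a script, the masking step could be sketched roughly as below. `systemctl mask` is a standard systemctl verb, but the unit name you pass in is an assumption you need to check on your own system, and `mask_unit` is a hypothetical helper.

```python
import subprocess


def mask_unit_command(unit):
    """Build the argv that masks a systemd unit."""
    return ["systemctl", "mask", unit]


def mask_unit(unit, dry_run=True):
    """Mask `unit`; with dry_run=True only return the command."""
    cmd = mask_unit_command(unit)
    if not dry_run:
        # Needs root; raises CalledProcessError if systemctl fails.
        subprocess.run(cmd, check=True)
    return cmd
```

The dry-run default lets you inspect the command before running it with root privileges.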