ansible-collections / community.aws

Ansible Collection for Community AWS
GNU General Public License v3.0

Filesystem creation fails when connecting over AWS SSM due to device busy #1620

Closed gygitlab closed 1 year ago

gygitlab commented 1 year ago

Summary

Bit of a headscratcher this one.

Basically when we run the community.general.filesystem module to create an ext4 filesystem for an EBS mount it always seems to fail as follows:

  cmd: /sbin/mkfs.ext4 -F -m 0 -F -E lazy_itable_init=0,lazy_journal_init=0,discard /dev/sdf
  item:
    device_name: /dev/sdf
  msg: |-
    mke2fs 1.42.9 (28-Dec-2013)
    /dev/sdf is apparently in use by the system; mke2fs forced anyway.
    /dev/sdf: Device or resource busy while setting up superblock
  rc: 1

The thing is, this works absolutely fine when connecting over SSH, and also when I run the command directly after manually signing in to the box with AWS SSM. I can only assume there's some weird quirk happening, maybe the command is being run twice? Any help would be greatly appreciated.
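One way to narrow this down is to record the state of the device over the same SSM connection, immediately before the failing task. The tasks below are only a sketch: the thread does not say what was holding the device, and the registered variable names (`lsblk_out`, `findmnt_out`) are made up for illustration. Both commands are read-only.

```yaml
# Diagnostic tasks (assumed, not from the original report); place them just
# before the filesystem task and run them over the aws_ssm connection.
- name: Show what the kernel reports for the device
  ansible.builtin.command: lsblk -f /dev/sdf
  register: lsblk_out          # hypothetical variable name
  changed_when: false

- name: List mounts that reference the device
  ansible.builtin.command: findmnt --source /dev/sdf
  register: findmnt_out        # hypothetical variable name
  changed_when: false
  failed_when: false           # findmnt exits non-zero when nothing matches

- name: Print the results
  ansible.builtin.debug:
    msg:
      - "{{ lsblk_out.stdout_lines }}"
      - "{{ findmnt_out.stdout_lines }}"
```

Comparing this output between an SSH run and an SSM run may show whether the device already carries a filesystem or a mount by the time mkfs is invoked.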

Issue Type

Bug Report

Component Name

community.aws.aws_ssm, community.general.filesystem

Ansible Version

$ ansible --version

ansible [core 2.13.6]
  config file = <redacted>/ansible/ansible.cfg
  configured module search path = ['<redacted>/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = <redacted>/installs/python/3.10.6/lib/python3.10/site-packages/ansible
  ansible collection location = <redacted>/.ansible/collections:/usr/share/ansible/collections
  executable location = <redacted>/.asdf/installs/python/3.10.6/bin/ansible
  python version = 3.10.6 (main, Aug  4 2022, 11:58:04) [Clang 13.1.6 (clang-1316.0.21.2.5)]
  jinja version = 3.1.2
  libyaml = True

Collection Versions

$ ansible-galaxy collection list

Collection                    Version
----------------------------- -------
amazon.aws                    3.5.0
ansible.netcommon             3.1.3
ansible.posix                 1.4.0
ansible.utils                 2.7.0
ansible.windows               1.12.0
arista.eos                    5.0.1
awx.awx                       21.8.0
azure.azcollection            1.14.0
check_point.mgmt              2.3.0
chocolatey.chocolatey         1.3.1
cisco.aci                     2.3.0
cisco.asa                     3.1.0
cisco.dnac                    6.6.0
cisco.intersight              1.0.20
cisco.ios                     3.3.2
cisco.iosxr                   3.3.1
cisco.ise                     2.5.8
cisco.meraki                  2.11.0
cisco.mso                     2.1.0
cisco.nso                     1.0.3
cisco.nxos                    3.2.0
cisco.ucs                     1.8.0
cloud.common                  2.1.2
cloudscale_ch.cloud           2.2.2
community.aws                 3.6.0
community.azure               1.1.0
community.ciscosmb            1.0.5
community.crypto              2.8.1
community.digitalocean        1.22.0
community.dns                 2.4.0
community.docker              2.7.1
community.fortios             1.0.0
community.general             5.8.0
community.google              1.0.0
community.grafana             1.5.3
community.hashi_vault         3.4.0
community.hrobot              1.6.0
community.libvirt             1.2.0
community.mongodb             1.4.2
community.mysql               3.5.1
community.network             4.0.1
community.okd                 2.2.0
community.postgresql          2.3.0
community.proxysql            1.4.0
community.rabbitmq            1.2.3
community.routeros            2.3.1
community.sap                 1.0.0
community.sap_libs            1.3.0
community.skydive             1.0.0
community.sops                1.4.1
community.vmware              2.10.1
community.windows             1.11.1
community.zabbix              1.8.0
containers.podman             1.9.4
cyberark.conjur               1.2.0
cyberark.pas                  1.0.14
dellemc.enterprise_sonic      1.1.2
dellemc.openmanage            5.5.0
dellemc.os10                  1.1.1
dellemc.os6                   1.0.7
dellemc.os9                   1.0.4
f5networks.f5_modules         1.20.0
fortinet.fortimanager         2.1.6
fortinet.fortios              2.1.7
frr.frr                       2.0.0
gluster.gluster               1.0.2
google.cloud                  1.0.2
hetzner.hcloud                1.8.2
hpe.nimble                    1.1.4
ibm.qradar                    2.1.0
ibm.spectrum_virtualize       1.10.0
infinidat.infinibox           1.3.7
infoblox.nios_modules         1.4.0
inspur.ispim                  1.2.0
inspur.sm                     2.3.0
junipernetworks.junos         3.1.0
kubernetes.core               2.3.2
lowlydba.sqlserver            1.0.4
mellanox.onyx                 1.0.0
netapp.aws                    21.7.0
netapp.azure                  21.10.0
netapp.cloudmanager           21.21.0
netapp.elementsw              21.7.0
netapp.ontap                  21.24.1
netapp.storagegrid            21.11.1
netapp.um_info                21.8.0
netapp_eseries.santricity     1.3.1
netbox.netbox                 3.8.1
ngine_io.cloudstack           2.2.4
ngine_io.exoscale             1.0.0
ngine_io.vultr                1.1.2
openstack.cloud               1.10.0
openvswitch.openvswitch       2.1.0
ovirt.ovirt                   2.3.1
purestorage.flasharray        1.14.0
purestorage.flashblade        1.10.0
purestorage.fusion            1.1.1
sensu.sensu_go                1.13.1
servicenow.servicenow         1.0.6
splunk.es                     2.1.0
t_systems_mms.icinga_director 1.31.4
theforeman.foreman            3.7.0
vmware.vmware_rest            2.2.0
vultr.cloud                   1.3.0
vyos.vyos                     3.0.1
wti.remote                    1.0.4

AWS SDK versions

$ pip show boto boto3 botocore

WARNING: Package(s) not found: boto
Name: boto3
Version: 1.26.3
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email:
License: Apache License 2.0
Location: <redacted>/.asdf/installs/python/3.10.6/lib/python3.10/site-packages
Requires: botocore, jmespath, s3transfer
Required-by:
---
Name: botocore
Version: 1.29.3
Summary: Low-level, data-driven core of boto 3.
Home-page: https://github.com/boto/botocore
Author: Amazon Web Services
Author-email:
License: Apache License 2.0
Location: <redacted>/.asdf/installs/python/3.10.6/lib/python3.10/site-packages
Requires: jmespath, python-dateutil, urllib3
Required-by: boto3, s3transfer

Configuration

$ ansible-config dump --only-changed

ANY_ERRORS_FATAL(<redacted>/ansible/ansible.cfg) = True
DEFAULT_FORKS(<redacted>/ansible/ansible.cfg) = 25
DEFAULT_LOAD_CALLBACK_PLUGINS(<redacted>/ansible/ansible.cfg) = True
DEFAULT_ROLES_PATH(<redacted>/ansible/ansible.cfg) = ['<redacted>/.ansible/roles', '/usr/share/ansible/roles', '/etc/ansible/roles', '<redacted>/ansible/roles']
DEFAULT_STDOUT_CALLBACK(<redacted>/ansible/ansible.cfg) = yaml
DISPLAY_SKIPPED_HOSTS(<redacted>/ansible/ansible.cfg) = False
HOST_KEY_CHECKING(<redacted>/ansible/ansible.cfg) = False
HOST_PATTERN_MISMATCH(<redacted>/ansible/ansible.cfg) = ignore

OS / Environment

Host: macOS 13
Target: Amazon Linux 2

Steps to Reproduce

Try running the following module over AWS SSM to create a filesystem on a new disk:

- name: Create ext4 filesystem
  filesystem:
    fstype: ext4
    dev: "/dev/sdf"
    opts: "-m 0 -F -E lazy_itable_init=0,lazy_journal_init=0,discard"
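For anyone trying to reproduce this, the task has to run through the `community.aws.aws_ssm` connection plugin. The play below is a minimal sketch of such a setup, not the reporter's actual playbook: the inventory group, bucket name, and region are placeholders, and it assumes the inventory hostnames are EC2 instance IDs (otherwise `ansible_aws_ssm_instance_id` would need to be set per host).

```yaml
# Minimal reproduction play (assumed layout; values marked "hypothetical"
# are placeholders and do not come from the original report).
- name: Reproduce filesystem creation over AWS SSM
  hosts: ssm_targets                                          # hypothetical inventory group
  become: true
  vars:
    ansible_connection: community.aws.aws_ssm
    ansible_aws_ssm_bucket_name: example-ssm-transfer-bucket  # hypothetical S3 bucket
    ansible_aws_ssm_region: eu-west-1                         # hypothetical region
  tasks:
    - name: Create ext4 filesystem
      community.general.filesystem:
        fstype: ext4
        dev: /dev/sdf
        opts: "-m 0 -F -E lazy_itable_init=0,lazy_journal_init=0,discard"
```

Running the same play with the connection variables removed (plain SSH) gives the working baseline described above.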

Expected Results

Filesystem gets created without issue

Actual Results

Filesystem fails to create due to "device busy" even though it works fine directly and / or via SSH.

ansibullbot commented 1 year ago

Files identified in the description: None

If these files are inaccurate, please update the component name section of the description or use the !component bot command.

markuman commented 1 year ago

community.aws 3.6.0

Does this also happen when you're using the latest community.aws 5.1.0 version?
What linux distribution is the target host?
I wonder if this is related to https://github.com/ansible-collections/community.aws/pull/1597 and https://github.com/ansible-collections/community.aws/pull/948

markuman commented 1 year ago

Edit: NVM, I can see from other issues that you're on Ubuntu.

Yeah, maybe one of the PRs mentioned above will fix it. Or a combination of both, as suggested here: https://github.com/ansible-collections/community.aws/pull/948#pullrequestreview-1187226989
Unfortunately, both PRs have stalled.

@gygitlab do you have a chance to update your community.aws collection to the latest version and patch it as in the PRs, to see if that fixes your issue?

gygitlab commented 1 year ago

Oh interesting, I didn't go there as we get the collection as part of the main ansible package. Surprised to see quite a difference there.

I'll try the latest version soon, thanks.

Edit: To add, we've seen this on both Ubuntu and AL2.

gygitlab commented 1 year ago

Interesting, this looks to work fine with 5.0.0 even without the PR! Thanks for the tip.
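Since the 3.6.0 collection here came bundled with the `ansible` package, one way to get a newer community.aws without replacing the whole package is a requirements file. This is a sketch rather than the exact steps used in the thread; the `>=5.0.0` constraint simply reflects the version reported to work.

```yaml
# requirements.yml (assumed file name and location)
collections:
  - name: community.aws
    version: ">=5.0.0"   # 5.0.0 is the version reported to work in this thread
```

Installing it with `ansible-galaxy collection install -r requirements.yml` places the collection in the user collections path, which should be picked up ahead of the copy shipped inside the `ansible` package. Dependencies such as amazon.aws may be upgraded along with it, and `ansible-galaxy collection list` shows which versions end up installed.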