linux-system-roles / storage

Ansible role for linux storage management
https://linux-system-roles.github.io/storage/
MIT License

storage: resize function for xfs FS can not work #119

Closed gitPo1son closed 11 months ago

gitPo1son commented 4 years ago

I pulled #97 for local testing and found that the resize function does not work when the file system type is xfs. The size of the LV does not change when resizing from 10g to 15g, yet the result printed in the terminal shows the run as passed.

BTW, the resize function works well for ext2/ext3/ext4.

environment: RHEL-8.2

playbook

---
- hosts: all
  become: true
  vars:
    mount_location: '/opt/test1'
    volume_group_size: '5g'
    volume_size_before: '10g'
    volume_size_after: '15g'
    storage_safe_mode: false

  tasks:
    - include_role:
        name: storage

    - include_tasks: get_unused_disk.yml
      vars:
        min_size: "{{ volume_group_size }}"
        max_return: 1

    - name: Create one LVM logical volume with "{{ volume_size_before }}" under one volume group
      include_role:
        name: storage
      vars:
          storage_pools:
            - name: foo
              disks: "{{ unused_disks }}"
              type: lvm
              volumes:
                - name: test1
                  fs_type: 'xfs'
                  size: "{{ volume_size_before }}"
                  mount_point: "{{ mount_location }}"

    - shell: lsblk | grep foo-test1

    - shell: mount | grep foo-test1

    - include_tasks: verify-role-results.yml

    - name: Change volume size to "{{ volume_size_after }}"
      include_role:
        name: storage
      vars:
          storage_pools:
            - name: foo
              type: lvm
              disks: "{{ unused_disks }}"
              volumes:
                - name: test1
                  fs_type: 'xfs'
                  size: "{{ volume_size_after }}"
                  mount_point: "{{ mount_location }}"

    - shell: lsblk | grep foo-test1

    - shell: mount | grep foo-test1

    - include_tasks: verify-role-results.yml

    - name: Clean up
      include_role:
        name: storage
      vars:
          storage_pools:
            - name: foo
              disks: "{{ unused_disks }}"
              state: absent
              volumes:
                - name: test1
                  size: "{{ volume_size_after }}"
                  mount_point: "{{ mount_location }}"

    - include_tasks: verify-role-results.yml

For example, the actions reported for ext4:
"blivet_output": {
    "actions": [
        {
            "action": "resize device",
            "device": "/dev/mapper/foo-test1",
            "fs_type": null
        },
        {
            "action": "resize format",
            "device": "/dev/mapper/foo-test1",
            "fs_type": "ext4"
        }
    ],

but for xfs the actions list is empty.
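The difference between the two runs can be checked mechanically. The following is a minimal sketch, assuming the `blivet_output` structure shown in this report; the helper name `resize_actions` is hypothetical and not part of the role.

```python
def resize_actions(blivet_output):
    """Return the planned actions whose name starts with 'resize'."""
    return [
        action
        for action in blivet_output.get("actions", [])
        if action.get("action", "").startswith("resize")
    ]


# Data copied from the ext4 run quoted in this report.
ext4_output = {
    "actions": [
        {"action": "resize device", "device": "/dev/mapper/foo-test1", "fs_type": None},
        {"action": "resize format", "device": "/dev/mapper/foo-test1", "fs_type": "ext4"},
    ]
}
# Data copied from the xfs run: no actions were planned at all.
xfs_output = {"actions": [], "changed": False}

print(len(resize_actions(ext4_output)))  # 2
print(len(resize_actions(xfs_output)))   # 0
```

An empty result together with `"changed": false` means the role planned nothing, which is why the task reports success even though the LV stays at 10G.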

output log

#resize to 15g
TASK [storage : debug] ******************************************************************************************************
task path: /root/ansible-test/upstream/storage/tasks/main-blivet.yml:113
ok: [192.168.122.101] => {
    "blivet_output": {
        "actions": [],
        "changed": false,
        "failed": false,
        "leaves": [
            "/dev/vda1",
            "/dev/mapper/rhel_node1-root",
            "/dev/mapper/rhel_node1-swap",
            "/dev/mapper/foo-test1",
            "/dev/vdc",
            "/dev/sr0"
        ],
        "mounts": [
            {
                "dump": 0,
                "fstype": "xfs",
                "opts": "defaults",
                "passno": 0,
                "path": "/opt/test1",
                "src": "/dev/mapper/foo-test1",
                "state": "mounted"
            }
        ],
        "packages": [
            "xfsprogs",
            "lvm2"
        ],
        "pools": [
            {
                "disks": [
                    "vdb"
                ],
                "name": "foo",
                "state": "present",
                "type": "lvm",
                "volumes": [
                    {
                        "_device": "/dev/mapper/foo-test1",
                        "_mount_id": "/dev/mapper/foo-test1",
                        "fs_create_options": "",
                        "fs_label": "",
                        "fs_overwrite_existing": true,
                        "fs_type": "xfs",
                        "mount_check": 0,
                        "mount_device_identifier": "uuid",
                        "mount_options": "defaults",
                        "mount_passno": 0,
                        "mount_point": "/opt/test1",
                        "name": "test1",
                        "pool": "foo",
                        "size": "15g",
                        "state": "present",
                        "type": "lvm"
                    }
                ]
            }
        ],
        "volumes": []
    }
}
...
<192.168.122.101> (0, b'', b'')
changed: [192.168.122.101] => {
    "changed": true,
    "cmd": "lsblk | grep foo-test1",
    "delta": "0:00:00.005214",
    "end": "2020-07-02 11:06:14.608691",
    "invocation": {
        "module_args": {
            "_raw_params": "lsblk | grep foo-test1",
            "_uses_shell": true,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "rc": 0,
    "start": "2020-07-02 11:06:14.603477",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "└─foo-test1         253:2    0   10G  0 lvm  /opt/test1",
    "stdout_lines": [
        "└─foo-test1         253:2    0   10G  0 lvm  /opt/test1"
    ]
}

vojtechtrefny commented 4 years ago

That's because Blivet (the storage library used by this role) currently doesn't support resizing XFS filesystems: https://github.com/storaged-project/blivet/issues/859

ashleykleynhans commented 4 years ago

The storage library fixed the issue 10 days ago. Could you update the role to pull in the fix, please?

pcahyna commented 4 years ago

@ashleykleynhans unfortunately no, because the role does not include the storage library, it uses whatever version of the library is found on the managed system.

ashleykleynhans commented 4 years ago

> @ashleykleynhans unfortunately no, because the role does not include the storage library, it uses whatever version of the library is found on the managed system.

So if the updated storage library is installed on the managed system, this is basically a non-issue? Or will the role still not support it?

pcahyna commented 4 years ago

I hope that the storage role will pick up the updated functionality automatically. @dwlehman, is that the case, please?

dwlehman commented 4 years ago

Yes, it should happen automatically with an updated blivet.

gitPo1son commented 4 years ago

> Yes, it should happen automatically with an updated blivet.

@dwlehman How can I update blivet? Is there an up-to-date rpm package for RHEL 8, or should I compile and install it from the upstream source code (https://github.com/storaged-project/blivet)?

ashleykleynhans commented 4 years ago

@pcahyna @dwlehman installing the latest blivet doesn't work because of the following:

1) The storage role installs blivet from the OS repositories, so the library in the storage role uses that by default.
2) If I comment out the OS package, the fallback doesn't work because the library is called blivet.py and it tries to import from blivet, and hence tries to import from itself, which is incorrect.

Can we have an option to not install the blivet package from the repos, and can you rename the blivet.py library to something else that does not conflict with the blivet pip package please?

dwlehman commented 4 years ago

I don't see how the name of the blivet.py module could only be a problem when the blivet rpm is absent. I think if it were a problem it would also be a problem when the OS package is installed.

I can tell you that the pip package is not usable unless you already have all of the dependencies installed locally. We just discovered this recently and resolving it will not be trivial.

There is a blivet-3.3.0 package in Fedora 33 repositories now AFAIK. I don't know if you're in a position to install that, or whether it will work on earlier Fedora releases.

ashleykleynhans commented 4 years ago

@dwlehman the rpm installs a python package called blivet3, which does not conflict with the library name of blivet, hence it is a non-issue for the rpm. However, when installing blivet using pip, the python package is simply called blivet, and not blivet3, which is why it conflicts.

Unfortunately I am not using Fedora; I am using CentOS, so I am not in a position to use the package in the Fedora repos.

The CentOS 8 package for blivet is only version 3.1.0, which is around 2 years old and does not contain the fix:

Last metadata expiration check: 2:48:02 ago on Tue 15 Sep 2020 03:40:05 PM SAST.
Available Packages
Name         : python3-blivet
Epoch        : 1
Version      : 3.1.0
Release      : 21.el8_2
Architecture : noarch
Size         : 995 k
Source       : python-blivet-3.1.0-21.el8_2.src.rpm
Repository   : AppStream
Summary      : A python3 package for examining and modifying storage configuration.
URL          : https://storageapis.wordpress.com/projects/blivet
License      : LGPLv2+
Description  : The python3-blivet is a python3 package for examining and modifying storage
             : configuration.

dwlehman commented 4 years ago

That rpm is not called blivet3; it is called python3-blivet (python3, not blivet3). Am I missing something? In RHEL/CentOS 8 the python package should be called blivet. Only on RHEL/CentOS 7 should it be called blivet3. On RHEL8 and all supported Fedora releases the blivet package name is blivet, so we know that the ansible module and the blivet package sharing a name is not generally a problem.

I will try to get you some idea of when the xfs resize feature will land in CentOS 8.

ashleykleynhans commented 4 years ago

Seems the problem only occurs when installing the blivet python library using pip then. It would be great if you could get me info on when it will land in CentOS 8, thanks 👍
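
The name clash discussed above comes down to Python's import search order: the first matching entry on the search path wins. The following is a minimal sketch of that general behaviour, not a confirmed diagnosis of the role. The directory names `library` and `site-packages` are hypothetical stand-ins, and how Ansible actually arranges the search path on the managed node is not shown in this thread.

```python
import importlib.machinery
import os
import tempfile


def resolve(name, search_dirs):
    """Return the file Python would load for `import name` given search_dirs."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_dirs)
    return spec.origin if spec else None


with tempfile.TemporaryDirectory() as root:
    library = os.path.join(root, "library")     # hypothetical: holds the module file
    site = os.path.join(root, "site-packages")  # hypothetical: holds installed packages

    # Two installed packages, one named blivet and one named blivet3.
    for package in ("blivet", "blivet3"):
        os.makedirs(os.path.join(site, package))
        open(os.path.join(site, package, "__init__.py"), "w").close()

    # A module file that shares its name with the blivet package.
    os.makedirs(library)
    open(os.path.join(library, "blivet.py"), "w").close()

    # With `library` searched first, `blivet` resolves to the module file,
    # while `blivet3` has no competitor and resolves to the installed package.
    shadowed = resolve("blivet", [library, site])
    distinct = resolve("blivet3", [library, site])
    print(os.path.relpath(shadowed, root))
    print(os.path.relpath(distinct, root))
```

Under these assumptions a file named blivet.py that runs `import blivet` would find itself first, whereas an import of `blivet3` cannot collide with it.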

ashleykleynhans commented 4 years ago

Seems the latest version of the role can only manage new storage volume groups, logical volumes and disks, and not existing ones?

I was able to successfully manage and resize a new logical volume within a new volume group, but was not able to manage existing ones.

dwlehman commented 4 years ago

I understand you cannot seem to get it to manage some existing volume group(s). Can you provide your playbook and a link to the /tmp/blivet.log you find (on the managed node) after reproducing the failure?

ashleykleynhans commented 4 years ago

blivet.log

Log file attached.

Playbook calls lots of different roles and things, but:

group_vars:

storage_pools:
  - name: "{{ hostname_without_dashes }}_vg0"
    disks:
      - /dev/vda2
      - /dev/vdb
    volumes:
      - name: root
        mount_point: /
        size: "20 GiB"
      - name: home
        mount_point: /home
        size: "5 GiB"
#      - name: tmp
#        mount_point: /tmp
#        size: "6 GiB"
      - name: var_log
        mount_point: /var/log
        size: "10 GiB"
      - name: opt
        mount_point: /opt
        size: "20 GiB"

and:

- name: Include role ashleykleynhans.blivet
  include_role:
    name: ashleykleynhans.blivet
    apply:
      tags:
        - storage
  tags:
    - storage
  when: include_role_storage is defined and include_role_storage == True

- name: Include role linux-system-roles.storage
  include_role:
    name: linux-system-roles.storage
    apply:
      tags:
        - storage
  tags:
    - storage
  when: include_role_storage is defined and include_role_storage == True

My role just basically installs the blivet pip module: https://github.com/ashleykleynhans/ansible-role-blivet

The above fails with:

TASK [linux-system-roles.storage : manage the pools and volumes to match the specified state] ****************************************************************************************************************************************************************
Wednesday 16 September 2020  17:06:20 +0200 (0:00:02.086)       0:00:48.709 ***
fatal: [mob-r1-d-f5331870-01]: FAILED! => {"actions": [], "changed": false, "crypts": [], "leaves": [], "mounts": [], "msg": "volume 'opt' cannot be resized to '20 GiB'", "packages": [], "pools": [], "volumes": []}

and the additional 10GB disk (vdb) is not being allocated to the volume group:

[root@mob-r1-d-f5331870-01 ~]# pvs
  PV         VG                       Fmt  Attr PSize   PFree
  /dev/vda2  mob_r1_d_f5331870_01_vg0 lvm2 a--  <51.00g 4.00m

51GB is just the primary disk (/dev/vda)

dwlehman commented 4 years ago

Ah, so what is not happening is the vgextend. That's why it cannot grow the opt volume. That's a bug, but it's not this bug. Please open a new issue describing this problem.
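
The missing vgextend can be spotted by comparing the disks listed in `storage_pools` with what `pvs` reports. The following is a minimal sketch that parses the `pvs` output pasted above; the helper name `missing_pvs` is hypothetical and the parsing assumes the default column layout shown in that output.

```python
def missing_pvs(pvs_output, vg_name, wanted_disks):
    """Return the wanted disks that `pvs` does not list as members of vg_name."""
    members = set()
    for line in pvs_output.splitlines():
        fields = line.split()
        # Skip the header row and any line too short to hold the PV and VG columns.
        if len(fields) < 2 or fields[0] == "PV":
            continue
        if fields[1] == vg_name:
            members.add(fields[0])
    return [disk for disk in wanted_disks if disk not in members]


# Output copied from the managed node in this thread.
PVS_OUTPUT = """\
  PV         VG                       Fmt  Attr PSize   PFree
  /dev/vda2  mob_r1_d_f5331870_01_vg0 lvm2 a--  <51.00g 4.00m
"""

print(missing_pvs(PVS_OUTPUT, "mob_r1_d_f5331870_01_vg0", ["/dev/vda2", "/dev/vdb"]))
# ['/dev/vdb']
```

With only 4.00m free in the volume group and /dev/vdb not yet a member, growing opt to 20 GiB cannot succeed.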

ashleykleynhans commented 4 years ago

> Ah, so what is not happening is the vgextend. That's why it cannot grow the opt volume. That's a bug, but it's not this bug. Please open a new issue describing this problem.

#166 logged for the new bug, thanks.

scaronni commented 1 year ago

Has anyone figured out how to do online resizing of XFS? I can't make it work.

For example, on EL9 (Blivet 3.6.0) the module blows up saying that the file system needs to be unmounted for an offline resize to work. The code got merged in 2020 in Blivet: https://github.com/storaged-project/blivet/pull/872

scaronni commented 1 year ago

Everything should be in place: the XFS grow code https://github.com/storaged-project/blivet/commit/8581a9e6e58c1593eb4d031c88ef62d1c7183c1f, the fix https://github.com/storaged-project/blivet/commit/bf2d1f45d2c7b41d4c12e7df883d599674edeb96, and the support for managing pool members at https://github.com/linux-system-roles/storage/commit/956ff6751420f85e220987b3739160bf5588a4d0, so this should work out of the box.

scaronni commented 1 year ago

OK, I figured it out: the blivet packages containing the online resize for el8 and el9 are currently in CentOS Stream but not in released RHEL 8/9 versions...

richm commented 1 year ago

I believe https://github.com/linux-system-roles/storage/pull/356 added support for online resize, but yes, it depends on the functionality in the underlying blivet

richm commented 1 year ago

@scaronni so does the storage role online resize work if you have the right version of blivet?

scaronni commented 11 months ago

We're waiting for some internal gating to get us the latest snapshots based on 8.9/9.3 (non CentOS Stream), then I will test again.

ashleykleynhans commented 11 months ago

Will be nice to see some movement on this, it's been more than 3 years.

richm commented 11 months ago

> Will be nice to see some movement on this, it's been more than 3 years.

It should work with the latest storage system role and the right version of blivet

scaronni commented 11 months ago

8.9 does not contain the necessary version that is in CentOS Stream, so resizing on RHEL 8.9 or derivatives is still not a thing.

scaronni commented 11 months ago

Sorry, I looked while the mirrors were updating. Both 9.3 and 8.9 contain the necessary blivet packages required for resizing filesystems online.

I've tested it both on el8 and el9 and I can confirm that as long as the distributions are 8.9+ or 9.3+ online resize works just fine!

I think this ticket can be closed?
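
The version boundary reported here can be written down as a small check. The following is a minimal sketch, assuming only the thresholds stated in this comment (8.9+ and 9.3+); the helper name `has_online_resize` is hypothetical, and it answers False for any major version the thread gives no data for.

```python
# Minimum minor release per major version, taken from the report in this
# thread that online resize works on 8.9+ and 9.3+.
MINIMUM_MINOR = {8: 9, 9: 3}


def has_online_resize(distro_version):
    """Return True if a 'major.minor' EL version meets the reported minimum."""
    parts = distro_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    if major not in MINIMUM_MINOR:
        return False  # the thread gives no data for other major versions
    return minor >= MINIMUM_MINOR[major]


for version in ("8.2", "8.9", "9.2", "9.3"):
    print(version, has_online_resize(version))
```

The distribution version is only a proxy: what actually matters is the blivet package installed on the managed node, as noted earlier in the thread.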

richm commented 11 months ago

> Sorry, I looked while the mirrors were updating. Both 9.3 and 8.9 contain the necessary blivet packages required for resizing filesystems online.
>
> I've tested it both on el8 and el9 and I can confirm that as long as the distributions are 8.9+ or 9.3+ online resize works just fine!
>
> I think this ticket can be closed?

Yes - thank you for confirming.