dm-vdo / vdo

Userspace tools for managing VDO volumes.
GNU General Public License v2.0

RHEL8 - VDO won't start and complains about ghost/non-started /dev/vdX disk #40

Closed samuelfusato closed 2 years ago

samuelfusato commented 3 years ago

Hello,

I am currently unable to create VDO volumes on a RHEL 8 VM, and I cannot bring vdo.service up. Details below:

[root@virtmanager-rhel8 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.4 (Ootpa)
[root@virtmanager-rhel8 ~]# rpm -qa | grep vdo
kmod-kvdo-6.2.4.26-77.el8.x86_64
vdo-6.2.4.14-14.el8.x86_64

As the output below shows, vdo.service fails to start and complains about a disk (/dev/vdb) that I have not started before. I have removed that disk, but the error persists:

[root@virtmanager-rhel8 ~]# systemctl status vdo
● vdo.service - VDO volume services
   Loaded: loaded (/usr/lib/systemd/system/vdo.service; enabled; vendor preset: enabled)
   Active: inactive (dead)

ago 01 00:26:58 virtmanager-rhel8 systemd[1]: Starting VDO volume services...
ago 01 00:26:58 virtmanager-rhel8 vdo[10018]: vdo: ERROR - Could not set up device mapper for vdo1
ago 01 00:26:58 virtmanager-rhel8 vdo[10018]: vdo: ERROR - vdodumpconfig: Failed to make FileLayer from '/dev/vdb' with No such file or directory
ago 01 00:26:58 virtmanager-rhel8 vdo[10018]: Starting VDO vdo1
ago 01 00:26:58 virtmanager-rhel8 vdo[10018]: ERROR - Could not set up device mapper for vdo1
ago 01 00:26:58 virtmanager-rhel8 vdo[10018]: ERROR - vdodumpconfig: Failed to make FileLayer from '/dev/vdb' with No such file or directory
ago 01 00:26:58 virtmanager-rhel8 systemd[1]: vdo.service: Main process exited, code=exited, status=1/FAILURE
ago 01 00:26:58 virtmanager-rhel8 systemd[1]: vdo.service: Failed with result 'exit-code'.
ago 01 00:26:58 virtmanager-rhel8 systemd[1]: Failed to start VDO volume services.
[root@virtmanager-rhel8 ~]#
[root@virtmanager-rhel8 ~]# lsblk 
NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0            11:0    1 1024M  0 rom  
vda           252:0    0   20G  0 disk 
├─vda1        252:1    0    1G  0 part /boot
└─vda2        252:2    0   19G  0 part 
  ├─rhel-root 253:0    0   17G  0 lvm  /
  └─rhel-swap 253:1    0    2G  0 lvm  [SWAP]

I have also noticed that the /etc/vdoconf.yml config file stays in place after I remove the installed packages. Here is the content of that file:

[root@virtmanager-rhel8 ~]# cat vdoconf.yml 
####################################################################
# THIS FILE IS MACHINE GENERATED. DO NOT EDIT THIS FILE BY HAND.
####################################################################
config: !Configuration
  vdos:
    vdo1: !VDOService
      _operationState: finished
      ackThreads: 1
      activated: enabled
      bioRotationInterval: 64
      bioThreads: 4
      blockMapCacheSize: 128M
      blockMapPeriod: 16380
      compression: enabled
      cpuThreads: 2
      deduplication: enabled
      device: /dev/vdb
      hashZoneThreads: 1
      indexCfreq: 0
      indexMemory: 0.25
      indexSparse: disabled
      indexThreads: 0
      logicalBlockSize: 4096
      logicalSize: 50G
      logicalThreads: 1
      maxDiscardSize: 4K
      name: vdo1
      physicalSize: 20G
      physicalThreads: 1
      slabSize: 2G
      uuid: null
      writePolicy: auto
  version: 538380551

Can you please advise?

Thanks.

samuelfusato commented 3 years ago

Update: I resolved this issue with the following steps:

  1. Shut down the machine completely.
  2. Added two additional disks of 10 GB each to my VM.
  3. Created a new volume by running vdo create --name vdo1 --device /dev/vdb --vdoLogicalSize 10G, which updated /etc/vdoconf.yml.

[root@virtmanager-rhel8 ~]# systemctl status vdo
● vdo.service - VDO volume services
   Loaded: loaded (/usr/lib/systemd/system/vdo.service; enabled; vendor preset: enabled)
   Active: active (exited) since Sun 2021-08-01 01:13:55 CEST; 7min ago
  Process: 841 ExecStart=/usr/bin/vdo start --all --confFile /etc/vdoconf.yml (code=exited, status=0/SUCCESS)
 Main PID: 841 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 8166)
   Memory: 0B
    CGroup: /system.slice/vdo.service

rhawalsh commented 3 years ago

Hi @samuelfusato,

It sounds like you had some previously created volumes that weren't completely removed. If you are certain that you do not have any volumes on the system, you can rename or remove /etc/vdoconf.yml.
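
Before renaming or removing the file, it may help to see which backing devices it still references. The following is a minimal sketch, not part of the vdo tooling: check_vdo_devices is a hypothetical helper name, and it assumes each volume's backing device sits on its own "device:" line, as in the file posted above.

```shell
#!/bin/sh
# Sketch: report whether each backing device named in a vdoconf.yml-style
# file still exists. check_vdo_devices is a hypothetical helper name.
check_vdo_devices() {
    conf="${1:-/etc/vdoconf.yml}"
    # Assumption: the machine-generated file keeps one "device: PATH" line
    # per volume, so a plain line match is enough (no YAML parser needed).
    awk '$1 == "device:" { print $2 }' "$conf" | while read -r dev; do
        if [ -e "$dev" ]; then
            echo "present: $dev"
        else
            echo "missing: $dev"
        fi
    done
}
```

Running check_vdo_devices with no argument reads /etc/vdoconf.yml. A "missing:" line for a device would correspond to the "No such file or directory" error reported by vdodumpconfig in the log above.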

I don't think it would be wise to remove the config file when removing the packages if there are contents in it. There are values in there that you might want/need to retain in case you were only temporarily removing the packages.

Regarding the vdo.service and vdo-start-by-dev.service failures: these services aren't implemented as long-running daemons (like nginx or apache, for example). Instead, they are just one-shot calls to the 'vdo' script, running either 'vdo start --all' or 'vdo start --name '. So if you end up with a failed service, you can always try to start the volume by hand via the 'vdo' script. That won't resolve any underlying startup issue, but it would at least get your system back up and running if you are in a crunch.
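
Starting a volume by hand could be sketched as below. This is an illustration only: start_vdo_by_hand and the VDO_CMD override are hypothetical names, the volume name vdo1 is taken from the config file earlier in the thread, and the --confFile path matches the ExecStart line in the status output above.

```shell
#!/bin/sh
# Sketch: start a single VDO volume by hand when vdo.service has failed.
# VDO_CMD and start_vdo_by_hand are hypothetical names used for illustration.
VDO_CMD="${VDO_CMD:-vdo}"

start_vdo_by_hand() {
    name="$1"
    conf="${2:-/etc/vdoconf.yml}"
    if [ -z "$name" ]; then
        echo "usage: start_vdo_by_hand NAME [CONF_FILE]" >&2
        return 2
    fi
    # The service itself runs "vdo start --all"; this targets one volume.
    $VDO_CMD start --name "$name" --confFile "$conf"
}
```

For example, start_vdo_by_hand vdo1 would run vdo start --name vdo1 --confFile /etc/vdoconf.yml. Setting VDO_CMD=echo beforehand prints the arguments instead of running them, which is a cheap way to check the call first.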

Since I'm not entirely clear what your initial goal was, I'm not sure what other guidance I can provide. Please feel free to ask any other questions you might have and we'll see what we can do to help you out.

samuelfusato commented 3 years ago

Hello @rhawalsh,

Thanks for your kind response. It is clear now. I am currently studying for the RHCSA exam and VDO is listed as one of its objectives. I will make sure to wipe possible signatures from the device before working with VDO in my exercises. By the way, it is now clear to me why the /etc/fstab entry refers to vdo.service, without a trailing d in the name: the service is not implemented as a running daemon, as you explained.
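
For reference, the kind of /etc/fstab entry meant here can be sketched as follows. The thread does not show the entry itself, so this assumes the commonly documented RHEL 8 form with the x-systemd.requires=vdo.service mount option. vdo_fstab_line is a hypothetical helper, and the mount point and xfs file system are placeholders.

```shell
#!/bin/sh
# Sketch: print an /etc/fstab line for a file system on a VDO volume.
# vdo_fstab_line is a hypothetical helper; mount point and file system
# type are placeholders, not values taken from the thread.
vdo_fstab_line() {
    name="$1"
    mountpoint="$2"
    fstype="${3:-xfs}"
    # x-systemd.requires=vdo.service makes the mount wait for the one-shot
    # vdo.service unit, which is what starts the volume at boot.
    printf '/dev/mapper/%s %s %s defaults,x-systemd.requires=vdo.service 0 0\n' \
        "$name" "$mountpoint" "$fstype"
}
```

Calling vdo_fstab_line vdo1 /mnt/vdo1 only prints the line. Nothing is written to /etc/fstab.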

Thanks again.