rposudnevskiy / RBDSR

RBDSR - XenServer/XCP-ng Storage Manager plugin for CEPH
GNU Lesser General Public License v2.1

can not use rbd-mode=kernel #59

Open ghost opened 6 years ago

ghost commented 6 years ago

edit: the problem does not exist with v2.0

XenServer 7.2 / Ceph Luminous

I created a PBD with:

xe pbd-create sr-uuid=f9c45630a162 host-uuid=0174e10d-f6a9-4d2a-8cd8-d3118b40d375 device-config:rbd-mode=kernel

Everything looks good until the VM is started. Then I get:

tapdisk experienced an error

This only happens with v1.0. Is it supposed to work with kernel-mode rbd?

I tried disabling rbd features for the image:

rbd feature disable RBD_XenStorage-f9c45630a162/VHD-a45fdf50-8d3d-4e5a-b310-eaa2385be4bb exclusive-lock object-map fast-diff deep-flatten

but that didn't seem to have any effect. (Disabling them is necessary if I want to map the image via the rbd command.)
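For anyone scripting this step across several images, here is a minimal Python sketch that builds the same `rbd feature disable` invocation shown above. The helper name `feature_disable_cmd` is hypothetical and not part of RBDSR; the feature list is simply the one from this comment.

```python
import subprocess

# Features listed in the comment above. This is not an authoritative list of
# what a given krbd version supports; it only mirrors the command in the thread.
FEATURES_TO_DISABLE = ["exclusive-lock", "object-map", "fast-diff", "deep-flatten"]


def feature_disable_cmd(pool, image, features=FEATURES_TO_DISABLE):
    """Return the argv for `rbd feature disable <pool>/<image> <features...>`.

    Hypothetical helper for illustration; it only builds the command.
    """
    return ["rbd", "feature", "disable", "%s/%s" % (pool, image)] + list(features)


if __name__ == "__main__":
    cmd = feature_disable_cmd(
        "RBD_XenStorage-f9c45630a162",
        "VHD-a45fdf50-8d3d-4e5a-b310-eaa2385be4bb",
    )
    print(" ".join(cmd))
    # To actually run it on a host with the rbd CLI installed:
    # subprocess.check_call(cmd)
```

The command is only printed by default, so the sketch can be checked on a machine without Ceph installed.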

rposudnevskiy commented 6 years ago

Hi, in addition to disabling the rbd features for the image, also try this:

ceph osd crush tunables bobtail

It should help. If not, please send /var/log/SMlog.

Because XenServer has a fairly old kernel and krbd module, which makes it necessary to disable features, I would recommend using rbd-mode=nbd. It is fast enough and supports all rbd features.

Please also note that the v2.0 branch is not finished yet. Garbage collection and coalescing are not implemented yet.

ghost commented 6 years ago

Hello,

Thanks for taking the time! I am really happy that your project exists!

I will try the bobtail tunables, although I am not convinced that tunables are really the issue, since we have kernel version 4.4.

By the way, either netinstall.sh should use BRANCH="1.0" or the zip should be created with v1.0. At the moment the mv from RBDSR-v1.0 to RBDSR fails (same for v2.0).

A second issue I had was with the installation of ceph. I had to install ceph-common and the other packages one after another to avoid dependency errors.

Is there a reason for pinning version numbers? From install.sh:

yum install -y -x librados2-12.0.3 -x libradosstriper1-12.0.3 -x librados2-12.0.2 -x libradosstriper1-12.0.2 -x librados2-12.0.1 -x libradosstriper1-12.0.1 ceph-common-12.0.0 rbd-nbd-12.0.0 rbd-fuse-12.0.0

Regards


rposudnevskiy commented 6 years ago

Hi, the issue with the branch name is fixed.

The reason for the version numbers in install.sh was that, after ceph 12.0.0, the '--name' option of the rbd-nbd command, which the plugin used, was removed. The workaround (#53) was to prevent installation of any ceph version newer than 12.0.0.

Later it turned out that the command "rbd-nbd map/unmap" can be replaced with "rbd nbd map/unmap", which supports the '--conf' and '--name' arguments. install.sh was fixed in the v2.0 branch, but I forgot to fix the v1.0 branch. Now it is fixed too.
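To illustrate the replacement described above, here is a minimal Python sketch that builds an "rbd nbd map" command line with the '--name' and '--conf' arguments. The helper `nbd_map_cmd`, the user "client.admin", and the path /etc/ceph/ceph.conf are assumptions for illustration only; this is not the plugin's actual code.

```python
def nbd_map_cmd(pool, image, ceph_user, conf=None):
    """Return the argv for `rbd nbd map <pool>/<image> --name <user> [--conf <file>]`.

    Hypothetical helper: it only builds the command and does not run it.
    """
    cmd = ["rbd", "nbd", "map", "%s/%s" % (pool, image), "--name", ceph_user]
    if conf is not None:
        cmd += ["--conf", conf]
    return cmd


if __name__ == "__main__":
    # "client.admin" and the conf path are assumed example values.
    print(" ".join(nbd_map_cmd(
        "RBD_XenStorage-f9c45630a162",
        "VHD-a45fdf50-8d3d-4e5a-b310-eaa2385be4bb",
        "client.admin",
        conf="/etc/ceph/ceph.conf",
    )))
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to a subprocess call.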

ghost commented 6 years ago

Hi,

Sorry to bother you again! Hopefully you can help me get a clue about what is going on. Maybe it's just a blunder on my side. I am also not sure whether this is the right way to get this to you, or whether it is better to open an issue on GitHub.

After I installed the plugin on a fresh XenServer installation and all was fine, I tried an existing one (still a test machine, but a rack server) that runs a slightly older version and has some VMs running on it.

I tried with and without kernel mode, and without any device-config options, but I always get the same error when I try to plug the PBD:

OSError: [Errno 2] No such file or directory

Any input is much appreciated!

Full output below:

[root@xenserver-pcweb RBDSR]# xe pbd-plug uuid=ad4929d7-5816-2689-8845-b9ff757f7ab4
There was an SR backend failure.
status: non-zero exit
stdout:
stderr: Traceback (most recent call last):
  File "/opt/xensource/sm/RBDSR", line 826, in <module>
    SRCommand.run(RBDSR, DRIVER_INFO)
  File "/opt/xensource/sm/SRCommand.py", line 351, in run
    sr = driver(cmd, cmd.sr_uuid)
  File "/opt/xensource/sm/SR.py", line 147, in __init__
    self.load(sr_uuid)
  File "/opt/xensource/sm/RBDSR", line 216, in load
    cephutils.SR.load(self,sr_uuid, ceph_user)
  File "/opt/xensource/sm/cephutils.py", line 186, in load
    self.RBDPOOLs = self._get_srlist()
  File "/opt/xensource/sm/cephutils.py", line 173, in _get_srlist
    cmdout = util.pread2(["ceph", "df", "--format", "json", "--name", self.CEPH_USER])
  File "/opt/xensource/sm/util.py", line 189, in pread2
    return pread(cmdlist, quiet = quiet)
  File "/opt/xensource/sm/util.py", line 174, in pread
    (rc,stdout,stderr) = doexec(cmdlist_for_exec)
  File "/opt/xensource/sm/util.py", line 135, in doexec
    proc = subprocess.Popen(args,stdin=subprocess.PIPE,stdout=subprocess.PIPE,stderr=subprocess.PIPE,close_fds=True)
  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

[root@xenserver-pcweb RBDSR]# cat /etc/redhat-release
XenServer release 7.1.0 (xenenterprise)
[root@xenserver-pcweb RBDSR]# cat /etc/xensource-inventory
PRIMARY_DISK='/dev/disk/by-id/scsi-
DOM0_VCPUS='8'
PRODUCT_VERSION='7.1.0'
DOM0_MEM='2048'
CONTROL_DOMAIN_UUID=
XEN_VERSION='4.7.1'
MANAGEMENT_ADDRESS_TYPE='IPv4'
KERNEL_VERSION='4.4.27'
COMPANY_NAME_SHORT='Citrix'
PARTITION_LAYOUT='ROOT,BACKUP,LOG,BOOT,SWAP,SR'
PRODUCT_VERSION_TEXT='7.1'
INSTALLATION_UUID=
LINUX_KABI_VERSION='4.4.0+2'
PRODUCT_BRAND='XenServer'
BRAND_CONSOLE='XenCenter'
PRODUCT_VERSION_TEXT_SHORT='7.1'
MANAGEMENT_INTERFACE='xenbr0'
PRODUCT_NAME='xenenterprise'
STUNNEL_LEGACY='true'
BUILD_NUMBER='137272c'
PLATFORM_VERSION='2.2.0'
PLATFORM_NAME='XCP'
BACKUP_PARTITION='/dev/disk/by-id/
INSTALLATION_DATE='2017-03-01 10:27:17.842537'
COMPANY_NAME='Citrix Systems, Inc.'


ghost commented 6 years ago

The reason for the error above was that the installation of ceph-common had failed. After I installed it manually, everything works as expected. It seems the script still has issues installing ceph-common. It's a bit strange that only this one package failed; everything else installed fine.
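The "OSError: [Errno 2] No such file or directory" in the traceback is what subprocess.Popen raises when the executable itself (here "ceph", which ceph-common provides) cannot be found. A minimal sketch of a pre-flight check follows. The helper `missing_tools` and the tool list are hypothetical and not part of RBDSR, and the sketch uses Python 3's shutil.which, which is not available in the Python 2.7 that the plugin runs under.

```python
import shutil

# Assumed list of CLI tools to look for; "ceph" is the one the traceback
# shows being executed, "rbd" is used elsewhere in this thread.
REQUIRED_TOOLS = ["ceph", "rbd"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be tested without Ceph installed.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing executables: " + ", ".join(missing))
    else:
        print("All required executables found.")
```

Running a check like this before plugging the PBD would point directly at the missing package instead of surfacing as an SR backend failure.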

rposudnevskiy commented 6 years ago

Yes, it's strange. The only thing I can assume for now is a temporary problem with Internet access. If you can reproduce the problem, it would be nice if you could send the output of the installation process. Thank you.

ghost commented 6 years ago

I saw similar things reported by other people who tried to use ceph with XenServer. I will try to get more information at some point and get back to you if I can narrow it down further.