aarcane opened this issue 7 years ago
I'm not a Ceph expert, but here's what I know: we support block devices, and to the extent that the kernel supports RBD block devices, they should work with LIO. The other thing I know is that tcmu-runner is also in the process of adding support for RBD via the librados userspace API.
I do not know anything about the patches you speak of. Maybe @mikechristie knows more.
I took the liberty of googling up the patches.
https://www.spinics.net/lists/target-devel/msg10330.html
The patches are apparently against fb41 and were posted to the target-devel mailing list. What I can't determine, given my limited knowledge of this repo and the archaic way patches are still sent through e-mail, is: were they ever applied to this tree?
The current upstream tcmu-runner code supports ceph rbd. It is a little slow, but we are working on it.
The patches in this thread you referenced: https://www.spinics.net/lists/target-devel/msg10330.html were for a different approach that was kernel based and implemented as a LIO backend module, target_core_rbd. The approach was rejected upstream. SUSE distributes it if you want to try it.
Red Hat's RHEL 7.3 and RHCS 2.1 products support Ceph rbd using another method, where LIO's iblock backend is used along with the block devices created by the ceph rbd kernel driver. With this approach you just use the ceph-iscsi-ansible/ceph-iscsi-config rpms that come with RHCS.
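A minimal sketch of what the tcmu-runner path can look like through rtslib-fb (the library underneath targetcli-fb); the pool/image names below are placeholders, and the "rbd/<pool>/<image>" config string is an assumption about the tcmu-runner rbd handler's naming rather than something confirmed in this thread:

```python
# Minimal sketch: create a tcmu-runner (userspace) backed storage object
# for a Ceph rbd image via rtslib-fb. Assumes tcmu-runner with its rbd
# handler is running and target_core_user is loaded in the kernel.
from rtslib_fb import UserBackedStorageObject

so = UserBackedStorageObject(
    name="rbd_img0",
    config="rbd/rbd/img0",   # assumed handler config: rbd/<pool>/<image>
    size=10 * 1024 ** 3,     # must match the rbd image size in bytes
)
print(so.path)               # configfs path of the new user-backed backstore
```

With a new enough targetcli-fb, the same object then shows up under the user: backstores in the CLI.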
I believe this functionality is in the SUSE Enterprise Storage product.
-- Lee-Man Duncan
"Beer is proof that God loves us and wants us to be happy." -- Ben Franklin
In all fairness, I can't say most, but in my experience about half of the FOSS projects I've used include support for feature sets that are useful to some or all users but depend on outside functionality that may or may not be present, so long as those features don't break compatibility or introduce burdensome dependencies.
Seeing as the entire targetcli(-fb) application stack is built around dynamically presenting only the features supported by the kernel and by the hardware present, I don't believe it unreasonable to request that this feature be merged into the targetcli-fb application stack. Therefore, I'd like to reclassify this as a feature request.
It needs to be upstream. This hasn't come up previously for backends, but it has for fabrics, and we haven't merged non-upstream fabrics until they're in the mainline kernel.
Is there a good technical reason that you know of that it can't be merged? If it's purely convention, convention is the way to disaster. I'm more than willing to put in the effort, if needed, to update the code, but forking is bad, and I'd strongly prefer not to further fracture the code base, since the targetcli{,-fb} deploy base is already a mess, as I've experienced while looking for a distro where it actually works. So, if the patches don't apply cleanly and the updates are non-trivial, I'll submit a pull request, on the condition that, once all technical concerns are promptly resolved, it will be accepted.
You can export rbd images with LIO using the iblock backend and the normal old block devices the rbd kernel driver creates today.
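For reference, a minimal sketch of that route via rtslib-fb, assuming the image is already mapped by the kernel driver (for example rbd map producing /dev/rbd0); the device path and IQNs are placeholders:

```python
# Minimal sketch: export an rbd image that the kernel rbd driver has already
# mapped (e.g. /dev/rbd0) through LIO's iblock backend, using rtslib-fb.
from rtslib_fb import (BlockStorageObject, FabricModule, Target, TPG,
                       LUN, NetworkPortal, NodeACL, MappedLUN)

# Block-backed storage object on top of the kernel rbd block device.
so = BlockStorageObject(name="rbd0", dev="/dev/rbd0")

# iSCSI target, TPG, portal and LUN; the IQNs are placeholders.
iscsi = FabricModule("iscsi")
target = Target(iscsi, "iqn.2016-12.com.example:rbd0")
tpg = TPG(target, 1)
tpg.enable = True
NetworkPortal(tpg, "0.0.0.0", 3260)
lun = LUN(tpg, storage_object=so)

# ACL for a single initiator, with the LUN mapped through.
acl = NodeACL(tpg, "iqn.2016-12.com.example:initiator1")
MappedLUN(acl, 0, tpg_lun=lun)
```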
Why do you need these patches merged?
Exporting using the in-kernel RBD support makes the rbd exports multi-path safe. Using the userspace block devices makes them not multi-path safe. The whole purpose of ceph/rbd is to be redundant. Using the "normal old block devices" is just not safe.
Let's sync on some terms and issues.
For the target_core_rbd approach, the statement about it being multipath safe is not true when it comes to multipathing across HA iSCSI targets. None of those approaches are safe.
The problem with both target_core_rbd and the block layer approach right now is that they do not properly clean up outstanding IO when failing over. If you were to export an image through two targets, send some WRITEs through one path/target, and then disable the network so that IO got stuck in that target's target_core_rbd/libceph/rbd layer, you could end up with data corruption later, because that IO just sits around waiting for the network to come back up. When it does, the IO executes, possibly overwriting newer data. The block layer approach is currently a little better in that you can set it up to take the exclusive lock, so those outstanding IOs get killed when failing over to the secondary target. But it still has a data corruption bug: if IO gets stuck on a path/target before the exclusive lock is taken, that IO will not be cleaned up and you hit the same issue.
The tcmu-rbd driver has similar issues.
I am currently working on adding HA support to tcmu-rbd. I originally wrote target_core_rbd. It was rejected and I never finished the error handling. I worked on the block layer approach too, but I prefer the freedom I get with tcmu, so I am switching to that now.
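Purely as an illustration of the failure mode described above (no real Ceph or LIO APIs, just a toy timeline):

```python
# Toy illustration of the stale-write race: a dict stands in for one block
# of the exported image.
block = {}

def write(source, value):
    block["data"] = value
    block["last_writer"] = source

# 1. A WRITE is sent through target A, but the network drops before it
#    reaches the OSDs; it sits queued in target A's rbd layer.
stuck_write = ("target A (stale)", "OLD")

# 2. The initiator fails over to target B and writes newer data.
write("target B", "NEW")

# 3. Target A's network recovers and the queued WRITE finally executes,
#    silently overwriting the newer data. Fencing/exclusive-lock handling
#    would need to kill that queued IO before step 2 to avoid this.
write(*stuck_write)

assert block["data"] == "OLD"   # newer data has been lost
```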
I'm trying to find specifics, but looking through the git logs on kernel.opensuse.org, it's clear that target_core_rbd is still being maintained and updated, at least for the 12/42 release series. However, looking at the "stable" (Tumbleweed) and "master" (Factory) branches, the module seems to be missing entirely.
That said, given that the 42.2 default implementation of targetcli is completely broken, and given that the upstream official targetcli (this repo) has worked perfectly and reliably for me in the past but presently crashes due to the presence of 'target_core_rbd' in either the kernel or configfs (not sure where the problem lies or how to figure it out), I'm pretty much at a loss. What's the best way to get upstream targetcli working with target_core_rbd implemented, whether I like it or not? (If for no other reason, I like it because of the potential performance boost of doing the work in-kernel versus constantly passing everything through userspace context switches... but if you (mikechristie) are giving up on its advantages, and SUSE is pulling it too... this entire parenthetical is just me coming to terms with a grim, context-switch-filled, performance-killing future.)
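On the "not sure where the problem lies" point: one low-tech way to see what upstream rtslib/targetcli is running into is to list the backstore HBAs registered in configfs. This is only a sketch, and the idea that SUSE's out-of-tree target_core_rbd shows up there as an extra, unrecognized HBA prefix is an assumption, not something confirmed in this thread:

```python
# List the backstore HBAs LIO has registered in configfs. Upstream rtslib-fb
# only knows the in-tree plugins (iblock, fileio, pscsi, ramdisk, user), so
# any other prefix here -- e.g. one created by an out-of-tree module such as
# target_core_rbd -- is a likely candidate for tripping up targetcli.
import os

CORE = "/sys/kernel/config/target/core"
KNOWN = ("iblock_", "fileio_", "pscsi_", "rd_mcp_", "user_")

for entry in sorted(os.listdir(CORE)):
    if entry == "alua":
        continue  # ALUA config group, not an HBA
    status = "known plugin" if entry.startswith(KNOWN) else "UNKNOWN plugin"
    print(entry, "-", status)
```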
@aarcane Did you find a targetcli version for librbd?
aarcane was looking for a targetcli version that supports the kernel target_core_rbd module. I think that only exists in SUSE's repos. There is a nice project, petaSAN, that has a GUI over targetcli and is SUSE based, so it uses target_core_rbd.
If you are looking for one that supports librbd, then you can use Andy's upstream tree: https://github.com/open-iscsi/targetcli-fb
If you are using a Red Hat based distro, then you can use Paul Cuzner's cluster-aware target tool, gwcli (some code on top of rtslib and configshell that runs on multiple iSCSI target nodes and has a targetcli-like interface):
https://github.com/pcuzner/ceph-iscsi-cli
Or, if you do not have a Red Hat based distro and have time, you can modify that code too. Paul seems open to patches.
I've seen patches telling me it does and should, but when I rip out the distro-provided (and horribly broken) targetcli and try to load targetcli-fb, it chokes on the rbd devices being loaded. So, have the RBD patches been merged yet, and if not, why not?