SovereignCloudStack / issues

This repository is used for issues that are cross-repository or not bound to a specific repository.
https://github.com/orgs/SovereignCloudStack/projects/6

New deployment and day-2-ops tooling for software-defined storage (Ceph) - ADR #515

Open brueggemann opened 8 months ago

brueggemann commented 8 months ago

As an SCS Operator, I want a well-considered and justified decision on a reliable method to deploy and operate Ceph that replaces ceph-ansible.

Criteria:

Tasks (see decision tracking document for detailed status):

Definition of Done:

Decision tracking document

yeoldegrove commented 8 months ago

For the task "Gather information about reference setups from cloud providers (criteria, migration path)":

Questions to cloud providers and/or customers:

We want to get a better understanding of which Ceph setups, deployed by OSISM, you are currently running. We hope this input helps us decide how to move forward with a possible replacement of ceph-ansible in OSISM.

horazont commented 8 months ago

We are using Rook for all our new Ceph deployments. Previously, we used Ceph-Chef.

Deployment Method: Rook is the natural choice for us because we are already running Kubernetes on bare metal for the YAOOK Operator. It integrates well with YAOOK because both of them use Kubernetes.

In addition, we have had excellent experiences with the performance, maintainability, and reliability of Rook.io clusters, in particular compared to our previous static deployment method (Ceph-Chef).
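
To illustrate that integration point, here is a minimal sketch of reading Ceph health through the Rook CephCluster custom resource with the Kubernetes Python client. The "rook-ceph" namespace and cluster name are the Rook defaults and are assumptions here, not necessarily what this or any particular deployment uses.

```python
# Sketch only: query Ceph health via the Rook CephCluster custom resource.
# Namespace and cluster name are the Rook defaults (assumed, adjust as needed).
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() when running in-cluster
api = client.CustomObjectsApi()

cluster = api.get_namespaced_custom_object(
    group="ceph.rook.io", version="v1",
    namespace="rook-ceph", plural="cephclusters", name="rook-ceph",
)

status = cluster.get("status", {})
print("phase:", status.get("phase"))                         # e.g. "Ready"
print("ceph health:", status.get("ceph", {}).get("health"))  # e.g. "HEALTH_OK"
```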

All methods have their downsides, and so does the Rook method. In particular:

Version: With Rook, we are running 16.x with the plan to upgrade to 17.x soon-ish, though we are blocked there for non-Ceph and non-Rook reasons.

Hardware: Varying and historically grown, I'd have to look that up. Hit me up via email if you need that information: mailto:jonas.schaefer@cloudandheat.com.

Features: We use RBD exclusively with Rook so far (see above); we intend to enable S3 and Swift frontends once we have implemented support for that (currently, these needs are served by our old Ceph-Chef cluster). We use CephFS in non-bare-metal cases, too.

berendt commented 8 months ago

We use the Quincy release (17.2.6) provided by OSISM 6.0.2 everywhere

We have a single small hyperconverged cluster for a specific customer workload. Otherwise we only use dedicated Ceph clusters. We currently have a single cluster that provides HDD and NVMe SSD as RBDs for Cinder/Nova. In addition, we have a cluster that is used exclusively for RGW and is offered as a Swift and S3 endpoint (integrated with Keystone and Glance, and in the future also with Cinder for backups).
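
For context, the Keystone integration of RGW mentioned here is driven by the standard rgw_keystone_* options. The sketch below only illustrates those options; the endpoint value and the "client.rgw" config target are placeholders, and in an OSISM/ceph-ansible deployment they would normally be set through the deployment tooling rather than ad-hoc commands.

```python
# Sketch only: the standard RGW/Keystone options behind a Swift+S3 endpoint.
# Values and the "client.rgw" config target are placeholders; real deployments
# usually set these via their deployment tooling (ceph-ansible/OSISM).
import subprocess

rgw_keystone_opts = {
    "rgw_keystone_url": "https://keystone.example.com:5000",  # placeholder endpoint
    "rgw_keystone_api_version": "3",
    "rgw_keystone_accepted_roles": "member,admin",
    "rgw_swift_account_in_url": "true",
}

for opt, value in rgw_keystone_opts.items():
    subprocess.run(["ceph", "config", "set", "client.rgw", opt, value], check=True)
```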

At the moment we deploy the control plane on the Ceph OSD nodes and do not have any dedicated nodes for it. We also do not split the data plane and control plane on the network side. The Ceph nodes currently have 2x 100G; the compute nodes have 2x 25G (to become 2x 100G in the future as well). Latencies between the nodes are approx. 0.05 ms (ICMP).

We have a separate pool for each OpenStack service (images, vms, volumes). We have several pools for Cinder so that we can partially separate customers.
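
A small sketch of how such a per-service pool layout can be inspected on a running cluster (assuming admin credentials on the node; the field names follow the JSON output of ceph osd pool ls detail):

```python
# Sketch only: show each pool and its enabled application (rbd, rgw, ...),
# which makes a per-OpenStack-service layout (images, vms, volumes) visible.
import json
import subprocess

out = subprocess.run(
    ["ceph", "osd", "pool", "ls", "detail", "--format", "json"],
    check=True, capture_output=True, text=True,
).stdout

for pool in json.loads(out):
    apps = ", ".join(pool.get("application_metadata", {})) or "none"
    print(f"{pool.get('pool_name')}: size={pool.get('size')} applications={apps}")
```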

We use the following services: osd, mon, mgr, rgw, crash. We would also like to take a look at mds in the future in order to be able to offer CephFS via Manila if necessary.

We can share details about hardware and the configuration in full if required.

We do not optimize the systems directly with the Ceph-Ansible part of OSISM, but use the tuned, sysctl and network roles from OSISM for this.

We are satisfied with what we can currently do with OSISM. We would only need more functionality for day-2 operations in the future.

We have also recently added the option of deploying Kubernetes directly on all nodes in OSISM. We are open to both Rook and cephadm. We are currently tending towards Rook, as we believe it is the more consistent step.

fkr commented 7 months ago

@flyersa Can you give feedback as well? I think it would be helpful.

Nils98Ar commented 7 months ago

Maybe we will switch to a hyper-converged setup (compute/storage/maybe network) and 25G interfaces in the future.

frosty-geek commented 7 months ago

Which ceph release are you running?

What is the size of your ceph cluster?

Are Ceph workloads sharing the hardware with other workloads (hyperconverged)? If yes, why?

Are you running multiple pools or even multiple clusters? If yes, why?

Which ceph features/daemons are you using and how are they integrated into OpenStack and/or other services?

Wishlist:

Which hardware are you using (either sizing or specs)? CPU/RAM

HDDs/SSDs/NVMEs(/Controllers)

Are you splitting "OSD setup" and "BlueStore WAL+DB"? → 3x controller nodes running Ceph management components (MONs, MGRs, RGWs, ...), hypervisors running only OSDs (see the sketch at the end of this question list)

NICs/speed/latency

Which Ceph config is deployed by OSISM? Do you mind sharing the actual config/yaml? → Default config shipped with the reference implementation

Which Ceph config is deployed "unknown to" or "on top of" OSISM? e.g. special crush maps, special configs

Would it be nice to have more Ceph features deployable via OSISM?

What is your justified opinion on a new deployment method for Ceph (instead of ceph-ansible)? What about Cephadm? Are you maybe already using it in your current cluster (deployed by hand)? What about Rook? Are you maybe already using it on top of a k8s deployed in OpenStack?
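
As background for the WAL+DB question above, a minimal sketch of what such a split looks like at the ceph-volume level; the device paths are placeholders, and cephadm or Rook would express the same thing declaratively via OSD/drive group specs rather than direct calls.

```python
# Sketch only: BlueStore data on a slow capacity device, DB (and WAL) on a
# fast NVMe device. Device paths are placeholders.
import subprocess

DATA_DEV = "/dev/sdb"       # placeholder: HDD / capacity device
DB_DEV = "/dev/nvme0n1p1"   # placeholder: NVMe partition for BlueStore DB+WAL

subprocess.run(
    ["ceph-volume", "lvm", "prepare", "--bluestore",
     "--data", DATA_DEV, "--block.db", DB_DEV],
    check=True,
)
```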

flyersa commented 7 months ago

Which ceph release are you running?

What is the size of your ceph cluster?

Are Ceph workloads sharing the hardware with other workloads (hyperconverged)? If yes, why?

Are you running multiple pools or even multiple clusters? If yes, why?

Which ceph features/daemons are you using and how are they integrated into OpenStack and/or other services?

Which hardware are you using (either sizing or specs)?

Mainly HPE such as Apollo 4200 or similar

Are you splitting "OSD setup" and "BlueStore WAL+DB"

of course

NICs/speed/latency

2x 40G or 4x 10G, depending on the scenario and expected throughput

Are you splitting Dataplane and Controlplane?

No, monitors and MGRs usually go on the storage nodes

Which Ceph config is deployed by OSISM?

None; we never deploy Ceph with OSISM and use cephadm instead. We have had customer faults in the past where user error damaged Ceph clusters, so we focus on a strong separation between storage and OpenStack.

Would it be nice to have more Ceph features deployable via OSISM?

For others, maybe. As I said, for various operational reasons, storage does not belong in the same system I use to manage my compute resources.

What is your justified opinion on a new deployment method for Ceph (instead of ceph-ansible)?

We should use what is used upstream; for Ceph, that tool is now cephadm, so of course we should use it (see the bootstrap sketch at the end of this comment).

What about Rook?

While Rook adds a lot in terms of fault tolerance and so on, it adds complexity too. I am not a huge fan of Rook: in a CSP environment you usually have dedicated servers (if not HCI) for storage, so there is no need to add a k8s cluster on top of it...
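
For reference on the cephadm path mentioned above, a minimal bootstrap sketch; the monitor IP and the hostname are placeholders, and further hosts and OSDs are then managed through the orchestrator.

```python
# Sketch only: bootstrap a new cluster with cephadm and let the orchestrator
# manage daemons afterwards. Monitor IP and hostname are placeholders.
import subprocess

subprocess.run(["cephadm", "bootstrap", "--mon-ip", "192.0.2.10"], check=True)

# Additional hosts and OSDs are then added via the orchestrator, e.g.:
subprocess.run(["ceph", "orch", "host", "add", "storage-node-01"], check=True)
subprocess.run(["ceph", "orch", "apply", "osd", "--all-available-devices"], check=True)
```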

yeoldegrove commented 7 months ago

Our current decision tracking is done here: https://input.scs.community/3aZ-xdnRS-y11lZkrtAvxw

flyersa commented 7 months ago

Btw, another reason to finally get rid of ceph-ansible... Ever done an upgrade? In the time this crap takes just to upgrade a single monitor, I upgrade complete datacenters to a new Ceph version with cephadm...
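
For comparison, the cephadm-managed upgrade flow referred to here is a single orchestrator command that rolls the whole cluster; the version string below is only an example.

```python
# Sketch only: cephadm/orchestrator upgrade of the whole cluster to a target
# release. The version is an example.
import subprocess

subprocess.run(
    ["ceph", "orch", "upgrade", "start", "--ceph-version", "17.2.6"],
    check=True,
)
subprocess.run(["ceph", "orch", "upgrade", "status"], check=True)  # follow progress
```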