vivint-smarthome / ceph-on-mesos

Ceph on Mesos
http://vivint-smarthome.github.io/ceph-on-mesos/
Apache License 2.0

Implement uninstall endpoint #7

Open fersantxez opened 7 years ago

fersantxez commented 7 years ago

The framework should have an uninstall feature which kills all tasks and deallocates their resources.

original post

I couldn't find a way to cleanly uninstall the framework. I've also noticed that when I run this from DC/OS, if I uninstall by just killing the scheduler, ceph-mon and ceph-osd tasks are left orphaned.

Possibly related: when launching the framework on DC/OS, ceph-mon and ceph-osd tasks appear in the Mesos UI as registered with the "ceph" framework. In the DC/OS interface, however, they don't appear inside the "ceph" Service the way workers for other frameworks do.

A clarification on a clean way to uninstall the framework and clean up all associated tasks would be much appreciated.

timcharper commented 7 years ago

Great point, @fernandosanchezmunoz. This needs to be improved and it's not as easy as 1-2-3 right now, but it is possible.

1 - Set instances to 0
2 - Delete the ceph/tasks node in Zookeeper
3 - Restart the framework
4 - Go to the dangling resources tab of the UI and whitelist for removal
5 - Wait up to two minutes for those resources to be re-offered so they can be purged; restarting the framework again will help accelerate it.

fersantxez commented 7 years ago

Thanks, @timcharper! Can you clarify what you mean by "1 - Set instances to 0"? I followed the procedure above, and after doing so the OSDs and monitors were not removed.

I restarted the framework by hitting "restart" on Marathon. Since this kills the scheduler and creates a new one, I'm wondering whether that is the right way to do it.

I then removed the ZK state and restarted Ceph, and they still were not removed. No dangling reservations were detected.

Even after uninstalling the framework, the monitors and OSDs remain.

Anything I'm missing?

timcharper commented 7 years ago

@fernandosanchezmunoz Don't delete all of the zookeeper state, just delete the tasks node (and all children) AFTER you set the scale for all nodes down to 0.
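
For anyone scripting that step, here is a minimal sketch of deleting only the tasks node and its children while leaving the rest of the framework's ZooKeeper state alone. It uses the third-party `kazoo` client. The `/ceph` root path, the helper names `tasks_node_path` and `delete_tasks_node`, and the ZooKeeper address in the usage note are assumptions for illustration, not something the framework defines; check where your installation actually stores its state. Run it only after the scale for all nodes has been set to 0.

```python
def tasks_node_path(framework_root: str = "/ceph") -> str:
    """Build the path of the tasks node under the framework's ZK root.

    Assumption: the framework's state lives under /ceph, so the node
    referred to above as "ceph/tasks" is /ceph/tasks.
    """
    return framework_root.rstrip("/") + "/tasks"


def delete_tasks_node(zk_hosts: str, framework_root: str = "/ceph") -> bool:
    """Recursively delete the tasks node; leave all other state untouched.

    Returns True if the node existed and was deleted, False if it was absent.
    """
    # Imported lazily so the path helper works without kazoo installed.
    from kazoo.client import KazooClient  # third-party: pip install kazoo

    path = tasks_node_path(framework_root)
    zk = KazooClient(hosts=zk_hosts)
    zk.start()
    try:
        if zk.exists(path) is None:
            return False
        # recursive=True removes the node together with all of its children.
        zk.delete(path, recursive=True)
        return True
    finally:
        zk.stop()
        zk.close()
```

A call might look like `delete_tasks_node("master.mesos:2181")`, where the host and port are placeholders for your own ZooKeeper ensemble. Afterwards, restart the framework and continue with the dangling-resources step.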

fersantxez commented 7 years ago

Thanks, @timcharper! That works great.

9600- commented 7 years ago

Building on this. In a misguided attempt at uninstalling the framework, I manually deleted a number of reservation releases through ZK prior to reading the full example documentation.

After doing so, when trying to spin up mons, I only receive the IDs and no containers are launched. It appears there are no actions being processed, though the mon tasks do pop up in ZK.

Can manually deleting the reservations break the framework? Thanks!