Open tabbi opened 2 years ago
I spent some time looking into this. I managed to get the listing of backups in our GCS bucket down from ~1m30s to ~8s.
The commit/branch is here: https://github.com/thelastpickle/cassandra-medusa/commit/e451049a2596f5ef7fbee37ee0d67a9cafea3593
The solution, in short, is to make Medusa use asyncio (`async def`) throughout the code, and not just in `abstract_storage` and its children. Then we can neatly list all the blobs (which takes about 5s) and then scatter/gather the cluster backup statuses (each of which reads a tokenmap blob to work out the node count).
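To illustrate the list-then-scatter/gather idea, here is a minimal sketch. It is not the code from the linked commit: `FAKE_BUCKET`, `list_blobs`, `read_blob`, `backup_status` and the `node/backup/meta/tokenmap.json` path layout are all hypothetical stand-ins, and the storage calls are simulated in memory rather than going to GCS.

```python
import asyncio
import json
from collections import defaultdict

# Hypothetical in-memory stand-in for a bucket: blob path -> blob contents.
# The path layout (node/backup/meta/tokenmap.json) is an assumption.
FAKE_BUCKET = {
    "node1/backup-a/meta/tokenmap.json": '{"node1": {}, "node2": {}}',
    "node2/backup-a/meta/tokenmap.json": '{"node1": {}, "node2": {}}',
    "node1/backup-b/meta/tokenmap.json": '{"node1": {}, "node2": {}}',
}


async def list_blobs():
    """Simulate one listing call that returns every blob path."""
    await asyncio.sleep(0)
    return list(FAKE_BUCKET)


async def read_blob(path):
    """Simulate a network read of a single blob."""
    await asyncio.sleep(0.01)
    return FAKE_BUCKET[path]


async def backup_status(name, tokenmap_paths):
    """Read one tokenmap to learn how many nodes the backup should have."""
    tokenmap = json.loads(await read_blob(tokenmap_paths[0]))
    expected_nodes = len(tokenmap)
    found_nodes = len(tokenmap_paths)
    return name, found_nodes, expected_nodes


async def list_backups():
    # Step 1: list all blobs once.
    blobs = await list_blobs()

    # Step 2: group the tokenmap blobs by backup name.
    by_backup = defaultdict(list)
    for path in blobs:
        if path.endswith("tokenmap.json"):
            _node, backup_name, *_rest = path.split("/")
            by_backup[backup_name].append(path)

    # Step 3: scatter one status coroutine per backup, gather the results.
    statuses = await asyncio.gather(
        *(backup_status(name, paths) for name, paths in by_backup.items())
    )
    return sorted(statuses)


if __name__ == "__main__":
    for name, found, expected in asyncio.run(list_backups()):
        print(f"{name}: {found}/{expected} nodes")
```

The tokenmap reads overlap instead of running one after another, so the total time is roughly one listing plus the slowest single read, rather than the sum of all reads.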
It took a lot of changes to do just this, and I currently don't have the room to do the same for other Medusa commands. Unless someone else picks this up, it'll have to wait for a refactoring week.
Hello! This issue is still open, right? Updating to the latest version, 0.22.2, didn't solve it :( The medusa list-backups command takes more than 1 hour to complete for us.
Hi, yes, this is still an issue 🥲
Hello! I have quite a big folder of Cassandra backups, using 1.9TB of disk space, with one year of backups stored in it. medusa list-backups takes about 10-20 minutes to show the list of backups. Any ideas how to fix that?
┆Issue is synchronized with this Jira Story by Unito ┆Issue Number: MED-40