scylladb / scylla-manager

The Scylla Manager
https://manager.docs.scylladb.com/stable/

[SCT] Report backup/restore duration metrics to Argus #4027

Closed mikliapko closed 1 month ago

mikliapko commented 2 months ago

Experiment with the suggestion here and, if it works well for Manager, implement metrics reporting to Argus for Manager backup/restore benchmark tests.

mikliapko commented 1 month ago

Example of Argus implementation of graphs: https://argus.scylladb.com/test/1edac104-c90b-4126-9328-ba466981af52/runs?additionalRuns%5B%5D=41e27241-1535-4875-b7c5-ae20e1150dae

mikliapko commented 1 month ago

@karol-kokoszka @Michal-Leszczynski I would like to hear your opinion on the default backup size and restore configuration for this restore benchmark job (a test that would run release by release and check that restore speed has not degraded). I'm planning to use a 1TB ICS backup. What do you think about --batch-size and --parallel? Should we go with their default values or set specific ones?

Michal-Leszczynski commented 1 month ago

1TB seems reasonable for a "quick" test. As for the restore flags, let's use the defaults. The default --parallel already restores as fast as possible, and we plan to make the default --batch-size do the same. New flags (e.g. --transfers) will also be set to max by default.
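
To illustrate the release-by-release check discussed in this thread, here is a minimal Python sketch of how a restore duration could be turned into a throughput figure and compared against a previous release. It is not the SCT or Argus implementation: the `RestoreRun` class, the `is_regression` helper, the 10% tolerance, and the release labels are all hypothetical names and values chosen for illustration, and reporting the result to Argus is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RestoreRun:
    """One benchmark result (hypothetical structure, not an SCT/Argus type)."""
    release: str
    backup_size_bytes: int
    duration_seconds: float

    def __post_init__(self) -> None:
        # A non-positive duration would make the throughput meaningless.
        if self.duration_seconds <= 0:
            raise ValueError("duration_seconds must be positive")

    @property
    def throughput_mib_s(self) -> float:
        """Restore throughput in MiB per second."""
        return self.backup_size_bytes / (1024 ** 2) / self.duration_seconds


def is_regression(baseline: RestoreRun, candidate: RestoreRun,
                  tolerance: float = 0.10) -> bool:
    """Return True if the candidate is slower than the baseline by more
    than `tolerance` (an assumed threshold, expressed as a fraction)."""
    return candidate.throughput_mib_s < baseline.throughput_mib_s * (1 - tolerance)


if __name__ == "__main__":
    one_tib = 1024 ** 4
    # Placeholder release labels and durations, not measured results.
    baseline = RestoreRun("release-A", one_tib, 3600.0)
    candidate = RestoreRun("release-B", one_tib, 4200.0)
    print(f"{baseline.release}: {baseline.throughput_mib_s:.1f} MiB/s")
    print(f"{candidate.release}: {candidate.throughput_mib_s:.1f} MiB/s")
    print("regression:", is_regression(baseline, candidate))
```

Comparing throughput rather than raw duration keeps the check meaningful if the backup size changes between runs. With restore flags left at their defaults, as suggested above, any change in the figure reflects the release under test rather than the test configuration.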