scylladb / scylla-manager

The Scylla Manager
https://manager.docs.scylladb.com/stable/

SCT - siren-tests integration for 1-to-1 restore testing #4116

Open mikliapko opened 1 week ago

mikliapko commented 1 week ago

For 1-to-1 restore testing purposes, we need to check how the existing SCT restore tests can be used to validate the scenario:

mikliapko commented 1 week ago

@karol-kokoszka How important is it for these tests to have the initial cluster created via siren? Does it matter?

If not, we can consider creating the initial cluster via SCT, which, at first glance, may simplify things significantly.

karol-kokoszka commented 1 week ago

> How important is it for these tests to have the initial cluster created via siren? Does it matter?

@mikliapko Yes, it matters. We must be able to run a complete e2e test, starting with Siren.

mikliapko commented 1 week ago

> @mikliapko Yes, it matters. We must be able to run a complete e2e test, starting with Siren.

@karol-kokoszka Yeah, we will definitely need to have such an E2E scenario.

I'm just wondering how significant it is for 1-to-1 restore. If 1-to-1 restore relies fully on the backup manifest, can there be any differences in restore behavior between a cluster created in Cloud and one created in SCT, for example?

karol-kokoszka commented 1 week ago

> I'm just wondering how significant it is for 1-to-1 restore. If 1-to-1 restore relies fully on the backup manifest, can there be any differences in restore behavior between a cluster created in Cloud and one created in SCT, for example?

No, we don't expect differences here.

But how do you want to test the full flow then? A cluster built with SCT won't test the siren part.

I want to avoid the situation where we have the green light for the manager's 1-1 restore and for siren cluster preparation, but the integration between them is missing. For e2e, I would prefer not to "mock" anything (siren, for example) with other tools. Maybe SCT is not an option, or it introduces too much complication; in that case, let's figure out how to use the QA tools designed for cloud so that 1-1 restore is tested for overall time (siren + sm) and correctness of data restoration.

mikliapko commented 1 week ago

> But how do you want to test the full flow then? A cluster built with SCT won't test the siren part.
>
> I want to avoid the situation where we have the green light for the manager's 1-1 restore and for siren cluster preparation, but the integration between them is missing. For e2e, I would prefer not to "mock" anything (siren, for example) with other tools. Maybe SCT is not an option, or it introduces too much complication; in that case, let's figure out how to use the QA tools designed for cloud so that 1-1 restore is tested for overall time (siren + sm) and correctness of data restoration.

I got your point.

The thing is that the number of potential cases we'd like to verify (including negative cases) might be quite large.

AFAIK, siren-tests doesn't provide the same flexibility in cluster creation (number of nodes, single/multi DC, specific Scylla version) as SCT. I'm also concerned about the test duration: I believe it can be much faster in SCT. I might be wrong here, of course. @ilya-rarov could you please help us with that?

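We may think about separating all the tests into several groups:

- E2E tests (the whole test is done via siren)
- some functional tests with cluster created in SCT

These are only ideas for now; I'm just trying to collect more data and details.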

karol-kokoszka commented 1 week ago

> We may think about separating all the tests into several groups:
>
> - E2E tests (the whole test is done via siren)
> - some functional tests with cluster created in SCT
>
> These are only ideas for now; I'm just trying to collect more data and details.

We need to validate the full flow in one shot that includes siren + sm + data verification.

mikliapko commented 1 week ago

Actually, yeah, there is one more option: not use SCT at all, bring the missing parts (like C-S read/write) into siren-tests, and perform all the testing within the siren-tests repository.

karol-kokoszka commented 1 week ago

> Actually, yeah, there is one more option: not use SCT at all, bring the missing parts (like C-S read/write) into siren-tests, and perform all the testing within the siren-tests repository.

SCT has a long history in Scylla. It produces very nice output that is easy to debug, and metrics are persisted. Siren-tests should support this too, so that efficiency can be compared across different runs.