cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

sql: update demo for new multi-region abstractions #62025

Open awoods187 opened 3 years ago

awoods187 commented 3 years ago

We currently have both cockroach demo --geo-partitioned-replicas and --global. Neither exactly makes sense for 21.1 with our new abstractions.

I'm proposing a new --multi-region flag that shows the idealized state using movr, like the previous --geo-partitioned-replicas, and deprecating --geo-partitioned-replicas.

I'm also proposing a clean demo slate with locality set up, for general docs usage or as a base to build demos on. I currently use ./cockroach demo movr --nodes 9, but perhaps without the movr?

Jira issue: CRDB-2689

awoods187 commented 3 years ago

It's a bit late in the cycle to squeeze in changes and the team is hard at work bug fixing so we are proposing the following plan of attack:

What do you think @rmloveland and @jseldess?

awoods187 commented 3 years ago

@jordanlewis any concerns with including --global here? I think it's really nice to include, but we could potentially drop it if there were concerns and use ./cockroach demo --nodes 9 --empty

andreimatei commented 3 years ago

Consider roachprod local in this conversation. At least for development, that's a much more useful platform cause you don't lose all the state with restarts.

jordanlewis commented 3 years ago

--global is unmaintained and fairly brittle and non-extensible. It might work okay for this purpose, but I'm concerned that people playing with it at our behest will run into confusing issues that we won't have the bandwidth to fix. I think the demo pipeline is pretty important to keep clean, because people will likely judge our product based on it after downloading.

If the docs team is okay with "testing" --global's suitability for this project in its current state, I feel reasonably okay about using it for the docs. But it's worth noting that we don't quality-test this feature past a very basic test that ensures that the database starts up.

As for brittle, this is what we do for the latencies. It's hardcoded and not really changeable for now. Note that we wouldn't have flexibility to change the number of nodes to be more than 9 either.

func init() {
    regionToRegionToLatency = make(map[string]map[string]int)
    // Latencies collected from http://cloudping.co on 2019-09-11.
    for pair, latency := range map[regionPair]int{
        {regionA: "us-east1", regionB: "us-west1"}:     66,
        {regionA: "us-east1", regionB: "europe-west1"}: 64,
        {regionA: "us-west1", regionB: "europe-west1"}: 146,
    } {
        insertPair(pair, latency)
        insertPair(regionPair{
            regionA: pair.regionB,
            regionB: pair.regionA,
        }, latency)
    }
}
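
To make the hard-coded nature of this table concrete, here is a minimal self-contained sketch of how such a symmetric latency map could be built and queried. The `regionPair` type, the body of `insertPair`, and the `latencyBetween` helper are assumptions reconstructed for illustration; they are not copied from the CockroachDB source, and the values are presumably milliseconds.

```go
package main

import "fmt"

// regionPair is assumed to look like this; the snippet above only shows its use.
type regionPair struct {
	regionA, regionB string
}

// regionToRegionToLatency maps source region -> destination region -> latency.
var regionToRegionToLatency = make(map[string]map[string]int)

// insertPair is a hypothetical reconstruction: it records a one-way latency,
// creating the inner map for the source region on first use.
func insertPair(pair regionPair, latency int) {
	inner, ok := regionToRegionToLatency[pair.regionA]
	if !ok {
		inner = make(map[string]int)
		regionToRegionToLatency[pair.regionA] = inner
	}
	inner[pair.regionB] = latency
}

func init() {
	for pair, latency := range map[regionPair]int{
		{regionA: "us-east1", regionB: "us-west1"}:     66,
		{regionA: "us-east1", regionB: "europe-west1"}: 64,
		{regionA: "us-west1", regionB: "europe-west1"}: 146,
	} {
		// Insert both directions so lookups are symmetric.
		insertPair(pair, latency)
		insertPair(regionPair{regionA: pair.regionB, regionB: pair.regionA}, latency)
	}
}

// latencyBetween (hypothetical helper) reports the configured latency and
// whether the pair is known. Indexing a missing inner map is safe in Go.
func latencyBetween(from, to string) (int, bool) {
	latency, ok := regionToRegionToLatency[from][to]
	return latency, ok
}

func main() {
	fmt.Println(latencyBetween("europe-west1", "us-west1"))
	// A region outside the three hard-coded ones has no entry at all.
	fmt.Println(latencyBetween("us-east1", "asia-northeast1"))
}
```

Any region outside the three listed has no entry, which is the inflexibility described above: adding a region would mean editing the table by hand.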

awoods187 commented 3 years ago

I think it would be fine for us to use this behind the scenes for Katacoda, as we can control the number of nodes etc. I also think it would be fine to list the known limitations in the docs and demos, noting that this provides a hard-coded configuration. cc @jseldess @rmloveland

Can we add some tests in the stability period now to harden this up? I'm not asking to extend the capabilities, just to make 9 nodes with --global well tested enough for us to use in docs and to back Katacoda, like we do for the spatial tutorial.

rmloveland commented 3 years ago

> If the docs team is okay with "testing" --global's suitability for this project in its current state, I feel reasonably okay about using it for the docs

I'll take a look at it. I'm not feeling super psyched about it due to the old tech writer saying: "you have to build it in your basement". This is pretty much the opposite of that for a multi-region deployment. Feels like we are not giving users the real dog food. (That's also why roachprod is DQ'd: not dog food.)

However, if it can do the following, it could be made to ~work, maybe?

  1. start cluster, latencies are "high"
  2. run some multi-region SQL where we say "make this table GLOBAL, make this table REGIONAL"
  3. cockroach moves some things around
  4. the "high" latencies start to "drop" in the DB Console

Will look at it.
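
As a rough sketch of step 2 in the list above, the multi-region SQL could be generated like this. The `multiRegionStatements` helper is hypothetical, and the database, region, and table names passed to it (`movr`, the three `--global` regions, `promo_codes`, `users`) are assumptions for illustration. "REGIONAL" is rendered here as `REGIONAL BY ROW`, which is only one of the possible regional localities.

```go
package main

import "fmt"

// multiRegionStatements (hypothetical helper) builds the SQL that turns a
// database multi-region and assigns table localities. It only produces
// strings; running them against a cluster is left to the caller.
func multiRegionStatements(db, primary string, extraRegions, globalTables, regionalTables []string) []string {
	stmts := []string{
		fmt.Sprintf(`ALTER DATABASE %s SET PRIMARY REGION "%s"`, db, primary),
	}
	for _, region := range extraRegions {
		stmts = append(stmts, fmt.Sprintf(`ALTER DATABASE %s ADD REGION "%s"`, db, region))
	}
	for _, table := range globalTables {
		stmts = append(stmts, fmt.Sprintf(`ALTER TABLE %s SET LOCALITY GLOBAL`, table))
	}
	for _, table := range regionalTables {
		stmts = append(stmts, fmt.Sprintf(`ALTER TABLE %s SET LOCALITY REGIONAL BY ROW`, table))
	}
	return stmts
}

func main() {
	stmts := multiRegionStatements(
		"movr", "us-east1",
		[]string{"us-west1", "europe-west1"},
		[]string{"promo_codes"}, // read-mostly reference data
		[]string{"users"},       // rows homed in the user's region
	)
	for _, s := range stmts {
		fmt.Println(s + ";")
	}
}
```

The printed statements could be pasted into the SQL shell of a running demo cluster, after which steps 3 and 4 would be observed in the DB Console.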

jseldess commented 3 years ago

Thanks for investigating, Rich. Overall, I agree with your point in the linked ticket that setup/provisioning shouldn't be the focus of the tutorial, so if we can get this to work via demo or docker, that'd be awesome.

rmloveland commented 3 years ago

Did some testing and wrote up my findings so far in a wiki page: https://cockroachlabs.atlassian.net/wiki/x/BQDpXg

tl;dr: I think it works? but I may be tricking myself, and it's not super easy to set up (still easier than doing everything via docker tho IMO)

awoods187 commented 3 years ago

@ajstorm have our concerns with demo been sufficiently addressed to check some parts of this issue off?

ajstorm commented 3 years ago

@awoods187 I checked off the one item we got to in 21.1 above. We need your movr changes before we can make progress on the rest of the items.

awoods187 commented 3 years ago

Is the --global work all complete now? I somehow lost the tracking issue for that.

ajstorm commented 3 years ago

I don't think it's all complete just yet. @otan can you comment on what's left?

otan commented 3 years ago

--global is slightly less experimental than before, and is ready to be consumed by the user with many caveats inbuilt.

github-actions[bot] commented 1 year ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!