MaterializeInc / materialize

The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
https://materialize.com
Other
5.72k stars 466 forks source link

Sql: ambiguity with the keyword "default" in `SET CLUSTER TO` #22351

Closed jkosh44 closed 8 months ago

jkosh44 commented 11 months ago

Feature request

The syntax SET var TO DEFAULT will reset var to it's default value. This can cause confusion if the user meant to set the value to the literal "default". This is especially confusing for SET cluster TO default because "default" is the name of the default cluster created in all environments. It would be helpful if we added a notice anytime a user typed SET var TO default and there exists an object of type var with a literal value of"default".

Additionally we should double check that we don't have SET cluster TO default anywhere in our docs and tests. If so, we should change it to SET cluster TO "default".

jkosh44 commented 11 months ago

Additionally, we may want to consider renaming the default cluster.

ParkMyCar commented 11 months ago

Now with default role variables, I think renaming the default cluster would be a good idea. Some alternative names I can think of:

jkosh44 commented 11 months ago

Another potential name: "materialize".

benesch commented 11 months ago

Spitballing here, but what if:

jkosh44 commented 11 months ago

We rename the default cluster to quickstart. I think this makes its purpose much clearer! It's there for folks to experiment with, but it's not an integral part of the system and can be dropped at any time. ... To backwards compatibly roll this out, we do a one-time migration to set the default cluster for any existing roles that don't already have a default cluster to default. That will prevent a service interruption for existing users and applications that rely on the default cluster being the default.

I like these suggestions.

We remove the default default for the cluster variable. New accounts don't have a default cluster. You have to explicitly configure the cluster when you connect, or use ALTER ROLE to set a default for your role only.

I'm a bit on the fence about this. You have to change the default values for roles one role at a time which can be pretty tedious. For people who just want to get started right away without thinking about clusters and variables it might be nice to have some default.

benesch commented 11 months ago

We remove the default default for the cluster variable. New accounts don't have a default cluster. You have to explicitly configure the cluster when you connect, or use ALTER ROLE to set a default for your role only.

I'm a bit on the fence about this. You have to change the default values for roles one role at a time which can be pretty tedious. For people who just want to get started right away without thinking about clusters and variables it might be nice to have some default.

Ah, fair point. I added two additional suggestions above to make the removal of the default less painful:

jkosh44 commented 11 months ago

I> > > We remove the default default for the cluster variable. New accounts don't have a default cluster. You have to explicitly configure the cluster when you connect, or use ALTER ROLE to set a default for your role only.

I'm a bit on the fence about this. You have to change the default values for roles one role at a time which can be pretty tedious. For people who just want to get started right away without thinking about clusters and variables it might be nice to have some default.

Ah, fair point. I added two additional suggestions above to make the removal of the default less painful:

* The SQL shell in the console will always pick an extant cluster (assuming you have at least one cluster) so that new users aren't left clusterless. That means the quickstart for the first user in the shell will be very smooth. (Also the tutorial could dynamically detect whether you have a cluster named `quickstart` and, if not, prompt you to make one!) IMO it's okay to do this kind of special logic in the shell, but not with other tools, because in the shell we have full control over the UI and display the cluster you're connected to front and center. Other tools have no special knowledge of clusters and they get too hidden away if we don't force users to explicitly set a cluster when using those tools.

* We consider allowing administrators to run `ALTER SYSTEM cluster = ...` to set a global default for the `cluster` variable if users don't have a role-specific override.

I think that's a good plan forward, and I would be OK removing the default value for clusters in that case.

ParkMyCar commented 11 months ago

We remove the default default for the cluster variable. New accounts don't have a default cluster. You have to explicitly configure the cluster when you connect, or use ALTER ROLE to set a default for your role only.

  • The SQL shell in the console will always pick an extant cluster (assuming you have at least one cluster)
  • We consider allowing administrators to run ALTER SYSTEM cluster = ... to set a global default for the cluster variable

I like this overall solution, but I just want to call out that removing a default value for cluster does increase the amount of friction for users when running their first queries in Materialize. Maybe this is okay though? It might force folks who to learn a bit about clusters up front which is a good thing?

An alternative would be to just rename the "default" cluster to "quickstart"?

For non-Console SQL Shell cases, we could also tie in https://github.com/MaterializeInc/materialize/issues/22350 to indicate that no cluster is selected?

benesch commented 11 months ago

For non-Console SQL Shell cases, we could also tie in https://github.com/MaterializeInc/materialize/issues/22350 to indicate that no cluster is selected?

Yeah, I think that'd be great.

benesch commented 11 months ago

An alternative would be to just rename the "default" cluster to "quickstart"?

I guess it just strikes me as really weird for the default cluster to be quickstart. That's just not going to be what you want in a serious production deployment. But! It just occurred to me: suppose we make the system wide default configurable with ALTER SYSTEM SET cluster .... Then the default default could be quickstart, but admins could change it as they like. We could even support ALTER SYSTEM SET cluster = NULL to unset the default. So each environment could choose to opt in to the "no default cluster" behavior, if that's what they want for their users, or they could choose a system wide default that makes sense for their usage of Materialize.

morsapaes commented 10 months ago

With cluster unification landing soon, removing the default cluster would be a step back in usability. At least until we have a smoother (self-)onboarding flow, I'd advocate for keeping it. I agree that renaming the cluster to quickstart is weird, though! We could use dev (or development), something more neutral like base, or some dynamically generated name based on e.g. the domain of the role creating the account.

benesch commented 10 months ago

With cluster unification landing soon, removing the default cluster would be a step back in usability.

When you say "removing the default cluster", do you mean 1) not having a cluster that is auto-created in fresh environments, or 2) not having a default for the cluster variable?

I agree that renaming the cluster to quickstart is weird, though!

The more I've thought about this, the more it's grown on me. It's pretty dang clear that you're not meant to do real work on a cluster named quickstart, but it can still invisibly power quickstarts and tutorials where we don't want to have to introduce you to the concept of a cluster.

Since it's a bit hard to follow the discussion above over multiple comments, here's what I'm currently advocating for:

morsapaes commented 9 months ago

When you say "removing the default cluster", do you mean 1) not having a cluster that is auto-created in fresh environments, or 2) not having a default for the cluster variable?

Both. I don't love quickstart since you might legitimately start building things out on this cluster (a few users have), but am also not strongly opposed.

benesch commented 9 months ago

I don't love quickstart since you might legitimately start building things out on this cluster (a few users have), but am also not strongly opposed.

Isn't that ... kinda great though? If you legitimately start immediately building things on the quickstart cluster you've used it exactly as its name suggests you should. And you can always ALTER CLUSTER ... RENAME TO ... later.

I was initially weirded out by it, but really the more I think about it the more I like it. Names like base or init are more neutral, but I worry that also gives off the impression that they are load bearing. Like you might be afraid to DROP CLUSTER init or DROP CLUSTER base because that seems like something that might break Materialize. Whereas DROP CLUSTER quickstart is more obviously an okay thing to do.

ParkMyCar commented 9 months ago

Following up here, I chatted in-person with @morsapaes and @ggnall, both are onboard for renaming the "default" cluster to "quickstart", so we're good to proceed with that work!

chaas commented 9 months ago

Proposal: only change the name for new environments, do not apply this retroactive to existing environments, since a) it's up to the user what to do with that cluster once we give it to them b) there may already be user/production dependencies on customers' default cluster

I've reviewed the docs,console & dbt adapter looking for references to or dependencies on the default cluster by name and we need to update the following user docs:

Things that shouldn't need changing:

I suggest announcing the change to all of eng in slack to see if there's anything else we're missing

benesch commented 9 months ago

Proposal: only change the name for new environments, do not apply this retroactive to existing environments, since a) it's up to the user what to do with that cluster once we give it to them b) there may already be user/production dependencies on customers' default cluster

Strong agree with this proposal!