Open knz opened 1 year ago
Proposal by @jaylim-crl:
1. In SQL, forbid the creation of database names starting with the structure `XXX-XXX-NN.` (two words and a number followed by a period), i.e. `CREATE DATABASE "my-db-1.test"` would be forbidden, while `CREATE DATABASE "mydb1.test"` would be allowed.
2. In the sqlproxy, if the database name starts with `XXX-XXX-NN.`, consider that to be the tenant name and strip it;

Note: we would need to impose the restriction in point (1) on all clusters (including SH / dedicated), not just serverless, so that customers can migrate existing apps from one product to another without surprise.
Should we choose to go this way, we'd make the reserved prefix configurable (via a cluster setting).
Does postgres allow dots in database names? This seems like it might introduce ambiguity with the notation for database.schema.table. Or is the problem that you could quote things in sql so that would work but there is no such quoting for the connection string?
I would prefer a simpler rule like "no dots" instead of a more complex one that only matches `XXX-XXX-NN` (the latter rule also constrains our ability to evolve the tenant name scheme), if we can get away with it from a backwards-compatibility perspective.
Does the introduction of SNI-based routing give us a path to deprecating and removing this parsing of the database name? IIRC putting the tenant ID in the database name was the last choice, if neither SNI nor the options parameter were supported.
Does postgres allow dots in database names?
Yes. CREATE DATABASE "my.db"
This seems like it might introduce ambiguity with the notation for database.schema.table
No. Double quotes are required in that case. SELECT * FROM "my.db".schema.table
That said, if a db name contains periods, it's possible to specify it in the conn URL without quoting: `postgres://../my.db`. This works with regular pg client/server.
if we can get away with it from a backwards-compatibility perspective.
That's what we started with. The db names can already contain periods, that cat is out of the bag.
Does the introduction of SNI-based routing give us a path to deprecating and removing this parsing of the database name?
For serverless, yes. But generally we've wanted to offer a non-SNI option as well, and the other alternative, via the `options` status param (`postgres://.../..?options=-ccluster=XXX-XXX-NN`), is not supported by all drivers.
IIRC putting the tenant ID in the database name was the last choice, if neither SNI nor the options parameter were supported.
Correct, but we do want a solution for #92580 that works with non-TLS configurations as well, and with drivers that don't support `options`.
I would prefer a simpler rule like "no dots" instead of a more complex one that only matches XXX-XXX-NN (the latter rule also constrains our ability to evolve the tenant name scheme)
My current thinking is that the reserved prefix would be a configurable regular expression. Its default value (used in CC and SH clusters using default) would be something a bit more generic, something like `([a-zA-Z]+-){2,}\d+\.` which also includes `xx-xx-xx-xx-123.`. A SH customer who really wants to use a special prefix in their DB name could then customize this to their liking.
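To make the proposed default concrete, here is a minimal Go sketch of how such a reserved-prefix check could behave. The regular expression is the one quoted above, anchored at the start of the name; the variable and function names are hypothetical, and in the proposal the pattern would come from a cluster setting rather than a constant.

```go
package main

import (
	"fmt"
	"regexp"
)

// reservedPrefix is the default pattern floated in the comment above,
// anchored at the start of the database name. The name is hypothetical.
var reservedPrefix = regexp.MustCompile(`^([a-zA-Z]+-){2,}\d+\.`)

// isReservedDBName reports whether a database name starts with a
// tenant-like prefix and would therefore be rejected at creation time.
func isReservedDBName(name string) bool {
	return reservedPrefix.MatchString(name)
}

func main() {
	names := []string{"my-db-1.test", "mydb1.test", "xx-xx-xx-xx-123.foo", "test123.mydb"}
	for _, name := range names {
		fmt.Printf("%-22q reserved=%v\n", name, isReservedDBName(name))
	}
}
```

Under this pattern `my-db-1.test` and `xx-xx-xx-xx-123.foo` are reserved, while `mydb1.test` and `test123.mydb` remain usable as database names.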
in SQL, forbid the creation of database names starting with the structure XXX-XXX-NN. (two words and a number followed by a period)
This is not quite right, because tenant names are not required to have two words in them. This is only the form of the default cluster name that we generate, but the user can override it with whatever they like.
The pattern that we’d have to prohibit would instead be: “XXX-NN.”
This is the regex that we use for cluster names:
XXX would be the cluster name, and NN will always be a number, i.e. tenant ID.
That regexp is too inclusive. It would forbid the (imho, legitimate) use of `test123.mydb` as a db name.
That's why I insisted on having the hyphen and a minimum of two words (`(\l[\l\d]*-){2,}...`) which, IIUC, includes all current serverless cluster names.
Also we will need to talk about that number at the end if we start considering using tenant names instead of IDs
Oh and so the user can customize their cluster name...
We really have made a mess of this. The complete disregard for PG identifier rules in that decision process is... Unsettling.
Okay, at this point I'm starting to think we should push for a change in the serverless connection URLs to mandate a prefix before the cluster name. For example, the URL for cluster name `foo` would be e.g. `postgres://.../cluster:foo.db.name/`
Then we'd reserve the db namespace that starts with `cluster:xxx\.`.
Then we'd go to our serverless users who use the db name variant to add that prefix in their apps.
After that prefix becomes mandatory (in the db name position) we will become able to recognize composite db names reliably, as it will be clear when the cluster name prefix is not included.
Currently the sqlproxy is implemented so that if SNI information is provided, SNI is only used if the sqlproxy is unable to extract a tenant from the option or the DB.
I think we can sidestep most of the pain by changing the precedence:
Do you mean by "the behavior is a little weird" that IF the fall back ends up using the db name, we will always pick up the cluster name before the period, but there can still be a db name with a period after that?
Yep!
If someone selects a database name that starts with a tenant name and the client does not support SNI it can lead to a confusing error message. The user may see a message saying that the tenant does not exist or an authentication failure if the tenant does exist.
I think I'm ok with that. I'll modify #92619 accordingly.
If SNI is provided and the DNS name matches the cluster's dns zone
Note that these will not match in the case where a tenant cluster has been moved from one host to another. For example:
dim-dog-100.7tc.cockroachlabs.cloud
The `7tc` does not need to match the current host cluster short-id if that tenant has moved. Instead, our check would need to match against regexes more like this:
*.\.[^-]+\.cockroachlabs.cloud
*.\.[^-]+\.gcp-us-east1.cockroachlabs.cloud
In other words, we're looking for an SNI name that contains a host short-id component, because that implies that it also contains a tenant name.
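A hedged Go sketch of that check follows. The two patterns are anchored and escaped readings of the ones listed above; interpreting the leading `*.` as `.*` is an assumption, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// sniPatterns are anchored, escaped readings of the two patterns above.
// Treating the leading "*." as ".*" is an assumption.
var sniPatterns = []*regexp.Regexp{
	regexp.MustCompile(`^.*\.[^-]+\.cockroachlabs\.cloud$`),
	regexp.MustCompile(`^.*\.[^-]+\.gcp-us-east1\.cockroachlabs\.cloud$`),
}

// hasHostShortID reports whether the SNI server name appears to contain a
// host short-id component, which implies it also carries a tenant name.
func hasHostShortID(serverName string) bool {
	for _, re := range sniPatterns {
		if re.MatchString(serverName) {
			return true
		}
	}
	return false
}

func main() {
	for _, name := range []string{
		"dim-dog-100.7tc.cockroachlabs.cloud",
		"dim-dog-100.7tc.gcp-us-east1.cockroachlabs.cloud",
		"free-tier.gcp-us-central1.cockroachlabs.cloud",
	} {
		fmt.Println(name, hasHostShortID(name))
	}
}
```

The free-tier style name is rejected because its middle label contains hyphens, so it cannot be mistaken for a host short-id.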
There are a couple of other possible misunderstandings here that I want to ensure are clear:
While the user can customize their Serverless cluster name, they are not in control of their tenant name. For example, say that they picked the Serverless cluster name of "mycluster". Their tenant name would be "mycluster-100", where 100 is their tenant ID. So we still have the option of recognizing tenant names by using a pattern like "XXX-NN". What I was objecting to above is using the pattern "XXX-YYY-NN", which is not guaranteed.
Once we support moving tenant clusters from host to host, it won't be enough to only have the tenant name (e.g. `dim-dog-100`) in the database name. Instead, we need the full routing ID (e.g. `dim-dog-100.7tc`). As laid out in the RFC, this means that we could see this in the future:
postgresql://andyk@dim-dog-100.7tc.cockroachlabs.cloud/dim-dog-100.7tc.defaultdb
The non-ambiguous way to parse the database name in this case would be to assume that the database name is always the part after the last period (i.e. not that the cluster name is always the part before the first period). Database names with dots would therefore result in a cluster routing error (if SNI and PG option are not present). We could give users a workaround by allowing quotes to be used around the database name (e.g. `dim-dog-100.7tc."my.db"`), or we could change the expected format for the routing ID when used within a database name (there's no backwards-compat problem here b/c we don't yet support routing IDs in database names).
For example, we could say that when a routing ID is used in a DNS name, it's in the form `dim-dog-100.7tc`, but when it's used in a PG option or database name, it's `dim-dog-100-7tc`. For example:
postgresql://andyk@dim-dog-100.7tc.cockroachlabs.cloud/dim-dog-100-7tc.defaultdb
However, I expect users to get confused by a difference like that, so there's a usability tradeoff here. We were hoping to publish cluster routing IDs in the Cloud dashboard, so that customers can conveniently use this and so that Support can easily look up clusters. If the routing ID takes a different form depending on where it's used, it's going to get confusing...
The non-ambiguous way to parse the database name in this case would be to assume that the database name is always the part after the last period
This is not the only way to do this.
Could we instead consider constraining routing IDs to always match a regexp we'd decide in advance? We could then make this an optional part of the cluster name specification.
Can you give more details on what you mean?
Today, the regex for a routing ID would roughly be:
XXX-NN.YYY
Constraining it further would be problematic, as we've already rolled them out, and they're part of important user-visible and aesthetically-important connection strings. Also, remember that we have today's tenant name that we need to continue supporting for backwards-compat, which has this form: `XXX-NN`. The Routing ID just adds an additional part to the tenant name in order to let us transparently move tenants from host to host (see RFC for more details).
What I'm requesting is to agree that any current and future routing ID has a structure we can recognize already today (i.e. we can describe today the structure of all future tenant name + routing combinations)
The routing ID is described in the RFC I referenced (though note that we used the serverless cluster name rather than the constant string "serverless" as described there), with this structure:
<serverless-cluster-name>-<tenant-ID>.<host-cluster-short-id>
The only part under end-user control is the `serverless-cluster-name`, which is further constrained by this regex:
clusterNameRegex = regexp.MustCompile("^[a-z0-9][a-z0-9-]{4,98}[a-z0-9]$")
As far as we know today, this routing ID is sufficient for our present and future needs. Can you give a specific example of what you're proposing we change (while still preserving a high-level of backwards-compat)?
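For illustration, the structure above can be parsed as in the following Go sketch. Only `clusterNameRegex` comes from the thread; the `RoutingID` struct, the `parseRoutingID` function, and the choice to take the tenant ID as the digits after the last hyphen are assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// clusterNameRegex is copied from the comment above.
var clusterNameRegex = regexp.MustCompile("^[a-z0-9][a-z0-9-]{4,98}[a-z0-9]$")

// RoutingID is a hypothetical decomposition of
// <serverless-cluster-name>-<tenant-ID>.<host-cluster-short-id>.
type RoutingID struct {
	ClusterName string
	TenantID    uint64
	HostShortID string
}

// parseRoutingID expects exactly one period. The tenant ID is taken to be
// the digits after the LAST hyphen before the period, because cluster
// names may themselves contain hyphens.
func parseRoutingID(s string) (RoutingID, error) {
	tenantName, host, ok := strings.Cut(s, ".")
	if !ok || host == "" || strings.Contains(host, ".") {
		return RoutingID{}, fmt.Errorf("routing ID %q: expected exactly one period", s)
	}
	i := strings.LastIndex(tenantName, "-")
	if i < 0 {
		return RoutingID{}, fmt.Errorf("routing ID %q: missing tenant ID", s)
	}
	name, idStr := tenantName[:i], tenantName[i+1:]
	if !clusterNameRegex.MatchString(name) {
		return RoutingID{}, fmt.Errorf("routing ID %q: invalid cluster name %q", s, name)
	}
	id, err := strconv.ParseUint(idStr, 10, 64)
	if err != nil {
		return RoutingID{}, fmt.Errorf("routing ID %q: invalid tenant ID %q", s, idStr)
	}
	return RoutingID{ClusterName: name, TenantID: id, HostShortID: host}, nil
}

func main() {
	for _, s := range []string{"dim-dog-100.7tc", "oiled-husky-37.6s4p", "foo-12.abcd"} {
		r, err := parseRoutingID(s)
		fmt.Printf("%q -> %+v, err=%v\n", s, r, err)
	}
}
```

Note that `foo-12.abcd` is rejected here only because `foo` is shorter than the six-character minimum of the quoted regex.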
What I propose is to reserve an open name space for any and all future extensions besides / in addition to the short ID, in a way that guarantees space for a composite database name using periods afterwards.
The problem I see is the following: again with this host ID we're looking at a closed namespace, and if we ever need to extend we will once again infringe on the space available for database names.
Let me replay this for you.
When this work started we had a regexp for cluster name + tenant ID.
This was already a closed namespace—it did not leave any room for extension:
Within that namespace, the only possible way to add extensions would have been to reserve certain tenant IDs as marker for extensions, and then play "under" that, for example by saying "if the tenant ID part starts with a zero, then the string before that should be interpreted as, say, extension name then hyphen then cluster name".
However, in the new RFC you didn't go this way. Instead, you carved another namespace: a period followed by the short ID. By doing so, you chipped once again away at the realm of possible database names containing periods. And the result is still a closed namespace: any short ID that matches the short ID regexp is a valid routing ID. Again there's no room for extensions.
So I predict the next time we will need to do something you will again suggest using a period or something similar.
This is not good namespace design.
The proper way to do this is the following:
- Reserve an "outer" format: "starts with `crdb:`, ends with a period, no period in the middle".
- `crdb:v1` starts the v1 format which includes only the cluster name and tenant ID, separated with a hyphen (`crdb:v1-clustername-tenantID.`)

Then, had we done that, we could then have said that `crdb:v2` starts another format including the routing ID within the same namespace, with another hyphen (either `crdb:v2-routingID-tenantID` or `crdb:v2-tenantID-routingID`).
The customer for the "outer" format is for the other layers of the stack, which are not and need not be aware of our infrastructure. The customer for the "inner" format is our own infrastructure.
As it stands, your standpoint so far has been "everything up to the last period in the database name is reserved as routing information" and that is unacceptable. We need something better.
I would be OK with a format that's a little bit more complex such as "either one or two periods, as long as the routing ID has some structure" but that needs to be the end of that complexity, and so we must ensure that format has space for extensions. It does not seem to currently have space for extensions.
Let's continue this convo under the suggestion from Jeff. @JeffSwenson could you assist here?
- If SNI is provided and the DNS name matches the cluster's dns zone (e.g. something like *.7tc.cockroachlabs.cloud), then the SNI must match a tenant or it is rejected as not found.
- If SNI does not match the zone or is absent, then the option is used.
- Only if SNI and the option are missing, will the sql proxy attempt to parse the name from the DB. At this point the tenant name in the db name is a fall back for a fall back, so I think it is fine if the behavior is a little weird. If someone actually runs into this, they can work around it by creating a DB with no "." and using the new DB name in the connection string.
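The three rules above can be sketched in Go as follows. This is an illustration under stated assumptions, not the sqlproxy implementation: the `conn` struct, the `resolve` function and the `zone` parameter are hypothetical, and the tenant directory lookup that rule 1 requires ("must match a tenant or it is rejected") is left out.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// conn holds the three places a tenant name may appear.
// All field names are hypothetical.
type conn struct {
	SNI           string // TLS server name, "" if absent
	ClusterOption string // value of the --cluster option, "" if absent
	Database      string // database name from the connection string
}

// resolve applies the proposed precedence: SNI inside the zone first, then
// the option, then the database name. zone is the DNS suffix served by the
// proxy, e.g. ".7tc.cockroachlabs.cloud" (an assumed parameter).
func resolve(c conn, zone string) (tenant, db string, err error) {
	// Rule 1: an SNI name inside the zone is authoritative.
	if c.SNI != "" && strings.HasSuffix(c.SNI, zone) {
		return strings.TrimSuffix(c.SNI, zone), c.Database, nil
	}
	// Rule 2: otherwise use the option; the database name is untouched.
	if c.ClusterOption != "" {
		return c.ClusterOption, c.Database, nil
	}
	// Rule 3: fallback of the fallback, the text before the first period.
	if t, rest, ok := strings.Cut(c.Database, "."); ok {
		return t, rest, nil
	}
	return "", "", errors.New("no tenant found in SNI, --cluster option, or database name")
}

func main() {
	c := conn{Database: "foo-12.my.db"}
	tenant, db, err := resolve(c, ".7tc.cockroachlabs.cloud")
	fmt.Println(tenant, db, err)
}
```

With this ordering, a database name containing periods is only reinterpreted when neither SNI nor the option carries routing information.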
Can we spell out the rules and the regular expressions a bit more precisely?
What is the precise lexical structure of the short routing ID? I can't find that in the RFC. Is it always 3 characters long? Is there a leading character or prefix? (there should be one for future extensibility, as per discussion above)
How does the proxy know what "the cluster's DNS zone" is for rule (1)? Is there a parameter? I can't see this in the proxy source code.
If there is no parameter, we could possibly match a prefix of the SNI name with the routing regexp. For example, we could say that the regexp to match routing in SNI is always `\l[\l\d-]+-\d+\.` followed by the regexp for the short routing ID, once we agree on one.
For rule (2) we need to clamp down on what "the option" means. The current logic uses an option named `--cluster`, but the RFC refers to `--routing`. Which one is it? If we consider them both, in which order of precedence should they be considered?
For rule (3) we now have two regular expressions to match at the start of a database name:
- `\l[\l\d-]+-\d+\.`, which is just clustername-tenantID. (I don't like it, but it is what it is. I won't litigate that one further.)
- `serverless-\d+\.<routing ID>\.`, which is the new format suggested by the RFC (and this still needs a precise regular expression for the routing ID, as per my question above). This one is actually pretty good, because it has both a prefix (`serverless-`) and a fixed "envelope" (just two periods).

My understanding is that IF there's no routing information as per rules (1) and (2) THEN we attempt these two regular expressions; if they succeed, we strip that prefix and keep the remainder of the db name (including periods!). If they fail, we keep the entire db name (including periods). That would work.
Do we want to carve an additional reserved namespace, as per the discussion above? I would like to ensure it's precise enough that we don't need to greedily absorb more than 1-2 periods at the beginning of the db space.
How does the proxy know what "the cluster's DNS zone" is for rule (1)? Is there a parameter? I can't see this in the proxy source code.
It doesn't know this today. We would need to provide the proxy with a list of DNS suffixes that enable SNI. The SNI enable list would contain all DNS suffixes that may be routed to the cluster.
Alternatively, we could supply a list of DNS suffixes that disable SNI. We would use the disable list to ignore SNI strings that contain the old "freetier" dns names.
`serverless-\d+\.<routing ID>\.`, which is the new format suggested by the RFC (and this still needs a precise regular expression for the routing ID, as per my question above). This one is actually pretty good, because it has both a prefix (`serverless-`) and a fixed "envelope" (just two periods).
We ended up dropping the Serverless prefix and replaced the prefix with the cluster's name. Example: `oiled-husky-37.6s4p`. The format of this string is `<cluster-name>-<tenant-id>.<host-cluster-id>`.
Do we want to carve an additional reserved namespace, as per the discussion above? I would like to ensure it's precise enough that we don't need to greedily absorb more than 1-2 periods at the beginning of the db space.
If we introduce a namespace now, we will still need to support thousands of production clusters that do not use the new namespace. I see two parts of this we need to look out for:
Looking at the string "foo-bar-12.abcd.bannana". It is unclear if it should be parsed as:
@andy-kimball I'm skeptical anyone knows that they can route using the routing id. All of the documentation I can find refers to the
If we remove the extra '.' in the routing ID, then the cluster and database names are unambiguous. The segment up to the first '.' is the cluster name and the remainder is the database name.
We could adjust the control plane to avoid the creation of non-unique (cluster-name, tenant-id) pairs. Name conflicts are unlikely, and we can avoid them by incrementing the tenant id until a non-conflicting id is found. For the small handful of tenants that exist with ambiguous names, we can avoid migrating them to the same host cluster, or if it looks like the cluster is unused, we could rename the cluster's tenant ID on migration.
If we see the string "foo-12.bar" and tenant "foo-12" does not exist, we don't know which error should be returned to the client. It may be they forgot the tenant and "foo-12.bar" is the database name, or it may be they mistyped the tenant and foo-12 does not exist. I think we can handle this scenario by including the parsed representation in the error message. E.g. return an error like: 'Tenant "foo-12" does not exist. No SNI or cluster option was found. Parsed the tenant from the database parameter "foo-12.bar"'
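As a small illustration of that idea, the Go sketch below builds an error that echoes how the database parameter was parsed. The wording follows the example message above; the function name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// tenantNotFoundError reports the missing tenant together with the parsed
// representation, so the user can tell a mistyped tenant from a database
// name that merely contains a period.
func tenantNotFoundError(database string) error {
	tenant, _, _ := strings.Cut(database, ".")
	return fmt.Errorf(
		"Tenant %q does not exist. No SNI or cluster option was found. "+
			"Parsed the tenant from the database parameter %q",
		tenant, database)
}

func main() {
	fmt.Println(tenantNotFoundError("foo-12.bar"))
}
```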
It looks like parsing `<cluster-name>-<tenant-id>.<host-cluster>.<database>` was never implemented: https://github.com/cockroachdb/cockroach/blob/4b3f5822219a5b0155725ef2ede3ec33f46dfc48/pkg/ccl/sqlproxyccl/proxy_handler.go#L765. So we don't even need to worry about breaking anyone.
It looks like parsing `<cluster-name>-<tenant-id>.<host-cluster>.<database>` was never implemented:
The way I understand it is that the RFC proposal was accepted and the work to modify sqlproxy is currently queued. Are you suggesting the discussion at hand will preempt that work?
So we don't even need to worry about breaking anyone.
Yes, that was never implemented. Today, we can only route by `<cluster-name>-<tenant-id>.<database>`. If we do implement the RFC proposal + tenant migration, we need to make that change in sqlproxy.
If we remove the extra '.' in the routing ID, then the cluster and database names are unambiguous. The segment up to the first '.' is the cluster name and the remainder is the database name.
It's ambiguous just with a single period. Because the tenant name is optional in the database name, there's no way to tell whether `foo-100.bar` is a cluster name followed by database name, or just a database name with a period.
Therefore, we're already in a place where there's some ambiguity, and need to continue supporting existing connection strings. However, as @JeffSwenson said, we haven't implemented parsing of the `<cluster-name>-<tenant-id>.<host-cluster>.<database>` format yet, so we don't have backwards-compatibility concerns there.
I'm very concerned about the aesthetics of the various proposals (I think that's just as important as unambiguity), but as Jeff said, embedding the tenant name in a database name is a "fallback for a fallback", so aesthetics are less of a concern there. I'd be OK with a format like:
crdb:dim-dog-100.5xj.my.db
If the database name starts with `crdb:`, then we'd expect that to be followed by a routing ID. Any parts beyond that would be the database name. We might also consider using `cluster:` as the prefix, to match the `--cluster` option.
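A minimal Go sketch of that prefix rule, assuming the routing ID always has the two-part form `<cluster-name>-<tenant-id>.<host-cluster>` so that the database name starts after the second period. The constant and function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// routingPrefix is the prefix floated above; "cluster:" is the alternative.
const routingPrefix = "crdb:"

// splitDatabase returns the routing ID and the real database name.
// Names without the prefix are returned untouched with an empty routing ID.
// It assumes the routing ID contains exactly one period.
func splitDatabase(name string) (routingID, db string, err error) {
	if !strings.HasPrefix(name, routingPrefix) {
		return "", name, nil
	}
	parts := strings.SplitN(name[len(routingPrefix):], ".", 3)
	if len(parts) < 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return "", "", fmt.Errorf("malformed routing prefix in database name %q", name)
	}
	return parts[0] + "." + parts[1], parts[2], nil
}

func main() {
	for _, name := range []string{"crdb:dim-dog-100.5xj.my.db", "my.db", "crdb:dim-dog-100.5xj"} {
		id, db, err := splitDatabase(name)
		fmt.Printf("%q -> routing=%q db=%q err=%v\n", name, id, db, err)
	}
}
```

Because the prefix is mandatory for routing, a plain `my.db` is never mistaken for a tenant name followed by a database.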
Here are a couple more answers:
The current logic uses an option named --cluster, but the RFC refers to --routing
Today, it's `--cluster`, but I was proposing we accept `--routing` as well. I'm backing off that now, though - I think `--cluster` is fine, and we should just not bother accepting `--routing`.
`serverless-\d+\.<routing ID>\.`, which is the new format suggested by the RFC
The RFC suggested we move to a constant `serverless` string, not in order to avoid ambiguity, but b/c we wanted to support customers renaming their Serverless clusters. However, later I realized that we could still support renaming even if we use the cluster name in the connection string, by temporarily routing the old name and the new name to the same place. It's much friendlier to embed the cluster name in the connection string, and it also gives us a small security boost, since bad actors need to know the cluster name in order to "wake up" a cluster. So, when we actually implemented the RFC, we decided to use the cluster name rather than `serverless`.
As it stands, your standpoint so far has been "everything up to the last period in the database name is reserved as routing information" and that is unacceptable. We need something better.
Note this is exactly how we parse qualified names in SQL - you need to use quotes to disambiguate when you use dots in names, e.g.:
CREATE TABLE myschema."my.table"
In this case, everything after the last period (outside of quotes) is the table name, and not using quotes will cause ambiguities. I think it was perfectly reasonable to use similar rules when we decided to qualify database names in PG connection strings. Using a prefix to disambiguate rather than using SQL qualified name rules is just your personal preference and not the only "acceptable" solution.
this is exactly how we parse qualified names in SQL - you need to use quotes to disambiguate when you use dots in names
"This is exactly how" is factually incorrect. The SQL dialect has the benefit of opt-in quotes to disambiguate. The connection URL does not have that luxury.
I'll reproduce Jeff's and my argument in my next comment.
The connection URL does not have that luxury.
We chose to extend what's acceptable in the connection URL. We have no requirement to be exactly compatible there in every edge case. I think your underlying assumption is incorrect - that we have to exactly replicate the semantics of PG connection strings in our Serverless offering. I disagree that this is one of our requirements.
Here's what Jeff and I discussed.

In CC serverless, regarding the ergonomics of routing:
- `--cluster` option is our main documented fallback.

Regarding cluster names:
- `^[a-z0-9][a-z0-9-]{4,98}[a-z0-9]$`, i.e. they can't contain a period and never start or end with a hyphen, with a min of 6 characters and a max of 100.
- `^[a-z0-9]([a-z0-9-]{0,98}[a-z0-9])?$` (i.e. same overall structure, but minimum of 1 character instead of 6)

Regarding routing to the right host cluster:
- `<cluster_name>-<tenant_id>.<original_host_id>` to a specific tenant cluster within the host. The tenant directory stores the mapping in the k8s api server and the api server state is maintained by the Cockroach Cloud control plane.

In serverless proxy, we would change the routing logic as follows:
First look at provided pg options.
If `--cluster` is present, use that. Expect it to match the structure `<clustername>-<tenantid>(\.<routingID>)?`. Routing ID is optional in that case. Its separator is `.` which can never occur inside a cluster name. Don't look at anything else. The client-provided database name is preserved as-is.
Note: the specific syntax to embed the routing ID in `--cluster` does not need to be finalized at this time. We can just say that `--cluster` has way more extension points than the db name because there is no prior use.
- `<clustername>-<tenantid>(\.<routingID>)?`
- `free-tier.gcp.xxx` (that don't also include a `--cluster` option) because free tier URLs do not contain a number after the `free-tier` prefix.
- `<clustername>-<tenantid>`, use that (this is the free tier compatibility case).

Examples for serverless:
URL | Resulting Configuration | Notes |
---|---|---|
`postgres://10.20.30.40/foo.bar?options=--cluster=sometenant-100` | `tenant=sometenant-100 dbname=foo.bar` | Cluster option only |
`postgres://10.20.30.40/foo.bar?options=--cluster=sometenant-100.xyz` | `tenant=sometenant-100 host=xyz dbname=foo.bar` | Cluster option only |
`postgres://10.20.30.40/blah-100.baz?options=--cluster=sometenant-100` | `tenant=sometenant-100 dbname=blah-100.baz` | Cluster option prevails over db name |
`postgres://free-tier.gcp/blah-100.baz?options=--cluster=sometenant-100` | `tenant=sometenant-100 dbname=blah-100.baz` | Cluster option prevails over db name |
`postgres://free-tier.gcp/blah-100.baz?options=--cluster=sometenant-100.xyz` | `tenant=sometenant-100 host=xyz dbname=blah-100.baz` | Cluster option prevails over db name |
`postgres://mytenant-100.gcp.xxx/blah-100.baz?options=--cluster=othertenant-100` | `tenant=othertenant-100 dbname=blah-100.baz` | Cluster option prevails over SNI |
`postgres://mytenant-100.gcp.xxx/blah-100.baz` | `tenant=mytenant-100 dbname=blah-100.baz` | SNI prevails over db name |
`postgres://mytenant-100.xyz.gcp.xxx/blah-100.baz` | `tenant=mytenant-100 host=xyz dbname=blah-100.baz` | SNI prevails over db name |
`postgres://free-tier.gcp.xxx/mytenant-100.blah-100.baz` | `tenant=mytenant-100 dbname=blah-100.baz` | Unambiguous mandatory routing prefix in db name |
`postgres://free-tier.gcp.xxx/mytenant-100` | `tenant=mytenant-100 dbname=defaultdb` | Unambiguous mandatory routing prefix in db name |
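The serverless examples can be reproduced with a short Go sketch. It is an illustration only: the character classes in `tenantRe`, the `route` struct, and the `zone` parameter (the DNS suffix the proxy serves, `gcp.xxx` in the examples) are assumptions that the thread does not pin down.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tenantRe matches <clustername>-<tenantid>(\.<routingID>)?.
// The character classes are assumptions made for this sketch.
var tenantRe = regexp.MustCompile(`^([a-z0-9][a-z0-9-]*-\d+)(?:\.([a-z0-9]+))?$`)

// route is a hypothetical result type: tenant, optional host routing ID,
// and the database name handed to the SQL server.
type route struct{ Tenant, Host, DB string }

// resolve mirrors the examples: --cluster first, then SNI, then a
// mandatory tenant prefix in the database name.
func resolve(sni, clusterOpt, db, zone string) (route, error) {
	if clusterOpt != "" {
		m := tenantRe.FindStringSubmatch(clusterOpt)
		if m == nil {
			return route{}, fmt.Errorf("invalid --cluster value %q", clusterOpt)
		}
		return route{m[1], m[2], db}, nil
	}
	if strings.HasSuffix(sni, "."+zone) {
		// free-tier names fail here: no number follows "free-tier".
		if m := tenantRe.FindStringSubmatch(strings.TrimSuffix(sni, "."+zone)); m != nil {
			return route{m[1], m[2], db}, nil
		}
	}
	prefix, rest, found := strings.Cut(db, ".")
	if m := tenantRe.FindStringSubmatch(prefix); m != nil {
		if !found {
			rest = "defaultdb"
		}
		return route{m[1], "", rest}, nil
	}
	return route{}, fmt.Errorf("no tenant found for database %q", db)
}

func main() {
	r, err := resolve("free-tier.gcp.xxx", "", "mytenant-100.blah-100.baz", "gcp.xxx")
	fmt.Printf("%+v %v\n", r, err)
}
```

The option and SNI branches never modify the database name, which is what keeps names such as `blah-100.baz` intact in the examples.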
In the dedicated/SH multitenant case, we would implement the routing logic as follows:
- If `--cluster` is present, use that. Expect it to match the structure `<clustername>`. Don't look at anything else. Client-provided dbname is unchanged.
- `server.pre-serve.sni-routing.suffix` cluster setting.
- `<clustername>`. If it matches, use that. Then don't look at anything else. Client-provided dbname is unchanged.
- `server.pre-serve.sni-routing.suffix` is non-empty but does not match), look at (NEW) `server.pre-serve.sni-routing.fallback.enabled`. If disabled, stop here with an error. Otherwise, fall through.
- `cluster:<clustername>/`, strip that prefix and use the cluster name enclosed therein.

To support the last rule above, we would also modify the SQL dialect to reject the creation of any DB with a name that matches `^cluster:[^/]*(/$)`
We find this restriction to be much less likely to be backward-incompatible with existing single-tenant deployments than blocking the entire `<clustername>\.` prefix namespace in db names.
This will also provide us a mechanism for extension, just like the separation using `/` in `--cluster`.
Examples for dedicated/SH:
URL | Resulting Configuration | Notes |
---|---|---|
`postgres://10.20.30.40/foo.bar?options=--cluster=sometenant-100` | `tenant=sometenant-100 dbname=foo.bar` | Cluster option only |
`postgres://10.20.30.40/foo.bar?options=--cluster=sometenant-100.xyz` | connection error (routing ID not supported) | Cluster option only |
`postgres://10.20.30.40/blah-100.baz?options=--cluster=sometenant-100` | `tenant=sometenant-100 dbname=blah-100.baz` | Cluster option prevails over db name |
`postgres://mytenant-100.gcp.xxx/blah-100.baz?options=--cluster=othertenant-100` | `tenant=othertenant-100 dbname=blah-100.baz` | Cluster option prevails over SNI |
`postgres://mytenant-100.gcp.xxx/blah-100.baz` | `tenant=mytenant-100 dbname=blah-100.baz` | SNI prevails over db name (if `server.pre-serve.sni-routing.suffix` is `gcp.xxx`) |
`postgres://mytenant-100.xyz.gcp.xxx/blah-100.baz` | `tenant=mytenant-100 dbname=blah-100.baz` | SNI prevails over db name (if `server.pre-serve.sni-routing.suffix` is `xyz.gcp.xxx`, otherwise error or fallback) |
`postgres://someaddress/cluster:mytenant-100/blah-100.baz` | `tenant=mytenant-100 dbname=blah-100.baz` | Unambiguous presence of routing prefix in db name. |
`postgres://someaddress/cluster:mytenant-100` | `tenant=mytenant-100 dbname=defaultdb` | Unambiguous presence of routing prefix in db name. |
`postgres://someaddress/blah-100.baz` | `tenant=<default> dbname=blah-100.baz` | Unambiguous absence of routing prefix in db name. |
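The last three rows rely on the `cluster:<clustername>/` prefix in the database name. A minimal Go sketch of that fallback, with a hypothetical function name and with `defaultdb` substituted when nothing follows the cluster name, could look like this:

```go
package main

import (
	"fmt"
	"strings"
)

// splitClusterPrefix strips a leading "cluster:<clustername>/" from the
// database name. An empty cluster result means "no prefix, use the default
// tenant"; the database name is then returned unchanged, periods included.
func splitClusterPrefix(db string) (cluster, rest string) {
	const prefix = "cluster:"
	if !strings.HasPrefix(db, prefix) {
		return "", db
	}
	cluster, rest, _ = strings.Cut(db[len(prefix):], "/")
	if rest == "" {
		rest = "defaultdb"
	}
	return cluster, rest
}

func main() {
	for _, db := range []string{"cluster:mytenant-100/blah-100.baz", "cluster:mytenant-100", "blah-100.baz"} {
		c, rest := splitClusterPrefix(db)
		fmt.Printf("%q -> tenant=%q dbname=%q\n", db, c, rest)
	}
}
```

Since `/` separates the prefix from the database name, periods in the remainder never need to be interpreted.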
How to find a host routing ID in the fallback of the fallback in the database name? Remember in that case (we're in the fallback) the routing info must be present so we don't have to support the case where it's not included.
In that case, separate with additional punctuation. This is possible because pg db names can contain extra punctuation. Some candidates:
- `postgres://../cluster:<routingid>:<clustername>-<tenantid>/<dbname>` (compatible with the proposal above)
- `postgres://../<routingid>:<clustername>-<tenantid>.<dbname>` (using just a colon)
- `postgres://../<clustername>-<tenantid>:<routingid>.<dbname>`
Overall, this proposal would be acceptable to me. Can you put it in a Google doc, so that it's easy to make comments on specific details?
PR with finalized RFC #93537
Describe the problem
If I do `CREATE DATABASE "my.db"`, I am unable to connect to the db `my.db` using a SQL client in CC serverless, for example with a URL of the type `postgresql://...@.../mycluster.my.db`
The problem is that the sql proxy decomposes the provided db name using `strings.Split`, and then discards any string after the second period. So if the provided db name is `mycluster.my.db`, the result of the db name parsing is cluster == `mycluster`, dbname == `my`, and the `.db` suffix is stripped.
To Reproduce
see above
Expected behavior
Complex db names with periods in their name should be useable by SQL clients.
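The difference between the reported behavior and the expected one can be shown with a small Go sketch. `parseLossy` is a reconstruction of the behavior described in this report, not the actual sqlproxy code, and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLossy mimics the reported behavior: split on every period and keep
// only the first two pieces, silently dropping the rest.
func parseLossy(name string) (cluster, db string) {
	parts := strings.Split(name, ".")
	if len(parts) < 2 {
		return "", name
	}
	return parts[0], parts[1]
}

// parseKeepingDots splits at the first period only, so the remainder of the
// database name, periods included, is preserved.
func parseKeepingDots(name string) (cluster, db string) {
	parts := strings.SplitN(name, ".", 2)
	if len(parts) < 2 {
		return "", name
	}
	return parts[0], parts[1]
}

func main() {
	c1, d1 := parseLossy("mycluster.my.db")
	c2, d2 := parseKeepingDots("mycluster.my.db")
	fmt.Println(c1, d1)
	fmt.Println(c2, d2)
}
```

Limiting the split with `strings.SplitN(name, ".", 2)` is one simple way to keep `my.db` whole; it does not by itself resolve the ambiguity about whether the first segment is a cluster name.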
cc @jaylim-crl @JeffSwenson for triage
Jira issue: CRDB-21837 Epic: CRDB-22385