Closed jseldess closed 2 years ago
@robert-s-lee, how important do you think this is at our phase?
This comes up often in conversation with app developers as things not to do.
See https://github.com/cockroachdb/docs/issues/380 for answers from 2016.
If a single row takes up all of a range, however, we won't split the range but rather let it get larger than the max limit.
Note that this is no longer true. We now block writes to the offending range when it gets too large. It's still important to stay out of this situation, but it will no longer destabilize the rest of the cluster.
Another issue asking for physical maximum limits for column names, table names, and row width: https://github.com/cockroachdb/docs/issues/7280
And a comment from @ajwerner there:
With regard to names, in light of cockroachdb/cockroach#48443, we should document that while there is currently no limit, 63 characters is a good guideline and may be enforced in future versions. Longer names will not fundamentally cause problems, especially if the names remain shorter than kilobytes.
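To make the guideline concrete, here is a minimal Python sketch that flags schema object names longer than 63 characters. The function name, the constant, and the sample names are hypothetical; the check counts characters, as the comment above does, and it reflects a guideline rather than any limit CockroachDB enforces.

```python
# Guideline from the comment above; not an enforced limit (assumption: count characters).
RECOMMENDED_MAX_NAME_LENGTH = 63


def overlong_names(names, max_length=RECOMMENDED_MAX_NAME_LENGTH):
    """Return the names whose character count exceeds max_length."""
    return [name for name in names if len(name) > max_length]


# Hypothetical table and column names to check before creating a schema.
candidates = ["users", "order_items", "x" * 64]
for name in overlong_names(candidates):
    print(f"{len(name)} characters, over the guideline: {name}")
```

A check like this could run in a schema review step so that long names are caught before a future version starts enforcing a limit.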
Another suggestion that we need this, from @a-entin:
Question re "System Limits" doc page. I don't think we have one (pls point if missed) and wonder what are the thoughts on that, maybe there is already a plan for it?
The idea is to have one place to check all kinds of limits a user can bump into. Most databases have that; it's super helpful. Three different database examples: https://docs.oracle.com/cd/B28359_01/server.111/b28320/limits.htm https://docs.oracle.com/cd/E18283_01/timesten.112/e17114/limit.htm https://docs.memsql.com/v7.1/reference/configuration-reference/system-limits/
Also, there's this very old PR started by @knz that could help.
@johnrk, @taroface, I think it's time for us to prioritize this. Although there are still no actual, hard limits, we should be able to provide guideposts based on our own internal testing and what we know from customer usage. Perhaps telemetry would help here. I think part of the work is defining what dimension to even document limits for. I imagine @a-entin can help.
More from @a-entin:
I would change "production limits" to more directly "product" or "system" limits... production alludes to capacity planning aspect and we need product characteristics documented as first order, imho. I'm pretty sure we have some hard limits, but even if "no limit", it's really helpful to say this explicitly on essential parameters where limits are expected by users. It will eliminate a lot of support chatter. We can build the list of parameters to list. The 3 examples are pretty good / common sense and you'll see a lot of correlation
@chudro let me know what topics we should add here! Thanks.
@taroface The SEs are in the midst of capturing the topics that should be listed in a product limits page. Here is a link to a Google spreadsheet providing that list as it is being captured. There is a separate tab providing a list of example DB websites and how they provide similar lists: https://docs.google.com/spreadsheets/d/1JWqggZ1tZ_wDI_lrvWdEBtge4hQFvXEZYczQtMhkRjI/edit?usp=sharing
@taroface this Cockroach Cloud user requires this type of information: https://cockroachdb.zendesk.com/agent/tickets/6546
I am looking for answers to the questions below:
1) What is the maximum number of columns supported by a CRDB table?
2) What is the maximum size of an individual value that can be stored in a CRDB cell?
3) What is the maximum size of a table?
4) What is the maximum number of tables in one database?
5) What is the maximum number of databases per CRDB cluster?
Your answer will help me to architect my applications.
Hi @florence-crl , any update on above questions?
Also to cover:
- Number of tables, columns, indexes, user-defined schemas, and databases we support
- Row size limits (especially JSON sizes), range size limits
- Cluster storage limits
- Number of nodes in cluster
- Number of connections supported per host
Verbiage from @jhatcher9999: https://cockroachlabs.slack.com/archives/CHVV403F0/p1610553560090500?thread_ts=1610550065.087100&cid=CHVV403F0
I believe the 64MB range limit is changing for 21.1 (or has changed already) - needs to be verified.
The default range size limit changed from 64MB to 512MB in 20.1.
@mikeCRL, thanks for taking this on! This is an old issue, so it's helpful to articulate next steps:
Update: I've worked in that limits sheet, adding a priority column, some analyses of competitor docs, proposals for docs, and questions. SEs have jumped in with some responses. I have also mentioned this work in the context of a CC thread on gathering large cluster data.
@a-entin can I ask you, too, to please give the sheet and specifically the Questions tab a review? If you have any leads on any of the highlighted high-priority data rows, or anything else to contribute, I'd appreciate it.
Notes from my visit to @bdarnell's Chief Architect Office Hours.
@a-entin has one or two recorded presentations where he talks about sizing, how topology matters, and how vCPUs matter.
I made a few comments in the spreadsheet and also added a tab, "JDBC Metadata Dump". JDBC allows the client to query database defaults and limits. I think we should at least be aware of, and perhaps stay true to, what we report about ourselves. For example, we report:
getMaxConnections | 8192
getMaxColumnsInTable | 1600
getMaxRowSize | 1073741824
We are also supposed to answer these (currently no answer, i.e., JDBC support is incomplete): getMaxSchemaNameLength, getMaxTableNameLength, getMaxCursorNameLength, getMaxCharLiteralLength, getMaxColumnsInGroupBy, getMaxUserNameLength, getMaxProcedureNameLength, getMaxBinaryLiteralLength, getMaxColumnsInIndex, getMaxCatalogNameLength, getMaxColumnNameLength
I also noticed we may not be entirely honest about getDefaultTransactionIsolation | TRANSACTION_READ_COMMITTED, but that is a different topic.
It's become apparent that there is interest in documenting two separate types of things: hard limits (a well-defined list of system-specified limits, including clarity on what we do not limit) and best practices/considerations related to size and scale.
I clarified the split in our spreadsheet - which should remain SSOT for the foreseeable future - and am splitting this work into multiple issues:
| | CockroachDB | CockroachCloud |
|---|---|---|
| Hard limits | #1830 | #10032 |
| Size/scale BPs | #10031 | #10037 |
I have raised an initial draft PR for CRDB https://github.com/cockroachdb/docs/pull/10035. In this case, it touches on both of the above - limits and BPs - because there was a lot of info relevant to both gathered in this issue and the sheet. I found that there was a reasonable place for a lot of this existing info on the schema design page, given the focus on various DB object types.
Future efforts to gather information can occur via the more specific issues.
We have closed this issue because it is more than 3 years old. If this issue is still relevant, please add a comment and reopen the issue. Thank you for your contribution to CockroachDB docs!
Jesse Seldess (jseldess) commented:
We need a recommendation against letting a single row get close to 64MB (the default size at which we split a range). In addition to the current row data, all historical versions of the row that have not been garbage collected count toward the overall size.
Normally, a range contains many rows, in which case when the range gets to the limit, we split into 2 ranges. If a single row takes up all of a range, however, we won't split the range but rather let it get larger than the max limit.
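As a rough illustration of why un-garbage-collected versions matter, the following Python sketch estimates how much range space a single row occupies. It is only a back-of-the-envelope model: the function names, the assumption that every retained historical version is a full copy of the row, and the example numbers are all hypothetical, and the 64 MiB default is the split size mentioned above.

```python
MIB = 1024 * 1024


def estimated_row_footprint(row_bytes: int, retained_versions: int) -> int:
    """Estimate bytes held by one row: current version plus un-GC'd history.

    Assumption: each retained historical version is a full copy of the row.
    """
    if row_bytes < 0 or retained_versions < 0:
        raise ValueError("sizes and counts must be non-negative")
    return row_bytes * (1 + retained_versions)


def exceeds_range_limit(row_bytes: int, retained_versions: int,
                        range_max_bytes: int = 64 * MIB) -> bool:
    """Return True if the estimated footprint exceeds the range size limit."""
    return estimated_row_footprint(row_bytes, retained_versions) > range_max_bytes


# A hypothetical 2 MiB row updated 40 times before garbage collection runs:
print(estimated_row_footprint(2 * MIB, 40) // MIB)  # 82 (MiB)
print(exceeds_range_limit(2 * MIB, 40))             # True
```

Under this model, even a row well below the split size can outgrow a range on its own if it is updated frequently, which is why the recommendation covers historical versions as well as current data.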
Jira Issue: DOC-125