pingcap / docs

TiDB database documentation. TiDB is an open-source, cloud-native, distributed, MySQL-compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at https://www.pingcap.com/tidb-serverless/
https://docs.pingcap.com

Expand TiDB glossary to include frequently used abbreviations and explanations #19212

Open qiancai opened 3 hours ago

qiancai commented 3 hours ago

Currently, the TiDB Glossary doc lacks explanations for some frequently used abbreviations such as PD, DM, and BR.

To enhance clarity and accessibility for users, we propose expanding the TiDB glossary to include commonly used abbreviations alongside existing product-specific terms. This update will help users better understand technical documents and reduce potential confusion, especially for new users.

Plan:

  1. Select key abbreviations that need definitions based on the list of frequently used acronyms.
  2. Provide clear explanations for the selected abbreviations. Each abbreviation will be accompanied by a concise explanation of its meaning and its relevance to TiDB.

qiancai commented 3 hours ago

I’ve created a script that lists the 100 most frequently used abbreviations in the docs; the results are in the following table. I suggest we start by selecting some key abbreviations from this table.

Note:

  • Only a few of the abbreviations in the table are already covered in the current Glossary doc: TSO, MPP, VPC, and MVCC.
  • Certain abbreviations, like SQL, CPU, and ID, are widely understood and don’t require explanations.

Acronym Number of occurrences
SQL 3730
PD 2506
DDL 1974
DM 1908
CPU 925
ID 880
API 756
BR 724
AWS 655
GC 626
JSON 532
DML 519
TLS 464
GLOBAL 462
IP 449
TSO 431
CSV 353
KV 352
MPP 351
SESSION 349
QPS 340
VPC 301
OOM 277
URI 276
TPC 263
TTL 241
PITR 229
GA 222
SHOW 221
CLI 220
CA 216
SST 209
HTTP 204
GCS 187
HTAP 170
CREATE 163
JDBC 156
URL 150
GB 147
NOT 135
OLTP 130
RPC 128
TPS 127
RU 122
TABLE 120
DR 120
MB 116
NULL 111
SSO 108
IAM 106
AI 105
DXF 105
DMS 104
DMR 100
CTE 98
KMS 98
ARN 98
AZ 98
DROP 96
ORM 95
MVCC 90
TS 87
UI 87
SSL 82
IO 80
LDAP 79
GMHDBJD 79
OLAP 76
SSH 75
OPS 75
GTID 74
INDEX 74
FAQ 69
ALTER 69
CF 69
CIDR 65
RDS 65
CDC 62
SET 59
SSD 57
COLUMNS 56
GROUP 55
DB 54
WAL 53
DNS 50
MSP 50
UPDATE 49
MQ 49
TCP 48
NUMA 47
LTS 46
UTF 43
UUID 41
PROXY 41
DECIMAL 41
RESOURCE 40
OIDC 40
PLACEMENT 40
ANALYZE 39
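
The script itself is not included in this thread. As a rough sketch of how such a count might be produced, the following assumes that an "abbreviation" is any standalone run of two or more uppercase ASCII letters and that the docs are Markdown files under a local directory. The function names and the directory argument are hypothetical, not taken from the actual script.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: an abbreviation is a standalone run of two or more uppercase
# ASCII letters. This is a guess at the rule, not the script's real pattern.
UPPERCASE_ACRONYM = re.compile(r"\b[A-Z]{2,}\b")


def count_acronyms(text: str) -> Counter:
    """Count all-uppercase tokens in a single string."""
    return Counter(UPPERCASE_ACRONYM.findall(text))


def count_acronyms_in_docs(root: str, top_n: int = 100) -> list[tuple[str, int]]:
    """Sum the counts over every .md file under root and return the top_n."""
    totals: Counter = Counter()
    for path in Path(root).rglob("*.md"):
        totals.update(count_acronyms(path.read_text(encoding="utf-8", errors="ignore")))
    return totals.most_common(top_n)
```

A pattern like this would also pick up SQL keywords written in uppercase (for example GLOBAL or SHOW), and it would skip mixed-case terms such as TiKV or gRPC, because the word-boundary anchors require the whole token to be uppercase.
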

qiancai commented 3 hours ago

Here, I’ve selected 20 common abbreviations from the previous table as candidates for the Glossary document. @lilin90 and @dveeden PTAL, thanks.

dveeden commented 2 hours ago

I assume the list is based on things that are all uppercase? Maybe we should add things like PoC, QoS and IdP as well?

qiancai commented 2 hours ago

> I assume the list is based on things that are all uppercase? Maybe we should add things like PoC, QoS and IdP as well?

@dveeden Good catch. Thank you! Here they are. Let's pick some candidates from the following table too.

Acronym Number of occurrences
TiDB 20520
TiKV 4565
MySQL 2573
TiCDC 2168
TiUP 1197
RocksDB 420
PingCAP 228
gRPC 184
IDs 180
vCPU 169
MariaDB 140
FAQs 140
HAProxy 139
ProxySQL 103
MDSvgIcon 103
macOS 92
CentOS 89
OpenAPI 87
RawKV 55
SQLAlchemy 40
DTFile 39
PyMySQL 38
VCore 38
RCUs 33
DBeaver 32
HunDunDM 31
OAuth 29
URLs 28
SQLTools 25
WebUI 24
TypeORM 24
DMLs 23
InnoDB 21
SSDs 19
KVs 18
NVMe 18
RowID 17
OpenAI 17
DBAs 17
CPUs 16
BenchmarkSQL 16
ResolvedTS 16
benCHmark 14
NewSQL 14
ksqlDB 14
HikariCP 14
CheckpointTS 14
mTLS 12
URIs 12
OpenSSL 11
CFs 11
MBps 11
MinIO 11
AmazonRDS 10
DBaaS 10
BackupTS 10
AutoID 10
TiKVs 9
gPRC 9
vCPUs 9
GPTs 9
StartTLS 8
HundunDM 8
DTFiles 7
DTTool 7
GTIDs 7
SSTs 7
RESTful 7
DMFile 6
PostgreSQL 6
ORMs 6
VPCs 6
MyCLI 6
SELinux 6
AskTUG 5
CMSketch 5
KvDB 5
RaftDB 5
PRs 5
PDs 5
OpenJDK 5
MQTh 5
TiEM 5
JSONPath 5
PIDs 5
KEYs 4
CAs 4
NICs 4
PCIe 4
NoSQL 4
kvDB 4
PromQL 4
HAproxy 4
RockDB 2
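
How the mixed-case list was generated is not stated in this thread. One possible approach, shown here only as a sketch, is to keep every alphabetic token that contains a lowercase letter and also has an uppercase letter somewhere after its first character. That rule keeps terms like PoC, TiDB, and gRPC while dropping ordinary capitalized words and all-uppercase acronyms. The helper names are hypothetical.

```python
import re
from collections import Counter

# Alphabetic tokens only; tokens that touch digits or underscores are skipped.
WORD = re.compile(r"\b[A-Za-z]+\b")


def is_mixed_case_term(token: str) -> bool:
    """True if the token has a lowercase letter and an uppercase letter
    after its first character (e.g. PoC, TiDB, gRPC, IDs)."""
    has_lower = any(c.islower() for c in token)
    has_inner_upper = any(c.isupper() for c in token[1:])
    return has_lower and has_inner_upper


def count_mixed_case_terms(text: str) -> Counter:
    """Count mixed-case terms in a single string."""
    return Counter(t for t in WORD.findall(text) if is_mixed_case_term(t))
```

Because this rule is purely lexical, it also counts plurals (IDs, URLs), component names, and misspellings as separate entries, so the output still needs manual review before anything goes into the glossary.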