External stats should be saved as gzip-compressed JSON files containing
```json
{
    "version": "1",
    "stats": [ /* ... []*handle.JSONTable ... */ ]
}
```
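For illustration, a minimal Go sketch of writing one of these files; the `statsFileHeader` type and `writeStatsFile` helper are hypothetical names, not existing BR code:

```go
package statsbackup

import (
	"compress/gzip"
	"encoding/json"
	"os"

	"github.com/pingcap/tidb/statistics/handle"
)

// statsFileHeader mirrors the proposed file layout: a version string plus
// an array of per-table stats dumps.
type statsFileHeader struct {
	Version string              `json:"version"`
	Stats   []*handle.JSONTable `json:"stats"`
}

// writeStatsFile gzip-compresses the JSON-encoded stats and writes them to path.
func writeStatsFile(path string, tables []*handle.JSONTable) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	zw := gzip.NewWriter(f)
	if err := json.NewEncoder(zw).Encode(statsFileHeader{Version: "1", Stats: tables}); err != nil {
		return err
	}
	return zw.Close()
}
```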
The `Stats` field in `backup.Schema` is changed to a `oneof` containing the original member as inlined JSON stats, and a new member storing the file name of the external stats file.
```protobuf
message Schema {
    // ...
    oneof stats {
        bytes inlined = 7;
        string external = 8;
    }
}
```
When restoring, if `external` is filled, we read that external .json.gz file, locate the `[]*handle.JSONTable` entry with the corresponding `database_name` and `table_name`, and then call `LoadStatsFromJSON`. Otherwise, if `inlined` is filled, we deserialize it into a `*handle.JSONTable` directly. Otherwise, we follow #679.
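A rough Go sketch of that branching. It assumes the proposed `oneof` generates the usual protoc-gen-go `GetInlined`/`GetExternal` accessors on `backup.Schema`, and that `LoadStatsFromJSON` keeps its current `(infoschema.InfoSchema, *handle.JSONTable)` shape; `readStatsFile` and `findJSONTable` are hypothetical helpers matching the file format above:

```go
package statsrestore

import (
	"compress/gzip"
	"encoding/json"
	"os"

	"github.com/pingcap/kvproto/pkg/backup"
	"github.com/pingcap/tidb/infoschema"
	"github.com/pingcap/tidb/statistics/handle"
)

// readStatsFile decodes a gzip-compressed stats file back into its
// []*handle.JSONTable payload.
func readStatsFile(path string) ([]*handle.JSONTable, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	zr, err := gzip.NewReader(f)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var content struct {
		Version string              `json:"version"`
		Stats   []*handle.JSONTable `json:"stats"`
	}
	if err := json.NewDecoder(zr).Decode(&content); err != nil {
		return nil, err
	}
	return content.Stats, nil
}

// findJSONTable picks the entry matching the given database and table names.
func findJSONTable(tables []*handle.JSONTable, db, table string) *handle.JSONTable {
	for _, t := range tables {
		if t.DatabaseName == db && t.TableName == table {
			return t
		}
	}
	return nil
}

// restoreTableStats implements the branching described above for one table.
func restoreTableStats(h *handle.Handle, is infoschema.InfoSchema, schema *backup.Schema, db, table string) error {
	switch {
	case schema.GetExternal() != "":
		// New backup: load the referenced .json.gz file and pick the entry for this table.
		tables, err := readStatsFile(schema.GetExternal())
		if err != nil {
			return err
		}
		if jsonTbl := findJSONTable(tables, db, table); jsonTbl != nil {
			return h.LoadStatsFromJSON(is, jsonTbl)
		}
		return nil // no stats recorded for this table
	case len(schema.GetInlined()) > 0:
		// 4.0.9-style backup: stats JSON is embedded in backupmeta.
		jsonTbl := &handle.JSONTable{}
		if err := json.Unmarshal(schema.GetInlined(), jsonTbl); err != nil {
			return err
		}
		return h.LoadStatsFromJSON(is, jsonTbl)
	default:
		// Neither member is filled: no stats were backed up, follow #679.
		return nil
	}
}
```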
This scheme is both backwards- and forwards-compatible:

- New BR restoring an old (4.0.9) backup: `inlined` is filled, so we load the inlined JSON stats.
- New BR restoring a new backup: `external` is filled, so we load from the external JSON file.
- Old BR restoring a new backup: `inlined` (the original `stats` in 4.0.9) is missing, so stats will not be restored.

How the .json.gz files are populated is not yet designed. We could just have one file per table, though this will create thousands of tiny files. Or we could collect "enough" tables into a single .json.gz file (which is why `stats` is an array). But then restoring may need to keep reopening the same file, increasing the cost of restore. So we need caching, while keeping RAM usage low, and bam, we hit one of the two Hard Things™ in Computer Science. Anyway, perhaps we should just start with thousands of tiny files and optimize later.
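If we do group multiple tables per file, a minimal sketch of the caching idea, assuming restore visits tables roughly in file order; it reuses the hypothetical `readStatsFile` helper and imports from the sketch above, and keeps only one decoded file in memory so RAM stays bounded:

```go
// statsCache keeps only the most recently read stats file decoded in memory,
// so consecutive tables stored in the same .json.gz file don't reopen it.
type statsCache struct {
	path   string
	tables []*handle.JSONTable
}

// get returns the decoded contents of path, reusing the cached copy when the
// previous lookup hit the same file.
func (c *statsCache) get(path string) ([]*handle.JSONTable, error) {
	if c.path == path && c.tables != nil {
		return c.tables, nil
	}
	tables, err := readStatsFile(path) // hypothetical helper from the sketch above
	if err != nil {
		return nil, err
	}
	c.path, c.tables = path, tables
	return tables, nil
}
```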
Tiny files may slow down backup; I prefer writing all stats to one file. The JSON format is optional: we need a file format that supports append (for writing) and seek (for reading).
@overvenus in that case we save the JSONs into a ZIP archive (synchronized between cloud and a temp dir)?
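A ZIP archive does fit those two requirements in principle: entries are appended sequentially while writing, and the central directory lets us seek to a single entry when reading. A rough Go sketch using the standard archive/zip package (the entry naming scheme is made up for illustration):

```go
package statszip

import (
	"archive/zip"
	"fmt"
	"io"
	"os"
)

// appendStatsEntry writes one table's stats JSON as a new entry in an open
// zip.Writer; entries are appended one after another, which suits backup.
func appendStatsEntry(zw *zip.Writer, db, table string, statsJSON []byte) error {
	w, err := zw.Create(fmt.Sprintf("%s/%s.json", db, table)) // made-up naming scheme
	if err != nil {
		return err
	}
	_, err = w.Write(statsJSON)
	return err
}

// readStatsEntry opens a single entry by name via the central directory,
// without decompressing the rest of the archive; this suits restore.
func readStatsEntry(path, db, table string) ([]byte, error) {
	zr, err := zip.OpenReader(path)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	name := fmt.Sprintf("%s/%s.json", db, table)
	for _, f := range zr.File {
		if f.Name == name {
			rc, err := f.Open()
			if err != nil {
				return nil, err
			}
			defer rc.Close()
			return io.ReadAll(rc)
		}
	}
	return nil, os.ErrNotExist
}
```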
@kennytm, can you help sort out the organizational design of the backup data, including what format is used to store the different kinds of backup data, how the files are named, and how they are divided?
Future adjustments to the backup data format can then be based on this design document, and everyone will consider the design more carefully.
It looks like just starting the stats worker is going to use 1 GB+ of memory (#693), so there's a high chance we need to read directly from the mysql.stats_* tables 😞.
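For reference, reading the raw rows over SQL could look roughly like this; only mysql.stats_meta is shown, the column list is from TiDB 4.0, and the point is the streaming shape (mysql.stats_histograms and mysql.stats_buckets would need similar treatment):

```go
package statsdump

import (
	"database/sql"

	_ "github.com/go-sql-driver/mysql" // MySQL-protocol driver, also works for TiDB
)

// statsMetaRow mirrors the columns of mysql.stats_meta (TiDB 4.0).
type statsMetaRow struct {
	Version     uint64
	TableID     int64
	ModifyCount int64
	Count       int64
}

// dumpStatsMeta streams mysql.stats_meta row by row instead of materializing
// every table's stats in memory at once.
func dumpStatsMeta(db *sql.DB, emit func(statsMetaRow) error) error {
	rows, err := db.Query("SELECT version, table_id, modify_count, `count` FROM mysql.stats_meta")
	if err != nil {
		return err
	}
	defer rows.Close()
	for rows.Next() {
		var r statsMetaRow
		if err := rows.Scan(&r.Version, &r.TableID, &r.ModifyCount, &r.Count); err != nil {
			return err
		}
		if err := emit(r); err != nil {
			return err
		}
	}
	return rows.Err()
}
```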
I've always wanted to ask: why design a dedicated backup mechanism for stats instead of backing them up in the same way as other table data?
@IANTHEREAL there are two concerns involving system tables:
1. The schemas of the `mysql.*` tables change between TiDB versions, so we can't guarantee, say, that a 4.0.0 system table can be correctly restored into 4.0.11 (this can be fixed by running the `upgradeToVerXXXX` functions from bootstrap against the tables before restoring).
2. The stats tables reference everything by `table_id`s, and the restored tables get new `table_id`s (see the sketch below).
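If we went down that road, the `table_id` issue roughly means every dumped stats row has to be rewritten against the restored schema before insertion. A hypothetical sketch, reusing the `statsMetaRow` type from the earlier sketch:

```go
// remapStatsMeta rewrites the table_id of each backed-up stats row to the id
// the table received in the restored cluster; rows whose table was not
// restored are dropped.
func remapStatsMeta(rows []statsMetaRow, oldToNewTableID map[int64]int64) []statsMetaRow {
	remapped := make([]statsMetaRow, 0, len(rows))
	for _, r := range rows {
		newID, ok := oldToNewTableID[r.TableID]
		if !ok {
			continue // the table was not restored
		}
		r.TableID = newID
		remapped = append(remapped, r)
	}
	return remapped
}
```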
@kennytm Although I don't know the specific details, I understand the difficulties. I see that you are already designing a plan to back up system tables and statistics. I think it is a very good plan; we can try it.
Closing in favor of https://github.com/pingcap/br/issues/679#issuecomment-762592254.
Since 4.0.9, we store the stats JSON directly inside backupmeta. This works well for small clusters, but fails catastrophically when the database size and table count become large.
The biggest¹ issue is the CMSketch field of the stats, which consists of 5 × 2048 = 10240 integers by default, and every column and index has its own CMSketch. This means the JSON serialization occupies at least 20 KB per (column + index) in the backupmeta file. In a large cluster with thousands of tables, this makes the file too big to reliably transmit through cloud storage, and also risks an OOM error when loaded into memory.
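A back-of-the-envelope check of that estimate; the cluster-size numbers below are illustrative assumptions, not measurements:

```go
package main

import "fmt"

func main() {
	const (
		depth, width        = 5, 2048 // default CMSketch dimensions
		bytesPerCounter     = 2       // "0," per counter is already 2 bytes of JSON
		tables              = 5000    // illustrative cluster size
		colsAndIdxsPerTable = 20      // illustrative
	)
	perSketch := depth * width * bytesPerCounter      // ≥ 20 KB per column or index
	total := tables * colsAndIdxsPerTable * perSketch // for the whole backupmeta
	fmt.Printf("per CMSketch: %d KB, whole cluster: ~%d MB\n", perSketch/1024, total/(1024*1024))
}
```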
Therefore we need to revisit our encoding scheme of stats.
¹: pun intended