Open YuJuncen opened 2 years ago
For now, I would prefer the sequence method for naming... (Even if we involve the NextBackupTS
and a Unix timestamp in naming the metafile, there is still a chance of name conflicts, e.g. when the clock drifts; maybe a monotonic clock can help?) 🤔
cc @kennytm @3pointer @joccau
File Tree of Log Backup
Design
Files and Naming
There should be 2 types of files: MetaFiles and LogFiles. MetaFiles contain the metadata of a set of LogFiles; in other words, they act as the index of the LogFiles.
{Prefix} contains the version of the backup directory structure (e.g. /v1).
The MetaFiles should be saved at the path
{Prefix}/backupmeta/{MinResolvedTSOfFiles:0??}{StoreID:06}{UUIDv4:32}.meta
where Prefix is the user-defined external storage path.
To speed up restoring a subset of tables, the LogFiles would be stored at
{Prefix}/t{TableID:06}/{MinTSOfFile:0??}{StoreID:06}.log
As a special case, the change log of schema info (m-prefixed keys) is stored at
{Prefix}/m/{MinTSOfFile:0??}{StoreID:06}.log
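The path templates above can be sketched as plain string formatting. This is only an illustration: the function names are made up here, and the timestamp width is left as a parameter because the proposal writes it as 0?? (the default of 20 digits is an assumption, not part of the design).

```python
import uuid

# Assumption: the proposal leaves the timestamp width open ("0??").
# 20 digits is enough for any unsigned 64-bit value.
TS_WIDTH = 20


def meta_file_path(prefix, min_resolved_ts, store_id, ts_width=TS_WIDTH):
    """{Prefix}/backupmeta/{MinResolvedTSOfFiles}{StoreID:06}{UUIDv4:32}.meta"""
    # uuid4().hex is the 32-character hex form of a random UUID.
    name = f"{min_resolved_ts:0{ts_width}d}{store_id:06d}{uuid.uuid4().hex}.meta"
    return f"{prefix}/backupmeta/{name}"


def log_file_path(prefix, table_id, min_ts, store_id, ts_width=TS_WIDTH):
    """{Prefix}/t{TableID:06}/{MinTSOfFile}{StoreID:06}.log"""
    return f"{prefix}/t{table_id:06d}/{min_ts:0{ts_width}d}{store_id:06d}.log"


def schema_log_file_path(prefix, min_ts, store_id, ts_width=TS_WIDTH):
    """{Prefix}/m/{MinTSOfFile}{StoreID:06}.log (for m-prefixed keys)"""
    return f"{prefix}/m/{min_ts:0{ts_width}d}{store_id:06d}.log"
```

Because every numeric field is zero-padded to a fixed width, sorting file names lexicographically also sorts them by timestamp, which is what makes a listing of the directory usable for finding the files needed by a restore.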
Another way:
Make a file named SEQUENCE in each {StoreID} directory of external storage, whose content is an 8-byte big-endian number, and increase the content of the file each time we want to save some file.
Then the MetaFile name can simply be
{ResolvedTSOfFiles:0??}{SequenceNumber:010}.meta
(the resolved TS is only for speeding up restore: it makes the needed MetaFiles easier to find), and the name of LogFiles can simply be
{SequenceNumber:010}.log
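A minimal sketch of the SEQUENCE file follows, using the local filesystem as a stand-in for external storage. The function names, the `count` parameter, and the 20-digit timestamp width are assumptions made for illustration; only the 8-byte big-endian encoding and the name templates come from the proposal.

```python
import struct


def allocate(seq_path, count=1):
    """Reserve `count` sequence numbers and return the first one.

    One call costs one read and one write of the SEQUENCE file.
    """
    try:
        with open(seq_path, "rb") as f:
            # ">Q" is an unsigned 64-bit integer in big-endian byte order.
            (current,) = struct.unpack(">Q", f.read(8))
    except FileNotFoundError:
        current = 0  # assumption: a missing file means nothing was allocated yet
    with open(seq_path, "wb") as f:
        f.write(struct.pack(">Q", current + count))
    return current


def log_name(seq):
    """{SequenceNumber:010}.log"""
    return f"{seq:010d}.log"


def meta_name(resolved_ts, seq, ts_width=20):
    """{ResolvedTSOfFiles}{SequenceNumber:010}.meta (ts_width is an assumption)"""
    return f"{resolved_ts:0{ts_width}d}{seq:010d}.meta"
```

Passing a `count` larger than 1 reserves a whole range in a single read and write, so one call can cover all files of a flush.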
The allocation could be batched, so this adds 2 reads + 2 writes to each flush (we can also create only one SEQUENCE file in the metadata directory, which can reduce the cost to 1 read + 1 write). No lock is needed, because the sequence number is sharded by the store ID.
"Routing"
When an event (aka RaftRequest) is observed by TiKV, it is routed by a chain of routers to the file where the event should be stored. Each router defines one part of the final file path:
- the {Prefix} part.
- the {TableID} part.
- the {RegionID} part.
The terminus of the routing is a local temporary file (no flush required). The file is then copied to external storage at the "Flush".
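The router chain can be pictured as a list of small functions, each contributing one path segment. This is a sketch under assumptions: the `Event` fields, the function names, and in particular the `r{RegionID}` segment format are hypothetical, since the proposal does not say how the region ID appears in the path.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in for an observed event."""
    table_id: int
    region_id: int
    key: bytes
    ts: int


def prefix_router(prefix):
    # The prefix does not depend on the event, so it is captured in a closure.
    return lambda event: prefix


def table_router(event):
    return f"t{event.table_id:06d}"


def region_router(event):
    # Hypothetical segment format; the proposal does not define it.
    return f"r{event.region_id}"


def route(event, routers):
    """Ask each router in order for its segment and join them into a path."""
    return "/".join(router(event) for router in routers)
```

For example, `route(Event(42, 7, b"k", 100), [prefix_router("/v1"), table_router, region_router])` yields `/v1/t000042/r7`, which would identify the local temporary file the event is appended to.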
The MinTSOfFile part can be kept in memory by the last router in the chain, or trivially be the TS of the first key observed.
"Flush"
Each store maintains a set of local files in a temporary directory. Once their total size exceeds 128MB or 5 mins have passed, the store makes a "flush" to the external storage.
The store can also write NextBackupTS to the MetaStore (etcd in general), as a cache for querying the progress.
Once the metadata file is uploaded, the "flush" is finished; updating NextBackupTS in the MetaStore is optional.
File Format
Each MetaFile is generated by one "flush" and contains the metadata of all files involved in that "flush".
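The relationship between a flush and its MetaFile, together with the flush trigger from the "Flush" section (128MB or 5 mins), can be sketched as below. All class, field, and function names are hypothetical, 128MB is read as 128 MiB, the metadata is a plain dict rather than the real encoded message, and the actual upload to external storage is left out.

```python
from dataclasses import dataclass, field

FLUSH_SIZE_BYTES = 128 * 1024 * 1024  # assumption: "128MB" read as 128 MiB
FLUSH_INTERVAL_SECS = 5 * 60


@dataclass
class TempFile:
    path: str
    size: int
    min_ts: int


@dataclass
class Store:
    files: list = field(default_factory=list)
    last_flush: float = 0.0

    def should_flush(self, now):
        """True once the local files exceed the size limit or the interval has passed."""
        total = sum(f.size for f in self.files)
        return (total > FLUSH_SIZE_BYTES
                or now - self.last_flush >= FLUSH_INTERVAL_SECS)

    def flush(self, now):
        """Build the metadata for one flush and reset the local state.

        Uploading the log files and the metadata is out of scope here.
        """
        meta = {
            "min_ts": min((f.min_ts for f in self.files), default=None),
            "files": [f.path for f in self.files],
        }
        self.files = []
        self.last_flush = now
        return meta
```

The returned dict plays the role of the MetaFile: one record per flush, listing every file that the flush covered.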
The content of MetaFiles is encoded as protocol buffer, with the definition:
The LogFiles should be encoded as a plain stream of key-value pairs. The format would be like: