Open gregwebs opened 5 years ago
It is possible to attempt this today by feeding data through stream-based tools into lightning. However, lightning currently expects a directory, and the data may be stored as a gzip tarball.
So tidb-lightning needs to either handle the unpacking itself, or keep watching the directory for new files until it receives a signal that everything has arrived. It may also be possible to achieve this by running lightning separately for each table using table filters.
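The workaround described above could be sketched roughly as follows. This is only an illustration of the unpack step; the file names and directories are made up, and the object-storage download command is tool-specific and elided here:

```shell
set -e
# Illustrative sketch: lightning expects a directory, so a stream-based
# workaround must unpack the gzip tarball into that directory.
# Build a tiny sample "backup" just to demonstrate the unpack step.
mkdir -p /tmp/src /tmp/import
echo "id,name" > /tmp/src/t1.csv
tar -czf /tmp/backup.tar.gz -C /tmp/src t1.csv

# In practice the left side of the pipe would be an object-storage
# client writing the archive to stdout; here we stand in with cat.
cat /tmp/backup.tar.gz | tar -xz -C /tmp/import

ls /tmp/import
```

The catch, as noted above, is that lightning would still need to tolerate files appearing in the directory incrementally, and to know when the stream is finished.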
🤔 I assume this is more than #69?
Our new BR tool streams data to and from cloud storage nicely.
Feature Request
To restore a large amount of data, if we cannot stream it, we must wait for the entire backup to be downloaded from object storage before restore can begin. Instead, the restore process should look like the following, which will make restores faster and reduce disk requirements (and therefore cost):
1) Download the metadata
2) Download chunks of data
3) Apply the downloaded chunks of data while downloading new ones; backpressure from applying should limit the downloading
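The pipeline in steps 2 and 3 could be sketched in Go with a bounded channel, which gives the backpressure for free: once the buffer fills, the downloader blocks until apply catches up. Everything here (`chunk`, `pipeline`, the buffer size) is an illustrative stand-in, not a real BR or lightning API:

```go
package main

import "fmt"

// chunk is an illustrative stand-in for one downloaded piece of backup data.
type chunk struct{ id int }

// pipeline "downloads" n chunks and applies them concurrently.
// The bounded channel (capacity 2) is the backpressure mechanism:
// the download goroutine blocks once 2 chunks are buffered but not
// yet applied, so a slow apply phase throttles downloading.
func pipeline(n int) []int {
	chunks := make(chan chunk, 2)

	// Producer: step 2, download chunks of data.
	go func() {
		for i := 0; i < n; i++ {
			chunks <- chunk{id: i} // blocks when the buffer is full
		}
		close(chunks) // signal: all chunks downloaded
	}()

	// Consumer: step 3, apply chunks as they arrive.
	var applied []int
	for c := range chunks {
		applied = append(applied, c.id)
	}
	return applied
}

func main() {
	fmt.Println("applied chunks:", pipeline(5))
}
```

The same shape would let step 1 (metadata) decide up front how many chunks exist and in what order to request them, while disk usage stays bounded by the channel capacity rather than by the full backup size.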