Open itsbilal opened 10 months ago
Some more nuance from @msbutler:
I’m also not sure how frequently “virtualized” snapshots (i.e. sending SST metadata with a URI instead of the actual SST) will be used during the OR download job: currently, in a disaggregated storage cluster, the sender only creates a virtualized snapshot if all SSTs in L5/L6 are shared (i.e. the physical SSTs live in S3); otherwise the sender falls back to old-style snapshots. In a disaggregated storage cluster, nearly all snapshots are virtualized, since nearly all SSTs in L5-L6 are shared. But in a normal cluster, the OR download job and Pebble compactions will download those L5/L6 files ASAP, leaving fewer opportunities for virtualized snapshots. Further, given how wide a key span SSTs in L5 and L6 cover, as soon as one or two of these files materialize, it seems quite unlikely we can take advantage of virtualized snapshots, because a replica's key span will likely intersect with a downloaded SST.
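To make the condition in the comment above concrete, here is a minimal sketch of the decision as described: a snapshot over a replica's key span can be virtualized only if every L5/L6 SST intersecting that span is still remote. This is a toy model, not Pebble or CockroachDB code; `sstMeta`, `overlaps`, and `canVirtualize` are hypothetical names, and keys are plain strings for simplicity.

```go
package main

import "fmt"

// sstMeta is a hypothetical, simplified description of one sstable.
type sstMeta struct {
	Level    int
	Smallest string // inclusive lower key bound
	Largest  string // inclusive upper key bound
	Remote   bool   // true if the physical file still lives in remote storage
}

// overlaps reports whether the file's key bounds intersect [start, end).
func (f sstMeta) overlaps(start, end string) bool {
	return f.Smallest < end && f.Largest >= start
}

// canVirtualize returns true only if every L5/L6 file that intersects
// the replica span [start, end) is remote. A single downloaded file in
// the span forces a fallback to an old-style snapshot.
func canVirtualize(files []sstMeta, start, end string) bool {
	for _, f := range files {
		if f.Level < 5 || !f.overlaps(start, end) {
			continue
		}
		if !f.Remote {
			return false
		}
	}
	return true
}

func main() {
	files := []sstMeta{
		{Level: 6, Smallest: "a", Largest: "m", Remote: true},
		{Level: 6, Smallest: "n", Largest: "z", Remote: false}, // already downloaded
	}
	fmt.Println(canVirtualize(files, "b", "f")) // span touches only the remote file
	fmt.Println(canVirtualize(files, "k", "p")) // span also touches the downloaded file
}
```

Because L5/L6 files cover wide key spans, the second case becomes the common one soon after the download job starts, which is the concern raised above.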
I'm hopeful that this won't be necessary for online restore if we presplit ranges appropriately. It would increase the overall complexity of the initial online restore preview considerably. It's much easier to reason about it if there's one mechanism to link these external sstables into the LSM, and it's during the restore linking phase.
Similar to shared files, two Pebble instances can reduce bytes scanned / transferred by sharing file metadata or location info and then ingesting them as external files + some local files that contain the diff between the external files and the intended state.
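As a rough illustration of the idea above, the sketch below splits a set of files into those that could be sent by reference (metadata or location only) and those whose contents still have to be scanned and transferred. It is a toy model under stated assumptions, not Pebble's API: `backing`, `file`, `plan`, and `planTransfer` are hypothetical, and the `externalShareable` flag stands in for the proposed behavior of treating external files like shared ones.

```go
package main

import "fmt"

// backing is a hypothetical tag for where a file's bytes live.
type backing int

const (
	local backing = iota
	shared
	external
)

// file is a simplified stand-in for sstable metadata.
type file struct {
	Name    string
	Backing backing
	Size    int // bytes
}

// plan summarizes what the sender would do for a set of files.
type plan struct {
	ByReference []string // files for which only metadata/location is sent
	BytesToScan int      // bytes that must still be read and transferred
}

// planTransfer sends shared files by reference. External files are sent
// by reference only when externalShareable is true; otherwise their
// contents count toward the bytes to scan, like local files.
func planTransfer(files []file, externalShareable bool) plan {
	var p plan
	for _, f := range files {
		if f.Backing == shared || (f.Backing == external && externalShareable) {
			p.ByReference = append(p.ByReference, f.Name)
			continue
		}
		p.BytesToScan += f.Size
	}
	return p
}

func main() {
	files := []file{
		{Name: "000010.sst", Backing: external, Size: 64 << 20},
		{Name: "000011.sst", Backing: shared, Size: 64 << 20},
		{Name: "000012.sst", Backing: local, Size: 2 << 20},
	}
	fmt.Printf("%+v\n", planTransfer(files, false))
	fmt.Printf("%+v\n", planTransfer(files, true))
}
```

In this model, flipping `externalShareable` leaves only the small local file (the "diff") to be scanned, which is the saving the issue is after.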
As part of this issue, explore if such functionality makes sense for CockroachDB's use-cases, and if it does, update `ScanInternal` to see external files as shareable files for "skip-shared" iteration mode. `IngestAndExcise` will also need to be updated to support taking external files instead of shared files.
Jira issue: PEBBLE-77
Epic CRDB-40359