scylladb / scylla-manager

The Scylla Manager
https://manager.docs.scylladb.com/stable/
Other
48 stars 33 forks source link

docs: Issue on page Specification #3916

Open bhalevy opened 4 days ago

bhalevy commented 4 days ago

I would like to report an issue on page https://manager.docs.scylladb.com/branch-3.3/backup/specification

Problem

Nodes under e.g. meta/ or sst/ are identified by a UUID. I presume it's the node host_id (well, it should be :)) But the manifest file section mentions only that we keep the node's ip address, but not its host_id. Is that the case or is the document not up-to-date?

Nodes may change their ip address, but their host_id are stable as long as the node is alive and wasn't removed or replaced.

Also, sstable generation were changes to a uuid as well, and they are now globally unique. It would be good to update the documentation respectively.

Suggest a fix

Michal-Leszczynski commented 4 days ago

I presume it's the node host_id (well, it should be :))

That's true.

But the manifest file section mentions only that we keep the node's ip address, but not its host_id. Is that the case or is the document not up-to-date?

The documentation is up to date. In the past (restore via ansible script to the exact same cluster topology) the host ID was taken from the manifest file path - I don't know why it wasn't included in the manifest itself. Right now (restore via SM restore task), the host ID is not really used during restore - SM always takes all manifests with given snapshot tag and restores them.

The IP is only used for information/metrics purposes.

Also, sstable generation were changes to a uuid as well, and they are now globally unique. It would be good to update the documentation respectively.

That's true!

Michal-Leszczynski commented 4 days ago

@karol-kokoszka What do you think about adding the host ID to the manifest? In general, I think it wouldn't hurt to add the whole:

// ManifestInfo represents manifest on remote location.
type ManifestInfo struct {
    Location    Location
    DC          string
    ClusterID   uuid.UUID
    NodeID      string
    TaskID      uuid.UUID
    SnapshotTag string
    Temporary   bool // <-- without this field
}

to the currently backed up ManifestContentWithIndex (so that we back up the whole ManifestInfoWithContent).

karol-kokoszka commented 4 days ago

Nodes under e.g. meta/ or sst/ are identified by a UUID. I presume it's the node host_id (well, it should be :)) But the manifest file section mentions only that we keep the node's ip address, but not its host_id. Is that the case or is the document not up-to-date?

Nodes may change their ip address, but their host_id are stable as long as the node is alive and wasn't removed or replaced.

I don't think it's a significant issue that we don't store the host_id in the manifest file, as the path to the given manifest includes/defines the host_id. The UUID that @bhalevy mentioned is indeed the host_id.

However, it wouldn't be a big deal to add it. If we do add it, then all manifests that don't currently include the host_id will be deserialized with an empty string.

karol-kokoszka commented 4 days ago

@karol-kokoszka What do you think about adding the host ID to the manifest?

Sure, why not.