scality / spark

RING-44448 - export_s3_keys.sh improvements #49

Status: Open. TrevorBenson opened this PR 1 year ago.

TrevorBenson commented 1 year ago

  • RING-44448 - keep raw files option
  • RING-44448 - determine the IP to query for bucketd
  • RING-44448 - basic error logs for each RID and bucket

I'm opening this as a draft PR for us to discuss and for some initial testing.

  1. It asks about raw files up front and keeps them all or none.
  2. Uses ss to determine whether port 9000 is bound to a specific IP or not (see the sketch after this list).
    • Matches [::]:9000 when IPv6 is enabled, or when it is disabled via sysctl.
    • Matches 0.0.0.0:9000 when IPv6 is disabled via grub, in which case [::] will not appear in the ss output.
  3. There is a very basic log output for INFO, WARNING and ERROR.
    • A basic RAW_COUNT vs. PROCESSED_COUNT comparison is performed.
    • Empty buckets are logged as WARNINGs.
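
A minimal sketch of the detection in item 2, assuming the iproute2 ss utility; the function name and the localhost fallback are illustrative, not the PR's exact code:

get_bucketd_ip() {
    # Grab the "Local Address:Port" column for the listener on port 9000.
    local listen
    listen=$(ss -lnt "sport = :9000" | awk 'NR > 1 { print $4; exit }')
    case "${listen}" in
    "[::]:9000" | "0.0.0.0:9000")
        # Wildcard bind: any local address reaches bucketd.
        echo "127.0.0.1"
        ;;
    *:9000)
        # Bound to a specific IP: strip the port and query that address.
        echo "${listen%:9000}"
        ;;
    esac
}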

Currently the script reports any failure that occurs into the RID_${RID}.log as well as the ${bucket}.log file, then exits with a specific RC as defined in the script. This "fail early" logic stops the processing of the RAFT session and reports back which buckets had not yet been processed at the time of the ERROR and exit.
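
The shape of that logic, sketched with an assumed helper name and an assumed return code (neither is the PR's actual definition):

log_error_and_exit() {
    # Append the message to both the RID log and the bucket log, then fail early.
    local msg=$1 rc=$2
    echo "$(date -u +%FT%TZ) ERROR ${msg}" | tee -a "RID_${RID}.log" "${bucket}.log"
    exit "${rc}"
}

# Hypothetical usage with a made-up return code:
# log_error_and_exit "bucketd did not respond for ${bucket}" 3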

TrevorBenson commented 1 year ago

Currently, the script has a hardcoded s3utils version which has to be bumped when the offline archive bundles a newer version.

Options:

  1. Manage the script via a GitHub Action. When the s3utils version is bumped, export_s3_keys.sh would get the new version hardcoded automatically.
  2. Manage export_s3_keys.sh as a Jinja template. The script would then be deployed/updated by Ansible, potentially leaving an older version in place if Ansible is not used.
  3. Check the container host's local registry for all s3utils versions, determine the highest one, and use it.

Something like:

# Compare two dotted version strings.
# Returns 0 if equal, 1 if $1 > $2, 2 if $1 < $2.
vercomp() {
    if [[ $1 == "$2" ]]; then
        return 0
    fi
    # Split both versions on dots into arrays.
    local IFS=.
    local i
    local ver1=($1)
    local ver2=($2)
    # Pad ver1 with zeros so both arrays have the same length.
    for ((i = ${#ver1[@]}; i < ${#ver2[@]}; i++)); do
        ver1[i]=0
    done
    # Compare field by field, forcing base 10 so leading zeros are handled.
    for ((i = 0; i < ${#ver1[@]}; i++)); do
        if [[ -z ${ver2[i]} ]]; then
            ver2[i]=0
        fi
        if ((10#${ver1[i]} > 10#${ver2[i]})); then
            return 1
        fi
        if ((10#${ver1[i]} < 10#${ver2[i]})); then
            return 2
        fi
    done
    return 0
}

# Collect every locally available s3utils image tag.
mapfile -t S3UTILS_VERSION < <(docker images registry.scality.com/s3utils/s3utils --format '{{ .Tag }}')

# Prune the array until only the highest version remains in S3UTILS_VERSION[0].
while [[ ${#S3UTILS_VERSION[@]} -gt 1 ]]; do
    element1=${S3UTILS_VERSION[0]}
    for ((i = 1; i < ${#S3UTILS_VERSION[@]}; i++)); do
        element2=${S3UTILS_VERSION[i]}
        vercomp "${element1}" "${element2}"
        case $? in
        0)
            # Equal versions (e.g. 1.2 vs 1.2.0): drop the duplicate at index i,
            # otherwise the while loop would never terminate.
            S3UTILS_VERSION=("${S3UTILS_VERSION[@]:0:i}" "${S3UTILS_VERSION[@]:${i}+1}")
            i=$((i - 1))
            ;;
        1)
            # element1 is higher: drop element2 at index i.
            S3UTILS_VERSION=("${S3UTILS_VERSION[@]:0:i}" "${S3UTILS_VERSION[@]:${i}+1}")
            i=$((i - 1))
            ;;
        2)
            # element2 is higher: drop element1 at index 0 and promote element2
            # to be the new candidate, otherwise later comparisons would keep
            # using the stale (already removed) element1.
            S3UTILS_VERSION=("${S3UTILS_VERSION[@]:1:i}" "${S3UTILS_VERSION[@]:${i}+1}")
            element1=${element2}
            i=$((i - 1))
            ;;
        esac
    done
done
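
A simpler single-pass alternative (same vercomp helper; variable names assumed) would track the running maximum instead of pruning the array:

highest=${S3UTILS_VERSION[0]}
for tag in "${S3UTILS_VERSION[@]:1}"; do
    # vercomp returns 2 when its second argument is the higher version.
    vercomp "${highest}" "${tag}"
    if [[ $? -eq 2 ]]; then
        highest=${tag}
    fi
done
echo "Using s3utils version: ${highest}"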
TrevorBenson commented 1 year ago

@scality-fno @fra-scality any additional suggestions before we widen the pool of reviewers?

scality-fno commented 1 year ago

> @scality-fno @fra-scality any additional suggestions before we widen the pool of reviewers?

absolutely widen it as soon as possible — I'm no reliable proofreader right now.

TrevorBenson commented 1 year ago

> haven't been able to test this one... do you have a lab handy with everything in place from this "dev" environment? That said, gave a few suggestions (pointless, essentially). Question: is the double tee pipeline used for logging somewhat redundant?

Sort of, but not exactly. There are items that appear only in the RID logs because they are not specific to a single bucket. The RID log, however, contains every log entry from each bucket. Below is the breakdown of log messages for a bucket log, followed by the additional log messages found only in the RID log.

TL;DR
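
For reference, the double tee pipeline in question has roughly this shape (the message variable and file names are illustrative, not the PR's exact line):

# Bucket-specific messages land in the bucket log and are passed on into the RID log.
echo "${message}" | tee -a "${bucket}.log" | tee -a "RID_${RID}.log" > /dev/null

# RID-wide messages (not tied to a single bucket) go to the RID log alone.
echo "${message}" | tee -a "RID_${RID}.log" > /dev/null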

TrevorBenson commented 11 months ago

Opening this up to a wider audience.

scality-fno commented 10 months ago

I haven't been able to sync up with @fra-scality (w.r.t. s3utils improvements) or Cédrick (w.r.t. the latest spark toolkit use case scenario with EDF)... And even if this PR makes the TSKB about Spark usage obsolete, I don't care. These improvements are a must. My only concern is that some of them may have already been addressed "offline" by Francesco or Cédrick.