Can you double-check the task, perhaps by pulling up the underlying API operation? Unless I am misunderstanding the error, the fact that it is running ["docker" "pull" "bash"] indicates that it is trying to pull the bash image from Docker Hub.
Note that I just saw this same problem for the first time earlier this week, for a workflow run by Cromwell under Terra. In that case, Docker Hub was throttling the connections to pull down the ubuntu:latest image.
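(For the throttling case specifically, one workaround is to mirror the image into your own gcr.io registry so that workers never pull from Docker Hub at run time. A minimal sketch, untested, assuming a hypothetical project my-project with Container Registry enabled and Docker configured for gcr.io auth via gcloud auth configure-docker:

# Pull the image once from Docker Hub, then re-tag and push it to gcr.io.
docker pull ubuntu:latest
docker tag ubuntu:latest gcr.io/my-project/ubuntu:latest
docker push gcr.io/my-project/ubuntu:latest
# Reference gcr.io/my-project/ubuntu:latest in the workflow from then on.

This sidesteps Docker Hub rate limits, though if the "bash" pull is coming from outside your own workflow definition, mirroring your own images would not be enough.)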
So, a bit interesting. Here is the full output from gcloud alpha lifesciences operations describe 4679186047174961621. It strikes me as... odd that it seems to try (and fail) to connect to registry-1.docker.io when all of the images are from gcr.io, not Docker Hub (at least the ones that are explicitly mentioned in the output below).
gcloud alpha lifesciences operations describe 4679186047174961621
done: true
error:
code: 14
message: 'Execution failed: generic::unavailable: wrapping host binaries: pulling
image: retry budget exhausted (10 attempts): running ["docker" "pull" "bash"]:
exit status 1 (standard error: "Error response from daemon: Get https://registry-1.docker.io/v2/:
net/http: request canceled while waiting for connection (Client.Timeout exceeded
while awaiting headers)\n")'
metadata:
'@type': type.googleapis.com/google.cloud.lifesciences.v2beta.Metadata
createTime: '2021-03-31T19:13:53.704845Z'
endTime: '2021-03-31T19:30:38.561799866Z'
events:
- description: Worker released
timestamp: '2021-03-31T19:30:38.561799866Z'
workerReleased:
instance: google-pipelines-worker-157f9eda0bbba01e17a66d3fd85b15fb
zone: us-central1-f
- description: 'Execution failed: generic::unavailable: wrapping host binaries:
pulling image: retry budget exhausted (10 attempts): running ["docker" "pull"
"bash"]: exit status 1 (standard error: "Error response from daemon: Get https://registry-1.docker.io/v2/:
net/http: request canceled while waiting for connection (Client.Timeout exceeded
while awaiting headers)\n")'
failed:
cause: 'Execution failed: generic::unavailable: wrapping host binaries: pulling
image: retry budget exhausted (10 attempts): running ["docker" "pull" "bash"]:
exit status 1 (standard error: "Error response from daemon: Get https://registry-1.docker.io/v2/:
net/http: request canceled while waiting for connection (Client.Timeout exceeded
while awaiting headers)\n")'
code: UNAVAILABLE
timestamp: '2021-03-31T19:30:37.949807010Z'
- description: Worker "google-pipelines-worker-157f9eda0bbba01e17a66d3fd85b15fb"
assigned in "us-central1-f" on a "custom-4-8192" machine
timestamp: '2021-03-31T19:14:03.234988572Z'
workerAssigned:
instance: google-pipelines-worker-157f9eda0bbba01e17a66d3fd85b15fb
machineType: custom-4-8192
zone: us-central1-f
labels:
dsub-version: v0-4-4
job-id: maf-filter--james--210331-151350-57
job-name: maf-filter
task-id: '6'
user-id: james
pipeline:
actions:
- commands:
- -c
- |
set -o errexit
set -o nounset
set -o pipefail
function get_datestamp() {
date "+%Y-%m-%d %H:%M:%S"
}
function log_info() {
echo "$(get_datestamp) INFO: $@"
}
function log_warning() {
1>&2 echo "$(get_datestamp) WARNING: $@"
}
function log_error() {
1>&2 echo "$(get_datestamp) ERROR: $@"
}
function gsutil_cp() {
local src="${1}"
local dst="${2}"
local content_type="${3}"
local user_project_name="${4}"
local headers=""
if [[ -n "${content_type}" ]]; then
headers="-h Content-Type:${content_type}"
fi
local user_project_flag=""
if [[ -n "${user_project_name}" ]]; then
user_project_flag="-u ${user_project_name}"
fi
local attempt
for ((attempt = 0; attempt < 4; attempt++)); do
log_info "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
if gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}"; then
return
fi
if (( attempt < 3 )); then
log_warning "Sleeping 10s before the next attempt of failed gsutil command"
log_warning "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
sleep 10s
fi
done
log_error "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
exit 1
}
function log_cp() {
local src="${1}"
local dst="${2}"
local tmp="${3}"
local check_src="${4}"
local user_project_name="${5}"
if [[ "${check_src}" == "true" ]] && [[ ! -e "${src}" ]]; then
return
fi
# Copy the log files to a local temporary location so that our "gsutil cp" is never
# executed on a file that is changing.
local tmp_path="${tmp}/$(basename ${src})"
cp "${src}" "${tmp_path}"
gsutil_cp "${tmp_path}" "${dst}" "text/plain" "${user_project_name}"
}
while [[ ! -d /google/logs/action/6 ]]; do
mkdir -p /tmp/continuous_logging_action
log_cp /google/logs/action/4/stdout "${STDOUT_PATH}" /tmp/continuous_logging_action "true" "${USER_PROJECT}" &
STDOUT_PID=$!
log_cp /google/logs/action/4/stderr "${STDERR_PATH}" /tmp/continuous_logging_action "true" "${USER_PROJECT}" &
STDERR_PID=$!
log_cp /google/logs/output "${LOGGING_PATH}" /tmp/continuous_logging_action "false" "${USER_PROJECT}" &
LOG_PID=$!
wait "${STDOUT_PID}"
wait "${STDERR_PID}"
wait "${LOG_PID}"
sleep 60s
done
containerName: logging
entrypoint: /bin/bash
environment:
LOGGING_PATH: gs://bucket/path/maf-filter/p-1-maf-1E-2/dsub-logs/maf-filter--james--210331-151350-57.6.log
STDERR_PATH: gs://bucket/path/maf-filter/p-1-maf-1E-2/dsub-logs/maf-filter--james--210331-151350-57.6-stderr.log
STDOUT_PATH: gs://bucket/path/maf-filter/p-1-maf-1E-2/dsub-logs/maf-filter--james--210331-151350-57.6-stdout.log
USER_PROJECT: ''
imageUri: gcr.io/google.com/cloudsdktool/cloud-sdk:294.0.0-slim
runInBackground: true
- commands:
- -c
- |+
#!/bin/bash
set -o errexit
set -o nounset
set -o pipefail
function get_datestamp() {
date "+%Y-%m-%d %H:%M:%S"
}
function log_info() {
echo "$(get_datestamp) INFO: $@"
}
function log_warning() {
1>&2 echo "$(get_datestamp) WARNING: $@"
}
function log_error() {
1>&2 echo "$(get_datestamp) ERROR: $@"
}
mkdir -m 777 -p "/mnt/data/script"
mkdir -m 777 -p "/mnt/data/tmp"
mkdir -m 777 -p "/mnt/data/workingdir"
echo "${_SCRIPT_REPR}" | python -c 'import ast
import sys
sys.stdout.write(ast.literal_eval(sys.stdin.read()))
' > "/mnt/data/script/command.sh"
chmod a+x "/mnt/data/script/command.sh"
for ((i=0; i < DIR_COUNT; i++)); do
DIR_VAR="DIR_${i}"
log_info "mkdir -m 777 -p "${!DIR_VAR}""
mkdir -m 777 -p "${!DIR_VAR}"
done
containerName: prepare
entrypoint: /bin/bash
environment:
DIR_0: /mnt/data/input/gs/bucket/path/prevalent_disease_af_or_nonnsr
DIR_1: /mnt/data/output/gs/bucket/path/maf-filter/p-1-maf-1E-2
DIR_2: /mnt/data/output/gs/bucket/path/maf-filter/p-1-maf-1E-2
DIR_3: /mnt/data/output/gs/bucket/path/maf-filter/p-1-maf-1E-2
DIR_COUNT: '4'
_META_YAML_REPR: "\"create-time: 2021-03-31 15:13:50.570367-04:00\\ndsub-version:\
\ v0-4-4\\nenvs:\\n MAF: 1E-2\\n PVAL: '1'\\njob-id: maf-filter--james--210331-151350-57\\\
njob-name: maf-filter\\nlogging: gs://bucket/path/maf-filter/p-1-maf-1E-2/dsub-logs/\\\
nscript-name: command.sh\\ntask-ids: 1-6\\ntasks:\\n- inputs:\\n file:\
\ gs://bucket/path/prevalent_disease_af_or_nonnsr/bolt.imputed.tsv.gz\\\
n logging-path: gs://bucket/path/maf-filter/p-1-maf-1E-2/dsub-logs/maf-filter--james--210331-151350-57.6.log\\\
n outputs:\\n count_file: gs://bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.n_snps\\\
n filtered: gs://bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.tsv.gz\\\
n filtered_tbi: gs://bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.tsv.gz.tbi\\\
n task-id: '6'\\nuser-id: james\\n\""
_SCRIPT_REPR: "\"# Must be run on an image that has bgzip and tabix installed\\\
n\\n# Screen out erroneous SE and ChiSq values ($12>0 && $15>0)\\n# And\
\ require that MAF and P be within your specified range\\ngunzip -c ${file}\
\ | awk -v maf=${MAF} -v pval=${PVAL} -F $'\\\\t' 'NR == 1{print $0}; NR\
\ > 1 && $12>0 && $15>0 && ($16 <= pval) && (($7 > maf) && (1-$7 > maf))\
\ {print $0}' > output.file\\n\\n# For X, skip the header and append to\
\ what we already produced\\n# Disabled for now since not produced by REGENIE\\\
n# gunzip -c ${file_x} | awk -v maf=${MAF} -v pval=${PVAL} -F $'\\\\t' 'NR\
\ > 1 && $12>0 && $15>0 && ($16 <= pval) && (($7 > maf) && (1-$7 > maf))\
\ {print $0}' >> output.file\\n\\n# Count the SNPs\\necho $(( $(wc -l <\
\ output.file) - 1)) > ${count_file}\\n\\n# BGZip\\nbgzip -c output.file\
\ > ${filtered}\\n\\n# Tabix. This is BOLT-specific formatting\\ntabix \\\
\\\\n -s 2 \\\\\\n -b 3 \\\\\\n -e 3 \\\\\\n -S 1 \\\\\\n \
\ ${filtered}\\n\""
imageUri: gcr.io/google.com/cloudsdktool/cloud-sdk:294.0.0-slim
mounts:
- disk: datadisk
path: /mnt/data
- commands:
- -c
- |+
function get_datestamp() {
date "+%Y-%m-%d %H:%M:%S"
}
function log_info() {
echo "$(get_datestamp) INFO: $@"
}
function log_warning() {
1>&2 echo "$(get_datestamp) WARNING: $@"
}
function log_error() {
1>&2 echo "$(get_datestamp) ERROR: $@"
}
function gsutil_rsync() {
local src="${1}"
local dst="${2}"
local user_project_name="${3}"
local user_project_flag=""
if [[ -n "${user_project_name}" ]]; then
user_project_flag="-u ${user_project_name}"
fi
local attempt
for ((attempt = 0; attempt < 4; attempt++)); do
log_info "gsutil ${user_project_flag} -mq rsync -r "${src}" "${dst}""
if gsutil ${user_project_flag} -mq rsync -r "${src}" "${dst}"; then
return
fi
if (( attempt < 3 )); then
log_warning "Sleeping 10s before the next attempt of failed gsutil command"
log_warning "gsutil ${user_project_flag} -mq rsync -r "${src}" "${dst}""
sleep 10s
fi
done
log_error "gsutil ${user_project_flag} -mq rsync -r "${src}" "${dst}""
exit 1
}
function gsutil_cp() {
local src="${1}"
local dst="${2}"
local content_type="${3}"
local user_project_name="${4}"
local headers=""
if [[ -n "${content_type}" ]]; then
headers="-h Content-Type:${content_type}"
fi
local user_project_flag=""
if [[ -n "${user_project_name}" ]]; then
user_project_flag="-u ${user_project_name}"
fi
local attempt
for ((attempt = 0; attempt < 4; attempt++)); do
log_info "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
if gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}"; then
return
fi
if (( attempt < 3 )); then
log_warning "Sleeping 10s before the next attempt of failed gsutil command"
log_warning "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
sleep 10s
fi
done
log_error "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
exit 1
}
function log_cp() {
local src="${1}"
local dst="${2}"
local tmp="${3}"
local check_src="${4}"
local user_project_name="${5}"
if [[ "${check_src}" == "true" ]] && [[ ! -e "${src}" ]]; then
return
fi
# Copy the log files to a local temporary location so that our "gsutil cp" is never
# executed on a file that is changing.
local tmp_path="${tmp}/$(basename ${src})"
cp "${src}" "${tmp_path}"
gsutil_cp "${tmp_path}" "${dst}" "text/plain" "${user_project_name}"
}
set -o errexit
set -o nounset
set -o pipefail
for ((i=0; i < INPUT_COUNT; i++)); do
INPUT_VAR="INPUT_${i}"
INPUT_RECURSIVE="INPUT_RECURSIVE_${i}"
INPUT_SRC="INPUT_SRC_${i}"
INPUT_DST="INPUT_DST_${i}"
log_info "Localizing ${!INPUT_VAR}"
if [[ "${!INPUT_RECURSIVE}" -eq "1" ]]; then
gsutil_rsync "${!INPUT_SRC}" "${!INPUT_DST}" "${USER_PROJECT}"
else
gsutil_cp "${!INPUT_SRC}" "${!INPUT_DST}" "" "${USER_PROJECT}"
fi
done
containerName: localization
entrypoint: /bin/bash
environment:
INPUT_0: file
INPUT_COUNT: '1'
INPUT_DST_0: /mnt/data/input/gs/bucket/path/prevalent_disease_af_or_nonnsr/bolt.imputed.tsv.gz
INPUT_RECURSIVE_0: '0'
INPUT_SRC_0: gs://bucket/path/prevalent_disease_af_or_nonnsr/bolt.imputed.tsv.gz
USER_PROJECT: ''
imageUri: gcr.io/google.com/cloudsdktool/cloud-sdk:294.0.0-slim
mounts:
- disk: datadisk
path: /mnt/data
- commands:
- bash
- -c
- |
export TMPDIR="/mnt/data/tmp"
cd /mnt/data/workingdir
"/mnt/data/script/command.sh"
containerName: user-command
entrypoint: /usr/bin/env
environment:
MAF: '1E-2'
PVAL: '1'
count_file: /mnt/data/output/gs/bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.n_snps
file: /mnt/data/input/gs/bucket/path/prevalent_disease_af_or_nonnsr/bolt.imputed.tsv.gz
filtered: /mnt/data/output/gs/bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.tsv.gz
filtered_tbi: /mnt/data/output/gs/bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.tsv.gz.tbi
imageUri: gcr.io/ukbb-analyses/ubuntu:2020-12-22
mounts:
- disk: datadisk
path: /mnt/data
- commands:
- -c
- |+
function get_datestamp() {
date "+%Y-%m-%d %H:%M:%S"
}
function log_info() {
echo "$(get_datestamp) INFO: $@"
}
function log_warning() {
1>&2 echo "$(get_datestamp) WARNING: $@"
}
function log_error() {
1>&2 echo "$(get_datestamp) ERROR: $@"
}
function gsutil_rsync() {
local src="${1}"
local dst="${2}"
local user_project_name="${3}"
local user_project_flag=""
if [[ -n "${user_project_name}" ]]; then
user_project_flag="-u ${user_project_name}"
fi
local attempt
for ((attempt = 0; attempt < 4; attempt++)); do
log_info "gsutil ${user_project_flag} -mq rsync -r "${src}" "${dst}""
if gsutil ${user_project_flag} -mq rsync -r "${src}" "${dst}"; then
return
fi
if (( attempt < 3 )); then
log_warning "Sleeping 10s before the next attempt of failed gsutil command"
log_warning "gsutil ${user_project_flag} -mq rsync -r "${src}" "${dst}""
sleep 10s
fi
done
log_error "gsutil ${user_project_flag} -mq rsync -r "${src}" "${dst}""
exit 1
}
function gsutil_cp() {
local src="${1}"
local dst="${2}"
local content_type="${3}"
local user_project_name="${4}"
local headers=""
if [[ -n "${content_type}" ]]; then
headers="-h Content-Type:${content_type}"
fi
local user_project_flag=""
if [[ -n "${user_project_name}" ]]; then
user_project_flag="-u ${user_project_name}"
fi
local attempt
for ((attempt = 0; attempt < 4; attempt++)); do
log_info "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
if gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}"; then
return
fi
if (( attempt < 3 )); then
log_warning "Sleeping 10s before the next attempt of failed gsutil command"
log_warning "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
sleep 10s
fi
done
log_error "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
exit 1
}
function log_cp() {
local src="${1}"
local dst="${2}"
local tmp="${3}"
local check_src="${4}"
local user_project_name="${5}"
if [[ "${check_src}" == "true" ]] && [[ ! -e "${src}" ]]; then
return
fi
# Copy the log files to a local temporary location so that our "gsutil cp" is never
# executed on a file that is changing.
local tmp_path="${tmp}/$(basename ${src})"
cp "${src}" "${tmp_path}"
gsutil_cp "${tmp_path}" "${dst}" "text/plain" "${user_project_name}"
}
set -o errexit
set -o nounset
set -o pipefail
for ((i=0; i < OUTPUT_COUNT; i++)); do
OUTPUT_VAR="OUTPUT_${i}"
OUTPUT_RECURSIVE="OUTPUT_RECURSIVE_${i}"
OUTPUT_SRC="OUTPUT_SRC_${i}"
OUTPUT_DST="OUTPUT_DST_${i}"
log_info "Delocalizing ${!OUTPUT_VAR}"
if [[ "${!OUTPUT_RECURSIVE}" -eq "1" ]]; then
gsutil_rsync "${!OUTPUT_SRC}" "${!OUTPUT_DST}" "${USER_PROJECT}"
else
gsutil_cp "${!OUTPUT_SRC}" "${!OUTPUT_DST}" "" "${USER_PROJECT}"
fi
done
containerName: delocalization
entrypoint: /bin/bash
environment:
OUTPUT_0: filtered
OUTPUT_1: filtered_tbi
OUTPUT_2: count_file
OUTPUT_COUNT: '3'
OUTPUT_DST_0: gs://bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.tsv.gz
OUTPUT_DST_1: gs://bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.tsv.gz.tbi
OUTPUT_DST_2: gs://bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.n_snps
OUTPUT_RECURSIVE_0: '0'
OUTPUT_RECURSIVE_1: '0'
OUTPUT_RECURSIVE_2: '0'
OUTPUT_SRC_0: /mnt/data/output/gs/bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.tsv.gz
OUTPUT_SRC_1: /mnt/data/output/gs/bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.tsv.gz.tbi
OUTPUT_SRC_2: /mnt/data/output/gs/bucket/path/maf-filter/p-1-maf-1E-2/outputprevalent_disease_af_or_nonnsr.n_snps
USER_PROJECT: ''
imageUri: gcr.io/google.com/cloudsdktool/cloud-sdk:294.0.0-slim
mounts:
- disk: datadisk
path: /mnt/data
- alwaysRun: true
commands:
- -c
- |+
set -o errexit
set -o nounset
set -o pipefail
function get_datestamp() {
date "+%Y-%m-%d %H:%M:%S"
}
function log_info() {
echo "$(get_datestamp) INFO: $@"
}
function log_warning() {
1>&2 echo "$(get_datestamp) WARNING: $@"
}
function log_error() {
1>&2 echo "$(get_datestamp) ERROR: $@"
}
function gsutil_cp() {
local src="${1}"
local dst="${2}"
local content_type="${3}"
local user_project_name="${4}"
local headers=""
if [[ -n "${content_type}" ]]; then
headers="-h Content-Type:${content_type}"
fi
local user_project_flag=""
if [[ -n "${user_project_name}" ]]; then
user_project_flag="-u ${user_project_name}"
fi
local attempt
for ((attempt = 0; attempt < 4; attempt++)); do
log_info "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
if gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}"; then
return
fi
if (( attempt < 3 )); then
log_warning "Sleeping 10s before the next attempt of failed gsutil command"
log_warning "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
sleep 10s
fi
done
log_error "gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}""
exit 1
}
function log_cp() {
local src="${1}"
local dst="${2}"
local tmp="${3}"
local check_src="${4}"
local user_project_name="${5}"
if [[ "${check_src}" == "true" ]] && [[ ! -e "${src}" ]]; then
return
fi
# Copy the log files to a local temporary location so that our "gsutil cp" is never
# executed on a file that is changing.
local tmp_path="${tmp}/$(basename ${src})"
cp "${src}" "${tmp_path}"
gsutil_cp "${tmp_path}" "${dst}" "text/plain" "${user_project_name}"
}
mkdir -p /tmp/logging_action
log_cp /google/logs/action/4/stdout "${STDOUT_PATH}" /tmp/logging_action "true" "${USER_PROJECT}" &
STDOUT_PID=$!
log_cp /google/logs/action/4/stderr "${STDERR_PATH}" /tmp/logging_action "true" "${USER_PROJECT}" &
STDERR_PID=$!
log_cp /google/logs/output "${LOGGING_PATH}" /tmp/logging_action "false" "${USER_PROJECT}" &
LOG_PID=$!
wait "${STDOUT_PID}"
wait "${STDERR_PID}"
wait "${LOG_PID}"
containerName: final_logging
entrypoint: /bin/bash
environment:
LOGGING_PATH: gs://bucket/path/maf-filter/p-1-maf-1E-2/dsub-logs/maf-filter--james--210331-151350-57.6.log
STDERR_PATH: gs://bucket/path/maf-filter/p-1-maf-1E-2/dsub-logs/maf-filter--james--210331-151350-57.6-stderr.log
STDOUT_PATH: gs://bucket/path/maf-filter/p-1-maf-1E-2/dsub-logs/maf-filter--james--210331-151350-57.6-stdout.log
USER_PROJECT: ''
imageUri: gcr.io/google.com/cloudsdktool/cloud-sdk:294.0.0-slim
resources:
regions:
- us-central1
- us-east1
- us-west1
virtualMachine:
bootDiskSizeGb: 10
bootImage: projects/cloud-lifesciences/global/images/w20210322-0300-rc0060586ad3-0000-2431-9437-089e0828bff8stable
disks:
- name: datadisk
sizeGb: 50
type: pd-ssd
labels:
dsub-version: v0-4-4
goog-pipelines-worker: 'true'
job-id: maf-filter--james--210331-151350-57
job-name: maf-filter
task-id: '6'
user-id: james
machineType: custom-4-8192
network:
usePrivateAddress: true
nvidiaDriverVersion: 450.51.06
serviceAccount:
email: default
scopes:
- https://www.googleapis.com/auth/bigquery
- https://www.googleapis.com/auth/compute
- https://www.googleapis.com/auth/devstorage.full_control
- https://www.googleapis.com/auth/genomics
- https://www.googleapis.com/auth/logging.write
- https://www.googleapis.com/auth/monitoring.write
- https://www.googleapis.com/auth/cloud-platform
volumes:
- persistentDisk:
sizeGb: 50
type: pd-ssd
volume: datadisk
timeout: 604800s
startTime: '2021-03-31T19:14:03.234988572Z'
name: projects/783282864357/locations/us-central1/operations/4679186047174961621
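In case it's useful for anyone skimming: the failure can be pulled out of an operation without the full pipeline dump. A sketch using gcloud's --format projection (I believe value(error.message) is the right field path here, but that's my assumption):

gcloud alpha lifesciences operations describe 4679186047174961621 \
    --format='value(error.message)'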
Hi James,
I agree. The error message does not make sense in the context of that operation. All actions use a gcr.io imageUri, either gcr.io/google.com/cloudsdktool/cloud-sdk:294.0.0-slim or gcr.io/ukbb-analyses/ubuntu:2020-12-22. And yet it would seem that the Pipelines API is trying to pull the bash Docker image from Docker Hub. I have filed a bug with the HCLS team.
Thanks!
I heard back from the Pipelines team that they are aware of the problem and are looking to remove the dependency. No specific timeline.
Do you have a sense of how frequently you hit this?
Hard to say, though it seems to be becoming more frequent. For example, I'm hitting it now (well, not waiting for all the retries, but noting that machines seem to be stuck in the state "VM starting (awaiting worker checkin)" while trying to run a 'hello world' program on a 4-core machine). "VM starting (awaiting worker checkin)" seems to be the state that precedes the retry failure.
For what it's worth, I am not seeing a difference between the google-v2 and google-cls-v2 providers. I also don't see any errors on the GCP status page to indicate that there is some global problem. I've tried E2 and N1 machine types with similar results. Currently I don't seem to be able to launch jobs unless I remove the --use-private-address flag, but a couple of days ago I was eventually able to. These aren't systematic observations, just anecdotes, but I'm sending them along in case they make something 'click' for someone else.
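If it helps with triage, here is a quick sketch of how the asymmetry could be checked from a VM that has only an internal IP: with Private Google Access enabled, gcr.io should be reachable while registry-1.docker.io should not (the expected results in the comments are my assumption, not verified output):

# Registry v2 endpoints return 401 when reachable but unauthenticated.
curl -sS -m 10 -o /dev/null -w '%{http_code}\n' https://gcr.io/v2/
# Expected: 401 (reachable)
curl -sS -m 10 -o /dev/null -w '%{http_code}\n' https://registry-1.docker.io/v2/
# Expected: timeout, matching the Client.Timeout error in the operation above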
I am also consistently experiencing this issue running jobs with internal IPs, pulling from gcr.io. I just launched a batch of 20 jobs and all failed with this same error. Thanks to @carbocation for raising the issue.
Latest message from the Pipelines team is that they've heard about this very same issue from a few people, and are working to have it fixed as soon as possible.
From the Pipelines team: a fix is in the process of rolling out. us-central1 and europe-west2 are fixed, while rollouts to the remaining locations are in progress.
We were hitting this issue through Cromwell in us-central1, and can confirm we haven't seen it since Google told us there was a fix on April 9th.
Going to close this as the Google team has fixed the issue.
This is a new error message for me, and I checked the GCP status page in case it was transient, but I don't see any active issues.
I am launching a bunch of tasks across a bunch of zones (within a single job) using the google-cls-v2 API with dsub 0.4.3. I am getting the following error, but only on some tasks within this job (while others within the same job are launching and succeeding):
Execution failed: generic::unavailable: wrapping host binaries: pulling image: retry budget exhausted (10 attempts): running ["docker" "pull" "bash"]: exit status 1 (standard error: "Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)\n")
This strikes me as an odd message, because (1) I'm running these machines without a public IP, and so (2) I'm only using gcr.io Docker images with them. I don't know why they'd be pointing to registry-1.docker.io.
If this isn't a dsub issue, I can point this to the mailing list. (And if there is a quick way for me to get a flavor for "dsub issue vs. not dsub issue", just let me know so I can self-triage.) A minimal repro is sketched below.
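For reference, a minimal repro along the lines of what I've been running (a sketch; the project and bucket names are placeholders):

dsub \
  --provider google-cls-v2 \
  --project my-project \
  --regions us-central1 \
  --logging gs://my-bucket/logs/ \
  --image gcr.io/google.com/cloudsdktool/cloud-sdk:294.0.0-slim \
  --use-private-address \
  --command 'echo "hello world"' \
  --wait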
Thanks!