DataBiosphere / dsub

Open-source command-line tool to run batch computing tasks and workflows on backend services such as Google Cloud.
Apache License 2.0

NO_JOB even though nothing ran #188

Closed apampana closed 4 years ago

apampana commented 4 years ago

I am trying to submit a dsub job and I am not getting the output. I am getting NO_JOB, and I am sure the input and output had run before. Can someone help me with this?


```bash
#!/usr/bin/python

PROJECT_PATH="xyz"

# There is a manual step: please create a tab-delimited phenotype file at
# $PROJECT_PATH/pheno.tsv . Output from this project will ultimately go to
# $PROJECT_PATH/output/* .

# Leave one chromosome out?
USE_LOCO="TRUE"
TASK_DEFINITION_FILE="xyz/task1.tsv"

MAX_PREEMPTION=6

HAIL_DOCKER_IMAGE="gcr.io/jhs-project-243319/hail_latest:latest"

# Enable exit on error
set -o errexit

# Create the mytasks.tsv from our template
gsutil cat ${TASK_DEFINITION_FILE} | sed -e "s%gs://%${PROJECT_PATH}/output%g" > my.tasks.tsv

# Check for errors when we can
echo "Checking to make sure that ${PROJECT_PATH}/pheno.tsv exists"
gsutil ls ${PROJECT_PATH}/jhs.protOI.batch123.ALL.tab

# Launch step 1 and block until completion
echo "Test"
dsub \
    --project jhs-project-243319 \
    --provider google-v2 \
    --use-private-address \
    --regions us-central1 us-east1 us-west1 \
    --disk-type local-ssd \
    --disk-size 375 \
    --min-cores 64 \
    --min-ram 64 \
    --image ${HAIL_DOCKER_IMAGE} \
    --retries 1 \
    --skip \
    --wait \
    --logging ${PROJECT_PATH}/dsub-logs \
    --input PHENO_FILE=${PROJECT_PATH}/jhs.protOI.batch123.ALL.tab \
    --input HAIL_PATH=${PROJECT_PATH}/topmed_6a_pass_2k_minDP10_sQC_vQC_AF01_jhsprot.mt \
    --output-recursive OUTPUT_PATH=${PROJECT_PATH}/logs \
    --env LOCO=${USE_LOCO} \
    --timeout '12w' \
    --name test3 \
    --script /home/akhil/anaconda3/lib/python3.7/site-packages/dsub/commands/phewas_jhs_lmm.py
```
apampana commented 4 years ago

Is there anything I can change to get this running? Thank you.

wnojopra commented 4 years ago

Hi @apampana ,

I'm happy to help out, but I need a bit more information to understand what the issue is. What do you mean by 'no_job'?

After running dsub, are there any error messages? If so please paste them here.

If there are no error messages, dsub should print a dstat command for you to run. Can you please execute that command with --full and paste the output here?
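For reference, the suggested command typically looks like the sketch below. The job id shown is a placeholder (substitute the one dsub printed for you), and running it requires gcloud credentials for the project:

```bash
# dstat reports job status for the google-v2 provider; --status '*' shows
# jobs in any state, and --full includes events and error details.
# The --jobs value below is a placeholder, not a real job id.
dstat \
    --provider google-v2 \
    --project jhs-project-243319 \
    --jobs 'test3--<user>--<timestamp>' \
    --status '*' \
    --full
```

This is a cloud CLI fragment, not something runnable locally without project access.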

wnojopra commented 4 years ago

I also notice that you have #!/usr/bin/python at the top of your file. Should that be #!/usr/bin/sh or #!/usr/bin/bash instead?

apampana commented 4 years ago

I kept the python shebang because I want to run a Python script. I also kept /usr/bin/ just to be sure.

apampana commented 4 years ago

(screenshot omitted)

apampana commented 4 years ago

This is what I am getting when I try to run the analysis without the --full parameter:

(screenshot omitted)

apampana commented 4 years ago

There is no dstat step either. I think the job is not even getting started.

wnojopra commented 4 years ago

Hi @apampana , you are using the --skip flag, which, as the dsub output hints, will skip running a job if its output already exists. I would suggest either removing the --skip flag to run the job and overwrite the output, or deleting the existing output.

This is explained in detail in the Job Control docs.
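The rule described in the Job Control docs can be sketched roughly as follows. This is a local-filesystem analogue for illustration, not dsub's actual implementation, and the function name is ours:

```bash
# Rough sketch of the --skip rule: a task is skipped only when every one
# of its declared outputs already exists; if any output is missing, the
# task runs.
should_skip() {
  for out in "$@"; do
    [ -e "$out" ] || return 1   # any missing output means: run the task
  done
  return 0                      # all outputs present: skip the task
}
```

In the script above, OUTPUT_PATH points at ${PROJECT_PATH}/logs via --output-recursive, so a leftover directory from an earlier run would make --skip treat the job as already done.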

apampana commented 4 years ago

(screenshot omitted) It stays like this for some time, during a part of the code that is just basic imports. Is there a way to check which step it's running?

wnojopra commented 4 years ago

At this point the job is submitted. Because you have --wait and --retries enabled, the dsub process is now waiting for the job to complete and monitoring for failed tasks to retry.

To check the status, copy the dstat command given and run it.

apampana commented 4 years ago

(screenshot omitted) I am getting something like this when I try to run the analysis, and it ends in a failure. How do I deal with it?

wnojopra commented 4 years ago

CommandException: No URLs matched is an error coming from gsutil indicating that the input file it is trying to copy over was not found. Make sure the input files in gs://jhs_data_topmed/... exist.
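A quick pre-flight check along these lines can catch missing inputs before submission. This is a hedged sketch, not part of dsub; the check_inputs function and the injectable LS_CMD variable are ours:

```bash
# List every input with gsutil and report the ones that cannot be found;
# "CommandException: No URLs matched" from gsutil means the path is wrong
# or the object does not exist.
LS_CMD=${LS_CMD:-"gsutil ls"}   # injectable so the sketch can be tested locally

check_inputs() {
  local status=0
  for path in "$@"; do
    if ! $LS_CMD "$path" >/dev/null 2>&1; then
      echo "MISSING: $path"
      status=1
    fi
  done
  return $status
}
```

Run it on the same gs:// paths you pass to --input before launching the job, e.g. check_inputs "${PROJECT_PATH}/jhs.protOI.batch123.ALL.tab".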

wnojopra commented 4 years ago

This is starting to veer a little off the original topic, so I've reached out to you directly in case you need further support.