Xinglab / IRIS

IRIS: Isoform peptides from RNA splicing for Immunotherapy target Screening

Workflow Errors #13

Closed by walidabualafia 1 year ago

walidabualafia commented 1 year ago

Hi, Eric,

I'm running into a new workflow error, related to the "Resource Usage" handling in the cluster_status.py file.

This is the error I am seeing:

WorkflowError:
Cluster status command python ./snakemake_profile/cluster_status.py --retry-status-interval-seconds 30,120,300 --resource-usage-dir ./snakemake_profile/job_resource_usage --resource-usage-min-interval 300 returned None but just a single line with one of running,success,failed is expected.

Is there any way for me to disable "resource usage" altogether? I don't really need to see resource utilization; I just need IRIS to run. Adapting the profile from Slurm to LSF was an endeavor, so I'd find value in taking some of the optional functionality of the Snakemake pipeline out.

Please let me know, Walid

EricKutschera commented 1 year ago

Hi Walid,

You should be able to disable "resource usage" by having try_extract_job_info_from_status_output return {'status': status, 'resource_usage': None}, None, where status is one of 'running', 'success', or 'failed'. The 'resource_usage' part just ends up getting printed to a log file, so it's not necessary for running the pipeline.

I'm not sure what the LSF status command output looks like, but hopefully your implementation of try_extract_job_info_from_status_output can be pretty short, since it just needs to set the 'status' value.
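For what it's worth, here's a minimal sketch of what such an implementation could look like for LSF. The function name and return shape are the ones described above; the state names (PEND, RUN, DONE, EXIT) are standard LSF bjobs states, but the exact strings your cluster's status command prints may differ:

```python
# Sketch of an LSF status parser for cluster_status.py, with resource
# usage disabled. Assumes the status output is a single LSF job state,
# e.g. from `bjobs -noheader -o stat <jobid>`.

# Map LSF job states onto the three statuses snakemake understands
LSF_STATE_TO_STATUS = {
    'PEND': 'running',   # queued but not yet started
    'RUN': 'running',
    'DONE': 'success',
    'EXIT': 'failed',    # non-zero exit code or killed
}

def try_extract_job_info_from_status_output(status_output):
    # Take the first non-blank token of the command output as the state
    state = status_output.strip().split('\n')[0].strip()
    status = LSF_STATE_TO_STATUS.get(state)
    if status is None:
        return None, 'unrecognized LSF state: {}'.format(state)
    # resource_usage is only printed to a log, so None disables it safely
    return {'status': status, 'resource_usage': None}, None
```
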

walidabualafia commented 1 year ago

Hi, Eric,

I had some success commenting out the portion of snakemake_profile/config.yaml that runs cluster_status.py. IRIS ran up to 91% of the example case, then errored out on iris_visual_summary. This is the error itself:

[Wed Jul 12 13:47:14 2023]
Finished job 12.
21 of 23 steps (91%) done
[Wed Jul 12 13:49:04 2023]
Error in rule iris_visual_summary:
    jobid: 11
    output: results/NEPC_test/visualization/summary.png
    log: results/NEPC_test/visualization/iris_visual_summary_log.out, results/NEPC_test/visualization/iris_visual_summary_log.err (check log file(s) for error message)
    shell:
       /shortened_link/IRIS/conda_wrapper /shortened_link/IRIS/conda_env_2 IRIS visual_summary --parameter-fin results/NEPC_test/screen.para --screening-out-dir results/NEPC_test/screen --out-file-name results/NEPC_test/visualization/summary.png --splicing-event-type SE 1> results/NEPC_test/visualization/iris_visual_summary_log.out 2> results/NEPC_test/visualization/iris_visual_summary_log.err
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
    cluster_jobid: 196755288

Error executing rule iris_visual_summary on cluster (jobid: 11, external: 196755288, jobscript: /shortened_link/iris/vendor/IRIS/.snakemake/tmp.ckrdl4jo/snakejob.iris_visual_summary.11.sh). For error details see the cluster log and the log files of the involved rule(s).

I checked the log.err file, but the only thing that I was able to find in it was:

no tier2tier3 events

Do you know why this might be happening?

Thanks for all your help! :)

Walid
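Concretely, the change amounted to commenting out the profile's cluster-status entry, roughly like this (a sketch; the key name follows the standard snakemake profile convention, and the command string is the one from the error message above — other profile keys are omitted):

```yaml
# snakemake_profile/config.yaml (sketch; other profile keys omitted).
# Commenting out cluster-status stops snakemake from polling the script:
#
# cluster-status: "python ./snakemake_profile/cluster_status.py --retry-status-interval-seconds 30,120,300 --resource-usage-dir ./snakemake_profile/job_resource_usage --resource-usage-min-interval 300"
```
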

EricKutschera commented 1 year ago

I'm not sure what snakemake does without a cluster status script. Maybe snakemake writes some job status file(s) somewhere. Depending on what snakemake does, you may need an implementation of cluster status for the workflow to run correctly.

The visualization step will fail if it doesn't find any results to plot in the output files. If you're running the example data, then it should have some results unless a previous step had an error or didn't finish. You could try checking the logs for the other steps to see if there was an error.

walidabualafia commented 1 year ago

It seems output was generated, with no errors. When I cat *.err inside NEPC_test, I get:

Conda version '22.9.0' loaded
Not necessary to setup conda init now
Conda version '22.9.0' loaded
Not necessary to setup conda init now
Conda version '22.9.0' loaded
Not necessary to setup conda init now
Conda version '22.9.0' loaded
Not necessary to setup conda init now
index file references/ucsc.hg19.fasta.fai not found, generating...
Conda version '22.9.0' loaded
Not necessary to setup conda init now
Conda version '22.9.0' loaded
Not necessary to setup conda init now

The Conda warning is on my end: I have a Conda module loaded in my .bashrc, which gets sourced every time a job is scheduled.

Regarding output, everything seems to be as it should. Similarly, when I run cat *.out, I get:

[INFO] Done loading index NEPC_test GTEx_Heart GTEx_Blood GTEx_Lung GTEx_Liver GTEx_Brain GTEx_Nerve GTEx_Muscle GTEx_Spleen GTEx_Thyroid GTEx_Skin GTEx_Kidney
[INFO] Retrieving inclusion junction info
Percent: [####################] 100% Done...
[INFO] Number of event in SJ count screen summary:3
[INFO] Annotating inclusion junction info to results/NEPC_test/screen/NEPC_test.SE.tier2tier3.txt
IRIS annotate_ijc -p results/NEPC_test/screen.para --splicing-event-type SE -e results/NEPC_test/screen/NEPC_test.SE.tier2tier3.txt -o results/NEPC_test/screen --inc-read-cov-cutoff 2 --event-read-cov-cutoff 10
[INFO] Loading inclusion junction info from results/NEPC_test/screen/NEPC_test.SE.tier2tier3.txt.ijc_info.txt
[INFO] Integrating SJC test result to results/NEPC_test/screen/NEPC_test.SE.tier1.txt
[INFO] Integrating SJC test result to results/NEPC_test/screen/NEPC_test.SE.tier2tier3.txt
[INFO] Integrating SJC test result to results/NEPC_test/screen/NEPC_test.SE.tier2tier3.txt.ExtraCellularAS.txt
[INFO] Integrating SJC test result to results/NEPC_test/screen/SE.tier2tier3/epitope_summary.peptide-based.txt
[INFO] Integrating SJC test result to results/NEPC_test/screen/SE.tier2tier3/epitope_summary.junction-based.txt
[INFO] No tier1 comparisons (tissue-matched normal) found. Use tier2tier3 only mode.
[INFO] Collecting binding predictions:SE.tier2tier3
Percent: [####################] 100% Done...%
[INFO] Loading kmers for checking canonical proteome: 9,10,11
[INFO] Total pool of splice junctions: 5
Percent: [####################] 100% Done...
[INFO] Total number of AS junction k mers from the analyzed sequencing data: 333
[INFO] Total extracellular annotation loaded: 54990
[INFO] No tier1 comparisons (tissue-matched normal) found. Use tier2&tier3 only mode.
[INFO] Total HLA types loaded: 31 . Total peptide splice junctions loaded: 5
[INFO] IRIS screen - started. Total input events: 10
Percent: [####################] 100% Done...9%
[INFO] IRIS screen - summarizing. Total events from last step: 10
Percent: [####################] 100% Done...
[INFO] Total exon-orf loaded 426794 26774 4116
[INFO] Total processed 0
[INFO] Total exon-orf loaded 426794 26774 4116
Percent: [####################] 100% Done...
[INFO] Total processed 5
[INFO] Working on translating: tier1
[INFO] Working on translating: tier2tier3
[INFO] IRIS screening - started. Total input events: 11
Percent: [####################] 100% Done...
Percent: [####################] 100% Done...
Percent: [####################] 100% Done...

I would attach the output files themselves as well, but they're quite large. I'm not sure where the issue might be.

EricKutschera commented 1 year ago

The .out and .err lines you posted for results/NEPC_test/ look fine to me. There should also be log files in results/NEPC_test/predict_tasks/. Can you check if there are any error messages there? I think the workflow won't realize if there is an error in those predict tasks, and it will just think there are no prediction results.
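If it helps, here's a generic sketch for scanning those logs in bulk (the predict_tasks path is the one from this thread; 'Traceback' and 'error' are just heuristic markers, not IRIS-specific):

```python
import os

# Heuristic scan for failed tasks: flag any log file whose contents
# mention a Python traceback or an error. Markers are generic.
def find_error_logs(log_dir, markers=('Traceback', 'error', 'Error')):
    flagged = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, errors='replace') as fh:
                    text = fh.read()
            except OSError:
                continue
            if any(marker in text for marker in markers):
                flagged.append(path)
    return sorted(flagged)

# Directory name from this thread; only scan it if it exists here
if os.path.isdir('results/NEPC_test/predict_tasks'):
    for path in find_error_logs('results/NEPC_test/predict_tasks'):
        print(path)
```
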

walidabualafia commented 1 year ago

Eric,

You're right. It seems to be a missing library (libncursesw.so.6). The error is below.

Traceback (most recent call last):
  File "/shortened_path/IRIS/IEDB/mhc_i/src/predict_binding.py", line 7, in <module>
    from seqpredictor import MHCBindingPredictions
  File "/shortened_path/IRIS/IEDB/mhc_i/src/seqpredictor.py", line 8, in <module>
    from method.netmhcpan_3_0_executable import predict_many
  File "/shortened_path/IRIS/IEDB/mhc_i/method/netmhcpan_3_0_executable/__init__.py", line 2, in <module>
    from curses.ascii import isdigit
  File "/shortened_path/IRIS/conda_env_2/lib/python2.7/curses/__init__.py", line 15, in <module>
    from _curses import *
ImportError: libncursesw.so.6: cannot open shared object file: No such file or directory

We're running RHEL7, and at first glance it seems like a system library issue to me. I'll look into what I can do to get it into the Conda env. Please let me know if you have a better approach.
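As a quick sanity check, this generic snippet (not IRIS-specific) asks the loader whether a wide-character ncurses library is visible at all:

```python
import ctypes.util

# Ask the dynamic loader whether a wide-character ncurses library is on
# its search path. find_library returns a soname/path string or None.
found = ctypes.util.find_library('ncursesw')
print('ncursesw:', found if found else 'not found on the loader search path')
```

If it isn't found, installing ncurses into the env (e.g. `conda install -c conda-forge ncurses` into conda_env_2) is a common way to supply libncursesw.so.6 without touching system packages.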

Cheers!

walidabualafia commented 1 year ago

Eric,

Tried running it on our RHEL8 partition, and it works like a charm! Thanks for all your help.

[Thu Jul 13 13:56:41 2023]
Finished job 0.
24 of 24 steps (100%) done
Complete log: .snakemake/log/2023-07-13T134103.522138.snakemake.log
workflow success