Xinglab / IRIS

IRIS: Isoform peptides from RNA splicing for Immunotherapy target Screening

screening error when using individual mode #22

Open infWang opened 7 months ago

infWang commented 7 months ago

Thank you very much for your patient and detailed explanations earlier. Recently, when I was using IRIS in individual mode with a single sample, the following error occurred:

/IRIS/IRIS/conda_wrapper /IRIS/IRIS/conda_env_2 IRIS screen --parameter-fin results/docker_test/screen.para --splicing-event-type SE --outdir results/docker_test/screen --translating --gtf references/gencode.v26lift37.annotation.gtf 1> results/docker_test/iris_screen_log.out 2> results/docker_test/iris_screen_log.err
[Mon Mar 11 03:37:12 2024]
Error in rule iris_screen:
    jobid: 7
    output: results/docker_test/screen/docker_test.SE.test.all_guided.txt, results/docker_test/screen/docker_test.SE.test.all_voted.txt, results/docker_test/screen/docker_test.SE.notest.txt, results/docker_test/screen/docker_test.SE.tier1.txt, results/docker_test/screen/docker_test.SE.tier2tier3.txt
    log: results/docker_test/iris_screen_log.out, results/docker_test/iris_screen_log.err (check log file(s) for error message)
    shell:
        /IRIS/IRIS/conda_wrapper /IRIS/IRIS/conda_env_2 IRIS screen --parameter-fin results/docker_test/screen.para --splicing-event-type SE --outdir results/docker_test/screen --translating --gtf references/gencode.v26lift37.annotation.gtf 1> results/docker_test/iris_screen_log.out 2> results/docker_test/iris_screen_log.err
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

And here is the log:

[Ended] no test performed because no testable events. Check input or filtering parameteres.

Additionally, here is my "snakemake_config.yaml" file:

# Resource allocation
create_star_index_threads: 200
create_star_index_mem_gb: 140
create_star_index_time_hr: 12
iris_append_sjc_mem_gb: 180
iris_append_sjc_time_hr: 24
# TODO 16 threads hardcoded in iris process_rnaseq
iris_cuff_task_threads: 200
iris_cuff_task_mem_gb: 180
iris_cuff_task_time_hr: 12
iris_epitope_post_mem_gb: 180
iris_epitope_post_time_hr: 12
iris_exp_matrix_mem_gb: 180
iris_exp_matrix_time_hr: 12
iris_extract_sjc_task_mem_gb: 180
iris_extract_sjc_task_time_hr: 12
iris_format_mem_gb: 180
iris_format_time_hr: 12
# TODO seq2HLA defaults to 6 threads since IRIS does not supply the -p argument
iris_hla_task_threads: 200
iris_hla_task_mem_gb: 180
iris_hla_task_time_hr: 12
iris_parse_hla_mem_gb: 180
iris_parse_hla_time_hr: 12
iris_predict_mem_gb: 180
iris_predict_time_hr: 12
iris_predict_task_mem_gb: 180
iris_predict_task_time_hr: 12
# TODO 8 hardcoded in makesubsh_rmats
iris_rmats_task_threads: 200
iris_rmats_task_mem_gb: 180
iris_rmats_task_time_hr: 12
# TODO 8 hardcoded in makesubsh_rmatspost
iris_rmatspost_task_threads: 200
iris_rmatspost_task_mem_gb: 180
iris_rmatspost_task_time_hr: 12
iris_screen_mem_gb: 180
iris_screen_time_hr: 12
iris_screen_sjc_mem_gb: 180
iris_screen_sjc_time_hr: 12
iris_sjc_matrix_mem_gb: 180
iris_sjc_matrix_time_hr: 12
# TODO 6 threads hardcoded in iris process_rnaseq
iris_star_task_threads: 200
iris_star_task_mem_gb: 200
iris_star_task_time_hr: 12
iris_visual_summary_mem_gb: 180
iris_visual_summary_time_hr: 12
# Command options
run_core_modules: false
# run_all_modules toggles which rules can be run by
# conditionally adding UNSATISFIABLE_INPUT to certain rules.
run_all_modules: true
should_run_sjc_steps: true
star_sjdb_overhang: 100
run_name: 'docker_test'  # used to name output files
splice_event_type: 'SE'  # one of [SE, RI, A3SS, A5SS]
comparison_mode: 'individual'  # group or individual
stat_test_type: 'parametric'  # parametric or nonparametric
use_ratio: false
tissue_matched_normal_psi_p_value_cutoff: ''
tissue_matched_normal_sjc_p_value_cutoff: ''
tissue_matched_normal_delta_psi_p_value_cutoff: ''
tissue_matched_normal_fold_change_cutoff: ''
tissue_matched_normal_group_count_cutoff: ''
tissue_matched_normal_reference_group_names: ''
tumor_psi_p_value_cutoff: ''
tumor_sjc_p_value_cutoff: ''
tumor_delta_psi_p_value_cutoff: ''
tumor_fold_change_cutoff: ''
tumor_group_count_cutoff: ''
tumor_reference_group_names: ''
normal_psi_p_value_cutoff: '0.01'
normal_sjc_p_value_cutoff: '0.000001'
normal_delta_psi_p_value_cutoff: '0.05'
normal_fold_change_cutoff: '1'
normal_group_count_cutoff: '8'
normal_reference_group_names: 'GTEx_Heart,GTEx_Blood,GTEx_Lung,GTEx_Liver,GTEx_Brain,GTEx_Nerve,GTEx_Muscle,GTEx_Spleen,GTEx_Thyroid,GTEx_Skin,GTEx_Kidney'
# Input files
# sample_fastqs are not needed when just running the core modules
sample_fastqs:
    DN2222153:
     - '/IRIS/inputs/T001332989/SD221201094FTT_01_R1.fq'
     - '/IRIS/inputs/T001332989/SD221201094FTT_01_R2.fq'
#   sample_name_2:
#     - '/path/to/sample_2_read_1.fq'
#     - '/path/to/sample_2_read_2.fq'

blocklist: ''
####---------------------------------- Do not need to change the following arguments ----------------------------------####
mapability_bigwig: '/IRIS/IRIS_data/resources/mappability/wgEncodeCrgMapabilityAlign24mer.bigWig'
# mhc_list: '/path/to/example/hla_types_test.list'
# mhc_by_sample: '/path/to/example/hla_patient_test.tsv'
gene_exp_matrix: ''
#splice_matrix_txt: '/path/to/example/splicing_matrix/splicing_matrix.SE.cov10.NEPC_example.txt'
#splice_matrix_idx: '/path/to/example/splicing_matrix/splicing_matrix.SE.cov10.NEPC_example.txt.idx'
#sjc_count_txt: '/path/to/example/sjc_matrix/SJ_count.NEPC_example.txt'
#sjc_count_idx: '/path/to/example/sjc_matrix/SJ_count.NEPC_example.txt.idx'
# Reference files
gtf_name: 'gencode.v26lift37.annotation.gtf'
fasta_name: 'ucsc.hg19.fasta'
reference_files:
  gencode.v26lift37.annotation.gtf.gz:
    url: 'ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_26/GRCh37_mapping/gencode.v26lift37.annotation.gtf.gz'
  ucsc.hg19.fasta.gz:
    url: 'http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz'
# Additional configuration
rmats_path: '/IRIS/IRIS/conda_env_2/bin/rmats.py'  # should be written by ./install
conda_wrapper: '/IRIS/IRIS/conda_wrapper'  # should be written by ./install
conda_env_2: '/IRIS/IRIS/conda_env_2'  # should be written by ./install
conda_env_3: '/IRIS/IRIS/conda_env_3'  # should be written by ./install
iris_data: '/IRIS/IRIS_data'  # should be written by ./install
iedb_path: '/IRIS/IRIS/IEDB/mhc_i/src'  # should be written by ./install
rmats_path: '/IRIS/IRIS/conda_env_2/bin/rmats.py'

I would greatly appreciate it if you could provide any suggestions. If possible, I would also like a "snakemake_config.yaml" file template for the individual mode.

Looking forward to your reply. Thanks again.

EricKutschera commented 7 months ago

The check in the screening step is for 'group' or 'personalized', but the script that writes the parameter file uses 'group' or 'individual', which matches the comment in the snakemake config. The code and the config should be updated to consistently use one of 'personalized' or 'individual': https://github.com/Xinglab/IRIS/blob/v2.0.1/IRIS/IRIS_screening.py#L370 https://github.com/Xinglab/IRIS/blob/v2.0.1/scripts/write_param_file.py#L54
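
As a minimal sketch of what a consistent naming scheme could look like, the snippet below maps both spellings onto one canonical value before any mode check is made. The names `CANONICAL_MODES` and `normalize_mode` are hypothetical and are not part of IRIS, and the choice of 'personalized' as the canonical spelling is only an assumption for illustration.

```python
# Hypothetical helper, NOT part of IRIS: accept either spelling of the
# single-sample mode and return one canonical name.
CANONICAL_MODES = {
    "group": "group",
    "individual": "personalized",    # spelling used by the config comment
    "personalized": "personalized",  # spelling checked in the screening step
}


def normalize_mode(raw_mode):
    """Return the canonical mode name, or raise ValueError if unknown."""
    key = raw_mode.strip().lower()
    if key not in CANONICAL_MODES:
        allowed = ", ".join(sorted(CANONICAL_MODES))
        raise ValueError(
            "unknown comparison mode {!r}; expected one of: {}".format(raw_mode, allowed)
        )
    return CANONICAL_MODES[key]


print(normalize_mode("individual"))
print(normalize_mode("personalized"))
```

With a mapping like this, an unrecognized mode fails loudly instead of silently skipping the test section.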

I manually edited screen.para to use 'personalized' and was able to get the error message. Here's the line for that log message: https://github.com/Xinglab/IRIS/blob/master/IRIS/IRIS_screening.py#L494

It's checking for output in a particular file, but which file it checks seems to depend on whether any tissue_matched_normal groups are configured. A few lines earlier there is an individual_test section with a TODO comment, and that section writes to the file that is checked when tissue_matched_normal groups are configured: https://github.com/Xinglab/IRIS/blob/master/IRIS/IRIS_screening.py#L472

You could try running with a value in tissue_matched_normal_reference_group_names and with the other tissue_matched_normal config values filled in.
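
Before rerunning, it may help to confirm that none of the tissue_matched_normal settings are still blank. The sketch below assumes the config has already been loaded into a plain dict (for example with a YAML parser). `empty_keys_with_prefix` is a hypothetical helper, and the keys and empty values in `posted_config` are copied from the config posted above.

```python
# Hypothetical helper, NOT part of IRIS: list config keys under a prefix
# whose values are missing or blank.
def empty_keys_with_prefix(config, prefix):
    """Return sorted keys starting with `prefix` whose value is None or blank."""
    return sorted(
        key
        for key, value in config.items()
        if key.startswith(prefix) and (value is None or str(value).strip() == "")
    )


# Subset of the posted snakemake_config.yaml, as a plain dict.
posted_config = {
    "comparison_mode": "individual",
    "tissue_matched_normal_psi_p_value_cutoff": "",
    "tissue_matched_normal_sjc_p_value_cutoff": "",
    "tissue_matched_normal_delta_psi_p_value_cutoff": "",
    "tissue_matched_normal_fold_change_cutoff": "",
    "tissue_matched_normal_group_count_cutoff": "",
    "tissue_matched_normal_reference_group_names": "",
    "normal_psi_p_value_cutoff": "0.01",
}

for key in empty_keys_with_prefix(posted_config, "tissue_matched_normal_"):
    print("still empty:", key)
```

Any key that is printed still needs a value before the tissue-matched normal comparison can be configured.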

infWang commented 7 months ago

I filled in one value for tissue_matched_normal_reference_group_names along with the other tissue_matched_normal configuration values, but then encountered the following output:

rule iris_predict:
    input: results/docker_test/screen.para, results/docker_test/screen/docker_test.SE.test.all_guided.txt, results/docker_test/hla_typing/hla_types.list, results/docker_test/exp_matrix/exp.merged_matrix.docker_test.txt
    output: results/docker_test/screen/docker_test.SE.tier1.txt.ExtraCellularAS.txt, results/docker_test/screen/docker_test.SE.tier2tier3.txt.ExtraCellularAS.txt
    log: results/docker_test/iris_predict_log.out, results/docker_test/iris_predict_log.err
    jobid: 3
    reason: Missing output files: results/docker_test/screen/docker_test.SE.tier1.txt.ExtraCellularAS.txt, results/docker_test/screen/docker_test.SE.tier2tier3.txt.ExtraCellularAS.txt; Input files updated by another job: results/docker_test/screen/docker_test.SE.test.all_guided.txt
    resources: tmpdir=/tmp, mem_mb=184320, time_hours=12

if [[ -n "$(ls results/docker_test/predict_tasks/pep2epitope_SE.tier*.*.sh)" ]]; then rm results/docker_test/predict_tasks/pep2epitope_SE.tier*.*.sh; fi; /IRIS/IRIS/conda_wrapper /IRIS/IRIS/conda_env_2 IRIS predict results/docker_test/screen --task-dir results/docker_test/predict_tasks --parameter-fin results/docker_test/screen.para --splicing-event-type SE --iedb-local /IRIS/IRIS/IEDB/mhc_i/src --mhc-list results/docker_test/hla_typing/hla_types.list  --gene-exp-matrix results/docker_test/exp_matrix/exp.merged_matrix.docker_test.txt 1> results/docker_test/iris_predict_log.out 2> results/docker_test/iris_predict_log.err
ls: cannot access 'results/docker_test/predict_tasks/pep2epitope_SE.tier*.*.sh': No such file or directory
Waiting at most 1800 seconds for missing files.
^CTerminating processes on user request, this might take some time.
workflow error
Complete log: .snakemake/log/2024-03-29T011058.832286.snakemake.log
(base) [root@0ce034d40801 IRIS]# cat results/docker_test/iris_predict_log.err
(base) [root@0ce034d40801 IRIS]# cat results/docker_test/iris_predict_log.out
[INFO] Total extracellular annotation loaded: 54990
[INFO] No tier1 comparisons (tissue-matched normal) found. Use tier2&tier3 only mode. 
[INFO] Total HLA types loaded: 6 . Total peptide splice junctions loaded: 0
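
The last log line reports zero peptide splice junctions loaded, which may be related to no prediction task scripts being generated. A quick way to surface such empty inputs is to extract every "Total ... loaded: N" count from the log and flag the zeros. This is only a sketch: `loaded_counts` is a hypothetical helper, and the pattern is inferred from the three log lines shown above rather than from any documented IRIS log format.

```python
import re

# Pattern inferred from the log lines above; an assumption, not a documented format.
COUNT_PATTERN = re.compile(r"Total ([A-Za-z ]+?) loaded: (\d+)")


def loaded_counts(log_text):
    """Map each 'Total <name> loaded: <n>' entry in the log to its integer count."""
    return {name: int(count) for name, count in COUNT_PATTERN.findall(log_text)}


# Contents of iris_predict_log.out as posted above.
predict_log = """\
[INFO] Total extracellular annotation loaded: 54990
[INFO] No tier1 comparisons (tissue-matched normal) found. Use tier2&tier3 only mode.
[INFO] Total HLA types loaded: 6 . Total peptide splice junctions loaded: 0
"""

counts = loaded_counts(predict_log)
empty_inputs = [name for name, count in counts.items() if count == 0]
print("inputs with zero records:", empty_inputs)
```

In practice the log text would be read from results/docker_test/iris_predict_log.out instead of being pasted inline.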