ylab-hi / ScanNeo2

Snakemake-based computational workflow for neoantigen prediction from diverse sources
MIT License

Run ScanNeo2 without DNAseq #39

Closed: quirze closed this issue 1 month ago

quirze commented 1 month ago

Hello Richard,

Thanks a lot for creating this automated pipeline. I would like to try it with the aim of predicting neoantigens using solely RNAseq data. As I understood from the paper and the documentation, this should be possible.

Following the installation and running steps, I set up a conda environment and modified the config.yaml file to include my samples:

# General settings
reference:
  release: 111
  nonchr: false
threads: 30
mapq: 30  # overall required mapping quality
basequal: 20  # overall required base quality 

data:
  name: XXX
  dnaseq:
    dna_tumor:
  rnaseq:
    XXX_Rep1: /path/to/file/XXX_Rep1.R1.fq.gz /path/to/file/XXX_Rep1.R2.fq.gz
    XXX_Rep2: /path/to/file/XXX_Rep2.R1.fq.gz /path/to/file/XXX_Rep2.R2.fq.gz
    XXX_Rep3: /path/to/file/XXX_Rep3.R1.fq.gz /path/to/file/XXX_Rep3.R2.fq.gz
    XXX_Rep4: /path/to/file/XXX_Rep4.R1.fq.gz /path/to/file/XXX_Rep4.R2.fq.gz
  normal:

However, leaving dnaseq: empty gives me an error when trying to call a dry-run with snakemake:

(scanneo2) quirze@login6:$ snakemake -np --cores 64 --local-cores 64 --software-deployment-method conda --rerun-incomplete --executor slurm --jobs 30
Traceback (most recent call last):
  File "/quirze/miniforge3/envs/scanneo2/lib/python3.12/site-packages/snakemake/cli.py", line 1898, in args_to_api
    dag_api = workflow_api.dag(
              ^^^^^^^^^^^^^^^^^
  File "/quirze/miniforge3/envs/scanneo2/lib/python3.12/site-packages/snakemake/api.py", line 326, in dag
    return DAGApi(
           ^^^^^^^
  File "<string>", line 6, in __init__
  File "/quirze/miniforge3/envs/scanneo2/lib/python3.12/site-packages/snakemake/api.py", line 436, in __post_init__
    self.workflow_api._workflow.dag_settings = self.dag_settings
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/quirze/miniforge3/envs/scanneo2/lib/python3.12/site-packages/snakemake/api.py", line 383, in _workflow
    workflow.include(
  File "/quirze/miniforge3/envs/scanneo2/lib/python3.12/site-packages/snakemake/workflow.py", line 1382, in include
    exec(compile(code, snakefile.get_path_or_uri(), "exec"), self.globals)
  File "/quirze/Projects/XXX/workflow/Snakefile", line 30, in <module>
  File "/quirze/miniforge3/envs/scanneo2/lib/python3.12/site-packages/snakemake/workflow.py", line 1382, in include
    exec(compile(code, snakefile.get_path_or_uri(), "exec"), self.globals)
  File "/quirze/Projects/XXX/workflow/rules/common.smk", line 126, in <module>
    config['data'] = data_structure(config['data'])
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/quirze/Projects/XXX/workflow/rules/common.smk", line 8, in data_structure
    config['data']['dnaseq'], filetype, readtype  = handle_seqfiles(config['data']['dnaseq'])
                                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/quirze/Projects/XXX/workflow/rules/common.smk", line 64, in handle_seqfiles
    return mod_seqdata, filetype[0], readtype[0]
                         ^^^^^^^^^^^^^^
IndexError: list index out of range

I've tried removing dnaseq: and dna_tumor:, but then it raises KeyError: 'dnaseq'.
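
For context, both failures can be reproduced outside Snakemake. The sketch below is not ScanNeo2's code: `collect_types` is a hypothetical stand-in for the logic in `handle_seqfiles`, assuming it gathers one file type per non-empty entry and then indexes the first element. It relies on the fact that a YAML key with no value (such as `dna_tumor:`) is loaded as `None`.

```python
# Hypothetical stand-in for the failing logic; NOT the actual ScanNeo2 code.
def collect_types(section):
    """Return the file extension of the first non-empty entry in a section."""
    filetypes = []
    for name, paths in section.items():
        if paths is None:  # `dna_tumor:` with no value is parsed as None
            continue
        first_path = paths.split()[0]
        filetypes.append(first_path.rsplit(".", 1)[-1])
    return filetypes[0]  # fails when every entry was empty


# Configs as a YAML parser would load them: an empty value becomes None.
config_with_empty_key = {"data": {"dnaseq": {"dna_tumor": None}}}
config_without_dnaseq = {"data": {"rnaseq": {"XXX_Rep1": "a.R1.fq.gz a.R2.fq.gz"}}}


def try_config(config):
    """Return the detected file type, or the name of the exception raised."""
    try:
        return collect_types(config["data"]["dnaseq"])
    except (IndexError, KeyError) as err:
        return type(err).__name__


print(try_config(config_with_empty_key))  # empty list is indexed
print(try_config(config_without_dnaseq))  # 'dnaseq' key is missing
```

The first call fails because an empty list is indexed, the second because the `dnaseq` key is looked up unconditionally.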

I would appreciate it if you could explain how you would recommend running ScanNeo2 without DNAseq.

Thanks in advance for your help. Quirze

riasc commented 1 month ago

Hello, thanks for using the tool. Could you please remove the key within dnaseq (the empty dna_tumor: entry) but keep dnaseq: itself? This should resolve the error. Thanks for reporting this. I will add some tests to capture this case. The config would then look like this:

# General settings
reference:
  release: 111
  nonchr: false
threads: 30
mapq: 30  # overall required mapping quality
basequal: 20  # overall required base quality 

data:
  name: XXX
  dnaseq:
  rnaseq:
    XXX_Rep1: /path/to/file/XXX_Rep1.R1.fq.gz /path/to/file/XXX_Rep1.R2.fq.gz
    XXX_Rep2: /path/to/file/XXX_Rep2.R1.fq.gz /path/to/file/XXX_Rep2.R2.fq.gz
    XXX_Rep3: /path/to/file/XXX_Rep3.R1.fq.gz /path/to/file/XXX_Rep3.R2.fq.gz
    XXX_Rep4: /path/to/file/XXX_Rep4.R1.fq.gz /path/to/file/XXX_Rep4.R2.fq.gz
  normal:
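
As an illustration of why keeping a bare `dnaseq:` works while an empty `dna_tumor:` under it did not, here is a minimal sketch of a defensive normalizer. It is an assumption about how such input could be handled, not the fix applied in ScanNeo2; `normalize_section` is a hypothetical helper name. It treats a missing key, a key with no value, and entries with no value all as "no data of this type".

```python
def normalize_section(data, key):
    """Return {sample: [paths]} for data[key], or {} if it is absent or empty."""
    section = data.get(key) or {}  # covers a missing key and `key:` with no value
    # Drop entries such as `dna_tumor:` whose value was loaded as None.
    return {name: paths.split() for name, paths in section.items() if paths}


# The `data` block from the config above, as a YAML parser would load it
# (only one replicate shown for brevity).
data = {
    "name": "XXX",
    "dnaseq": None,
    "rnaseq": {
        "XXX_Rep1": "/path/to/file/XXX_Rep1.R1.fq.gz /path/to/file/XXX_Rep1.R2.fq.gz",
    },
    "normal": None,
}

dnaseq = normalize_section(data, "dnaseq")
rnaseq = normalize_section(data, "rnaseq")

print(dnaseq)  # no DNA samples, so DNA-based steps could be skipped
print(rnaseq)
```

With a helper like this, downstream code can check `if dnaseq:` before indexing anything, so an RNAseq-only run does not depend on the exact shape of the empty section.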


quirze commented 1 month ago

Hello. Sorry for the delay, and thanks for your response. I did as suggested and, after solving a few other errors, I was able to run the workflow successfully. Thanks again for creating such a thorough resource. Best, Quirze

riasc commented 1 month ago

Thanks. Let me know if you have additional questions.