databio / pepatac

A modular, containerized pipeline for ATAC-seq data processing
http://pepatac.databio.org
BSD 2-Clause "Simplified" License

runp fails when running the tutorial #277

Closed DavideBrex closed 6 months ago

DavideBrex commented 6 months ago

Hi,

I am using looper v1.7.0 and I am trying to run the extended tutorial of pepatac. The sample level step works fine. However, I get an error when I run: looper runp --looper-config examples/tutorial/.looper_tutorial_refgenie.yaml --package singularity

I get this error:

Detecting duplicate sample names ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00

Traceback (most recent call last):
  File "/shares/CIBIO-Storage/scratch_CIBIO/sharedLC/DavideBressan/ATAC-seq/pepatac_tutorial//tools/pepatac/pipelines/pepatac_collator.py", line 172, in <module>
    sys.exit(main())
  File "/shares/CIBIO-Storage/scratch_CIBIO/sharedLC/DavideBressan/ATAC-seq/pepatac_tutorial//tools/pepatac/pipelines/pepatac_collator.py", line 99, in main
    yaml_dict['PEPATAC']['sample'][sample_name] = yaml_tmp['PEPATAC']['sample'][sample_name]
KeyError: 'tutorial2'

### Pipeline failed at:  (03-19 08:46:57) elapsed: 0.0 _TIME_

Total time: 0:00:00
Failure reason: Pipeline failure. See details above.
Exception ignored in atexit callback: <bound method PipelineManager._exit_handler of <pypiper.manager.PipelineManager object at 0x7ff0448d79d0>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pypiper/manager.py", line 1799, in _exit_handler
    self.fail_pipeline(Exception("Pipeline failure. See details above."))
  File "/usr/local/lib/python3.10/dist-packages/pypiper/manager.py", line 1660, in fail_pipeline
    raise exc
Exception: Pipeline failure. See details above.

Looper finished
Jobs submitted: 1
{'Jobs submitted': 1}

Here is the full log file: PEPATAC_log.md. Can you help me understand what is happening? Thank you.

Davide

DavideBrex commented 6 months ago

I noticed that the stats.yaml files for the two samples (tutorial1 and tutorial2) are empty:

PEPATAC:
  project: {}
  sample: {}

Could this be related to the error?

donaldcampbelljr commented 6 months ago

Yes, there must be sample-level data available in their respective stats.yaml files for the project-level pipeline to run properly. The attached log file shows that it cannot find the sample, tutorial2:

  File "/shares/CIBIO-Storage/scratch_CIBIO/sharedLC/DavideBressan/ATAC-seq/pepatac_tutorial//tools/pepatac/pipelines/pepatac_collator.py", line 99, in main
    yaml_dict['PEPATAC']['sample'][sample_name] = yaml_tmp['PEPATAC']['sample'][sample_name]
KeyError: 'tutorial2'
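
To illustrate the failure mode, here is a minimal sketch, not the actual pepatac_collator.py code. It assumes each sample's stats.yaml has already been parsed into a dictionary shaped like the empty file shown earlier in this thread. The function name collect_sample_stats and its return values are hypothetical. An unguarded lookup on an empty sample section raises the same KeyError, while a guarded lookup can report which samples have no results:

```python
# Parsed contents of a stats.yaml whose sample section is empty,
# mirroring the file reported earlier in this thread.
EMPTY_STATS = {"PEPATAC": {"project": {}, "sample": {}}}


def collect_sample_stats(stats_by_sample):
    """Merge per-sample stats and list samples with no recorded results.

    stats_by_sample maps a sample name to the parsed content of that
    sample's stats.yaml (hypothetical input format for this sketch).
    """
    merged = {"PEPATAC": {"project": {}, "sample": {}}}
    missing = []
    for sample_name, stats in stats_by_sample.items():
        sample_section = stats.get("PEPATAC", {}).get("sample", {})
        if sample_name not in sample_section:
            # An unguarded sample_section[sample_name] would raise KeyError here.
            missing.append(sample_name)
            continue
        merged["PEPATAC"]["sample"][sample_name] = sample_section[sample_name]
    return merged, missing


merged, missing = collect_sample_stats(
    {"tutorial1": EMPTY_STATS, "tutorial2": EMPTY_STATS}
)
print(missing)
```

A non-empty list of missing samples points back at the sample-level run rather than at the project-level step.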

nsheff commented 6 months ago

So: are you sure the sample-level pipelines are completing successfully? Can you attach those logs here? The sample-level pipelines are what should populate those stats.yaml files with results.

DavideBrex commented 6 months ago

Hi, thank you for the quick reply!

Yes, sample-level processing seems fine. Here are the log files: PEPATAC_tutorial1.log PEPATAC_tutorial2.log

donaldcampbelljr commented 6 months ago

I don't see pipestat instantiation in your output log (which is what would report results to the stats.yaml file). Could you confirm that you are running the newest version of PEPATAC (v0.11.2) and that you have updated its associated requirements?

DavideBrex commented 6 months ago

The PEPATAC version I downloaded is v0.11.2. I am running it with Singularity, so I assumed I did not need to install the requirements myself, since everything would already be inside the container. I added the singularity compute package to compute_config.yaml:

adapters:
  CODE: looper.command
  JOBNAME: looper.job_name
  CORES: compute.cores
  LOGFILE: looper.log_file
  TIME: compute.time
  MEM: compute.mem

compute_packages:
  default:
    submission_template: templates/localhost_template.sub
    submission_command: sh
  singularity:
    submission_template: templates/localhost_singularity_template.sub
    submission_command: sh
    singularity_args: -B /shares,/home

Then I add --package singularity when I run the looper commands. Am I doing this wrong? Thank you.

DavideBrex commented 6 months ago

Hi, I switched to Conda, and now the stats.yaml file is actually filled with values in the looper run step.

However, I get this error when I run looper runp examples/test_project/test_config.yaml

Using default schema: /shares/CIBIO-Storage/scratch_CIBIO/sharedLC/DavideBressan/ATAC-seq/pepatac_tutorial/tools/pepatac/pipelines/pipestat_output_schema.yaml

Traceback (most recent call last):
  File "/shares/CIBIO-Storage/scratch_CIBIO/sharedLC/DavideBressan/ATAC-seq/pepatac_tutorial/tools/pepatac/pipelines/pepatac_collator.py", line 177, in <module>
    sys.exit(main())
  File "/shares/CIBIO-Storage/scratch_CIBIO/sharedLC/DavideBressan/ATAC-seq/pepatac_tutorial/tools/pepatac/pipelines/pepatac_collator.py", line 82, in main
    project = peppy.Project(args.config_file)
  File "/home/davide.bressan-1/mambaforge/envs/pepatac/lib/python3.9/site-packages/peppy/project.py", line 120, in __init__
    is_cfg = is_cfg_or_anno(cfg)
  File "/home/davide.bressan-1/mambaforge/envs/pepatac/lib/python3.9/site-packages/peppy/utils.py", line 177, in is_cfg_or_anno
    raise ValueError(
ValueError: File path 'None' does not point to an annotation or config. Accepted extensions: {'config': ('.yaml', '.yml'), 'annotation': ('.csv', '.tsv')}

### Pipeline failed at:  (03-20 13:10:31) elapsed: 0.0 _TIME_

So I created a looper config file, and I changed the command to: looper runp --looper-config examples/test_project/.looper_test_refgenie.yaml

And now it works. I think this should be changed in the documentation (if it is not a problem only on my machine). Thank you for your support.
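
For anyone who hits the same ValueError: the traceback shows the project config path reaching peppy.Project as None. The sketch below is a hypothetical pre-check, not peppy's own implementation. It fails early with a clearer message when no path is supplied. The accepted extensions are copied from the error message above, and the function name classify_pep_path is an assumption made for illustration.

```python
import os

# Extensions copied from the ValueError message quoted in this thread.
ACCEPTED_EXTENSIONS = {
    "config": (".yaml", ".yml"),
    "annotation": (".csv", ".tsv"),
}


def classify_pep_path(path):
    """Return 'config' or 'annotation' for a usable path, else raise ValueError."""
    if path is None:
        # This is the situation seen in the traceback: no config path was passed on.
        raise ValueError("No project config path was supplied (got None).")
    extension = os.path.splitext(str(path))[1].lower()
    for kind, extensions in ACCEPTED_EXTENSIONS.items():
        if extension in extensions:
            return kind
    raise ValueError(
        f"File path '{path}' does not point to an annotation or config."
    )


print(classify_pep_path("examples/test_project/test_config.yaml"))
```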

Davide