databio / pepatac

A modular, containerized pipeline for ATAC-seq data processing
http://pepatac.databio.org
BSD 2-Clause "Simplified" License

Sample-specific yaml file is not generated for the first sample if the submission folder doesn't exist #178

Closed by kwcurrin 3 years ago

kwcurrin commented 3 years ago

Hello,

This is a minor issue, but when doing a dry run with looper, the .yaml file for the first sample is not written if the "submission" folder doesn't already exist. I received the following message:

Could not write sample data to: ./submission/.yaml. Directory does not exist

However, the .sub file for that sample is successfully written to the submission folder and all .yaml and .sub files for the subsequent samples are successfully written. If I make an empty "submission" folder before running looper, the .yaml file for the first sample is successfully written. So it appears that the command that writes .yaml files will not create the "submission" folder if it does not exist, but the command that writes the .sub files will create the folder.

Thanks,

Kevin
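
The behavior described above (the .sub writer creates the "submission" folder, the .yaml writer does not) can be illustrated with a minimal sketch of a writer that creates its parent directory before opening the file. This is not looper's actual code: the function name `write_yaml_with_parent` and the flat `key: value` serialization are assumptions made purely for illustration.

```python
import os
import tempfile


def write_yaml_with_parent(sample: dict, path: str) -> str:
    """Write a flat dict as 'key: value' lines, creating the parent folder first.

    Hypothetical helper for illustration only; not part of looper.
    """
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes this safe whether or not the folder already exists
        os.makedirs(parent, exist_ok=True)
    with open(path, "w") as handle:
        for key, value in sample.items():
            handle.write(f"{key}: {value}\n")
    return path


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    # The "submission" folder does not exist yet at this point
    target = os.path.join(root, "submission", "sample1_sample.yaml")
    write_yaml_with_parent({"sample_name": "sample1", "attr1": "val1"}, target)
    print(os.path.exists(target))
```

Without the `os.makedirs` call, `open(path, "w")` raises `FileNotFoundError` when the folder is missing, which matches the symptom that only the first sample fails: once another step has created the folder, later writes succeed.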

nsheff commented 3 years ago

@stolarczyk This could be a looper issue with the new sample yaml writing framework, can you look into it?

stolarczyk commented 3 years ago

@kwcurrin can you upload/paste the PEP you used so I can try to reproduce this issue?

stolarczyk commented 3 years ago

I wasn't able to reproduce this with a simple PEP + pipeline interface. Are you using the latest version of looper?


pre run

[mstolarczyk@MichalsMBP test](): ll           
total 24
drwxr-xr-x@ 5 mstolarczyk  staff   160B May  4 09:10 .
drwxr-xr-x@ 8 mstolarczyk  staff   256B Apr 28 16:14 ..
-rw-r--r--@ 1 mstolarczyk  staff   187B May  4 09:10 cfg.yml
-rw-r--r--@ 1 mstolarczyk  staff   218B May  4 09:10 sample_piface.yml
-rw-r--r--@ 1 mstolarczyk  staff    57B May  4 09:10 sample_table.csv

post run

[mstolarczyk@MichalsMBP test](): ll outputs/submission         
total 48
drwxr-xr-x  8 mstolarczyk  staff   256B May  4 09:10 .
drwxr-xr-x  3 mstolarczyk  staff    96B May  4 09:10 ..
-rw-r--r--  1 mstolarczyk  staff    74B May  4 09:10 sample1_sample.yaml
-rw-r--r--  1 mstolarczyk  staff    74B May  4 09:10 sample2_sample.yaml
-rw-r--r--  1 mstolarczyk  staff    74B May  4 09:10 sample3_sample.yaml
-rw-r--r--  1 mstolarczyk  staff   297B May  4 09:10 test_pipeline_sample1.sub
-rw-r--r--  1 mstolarczyk  staff   297B May  4 09:10 test_pipeline_sample2.sub
-rw-r--r--  1 mstolarczyk  staff   297B May  4 09:10 test_pipeline_sample3.sub

log + configs

[mstolarczyk@MichalsMBP test](): looper run cfg.yml -p local -d
Looper version: 1.3.0
Command: run
Activating compute package 'local'
## [1 of 3] sample: sample1; pipeline: test_pipeline
Calling pre-submit function: looper.write_sample_yaml
Writing script to /Users/mstolarczyk/Desktop/testing/looper/test/outputs/submission/test_pipeline_sample1.sub
Job script (n=1; 0.00Gb): /Users/mstolarczyk/Desktop/testing/looper/test/outputs/submission/test_pipeline_sample1.sub
Dry run, not submitted
## [2 of 3] sample: sample2; pipeline: test_pipeline
Calling pre-submit function: looper.write_sample_yaml
Writing script to /Users/mstolarczyk/Desktop/testing/looper/test/outputs/submission/test_pipeline_sample2.sub
Job script (n=1; 0.00Gb): /Users/mstolarczyk/Desktop/testing/looper/test/outputs/submission/test_pipeline_sample2.sub
Dry run, not submitted
## [3 of 3] sample: sample3; pipeline: test_pipeline
Calling pre-submit function: looper.write_sample_yaml
Writing script to /Users/mstolarczyk/Desktop/testing/looper/test/outputs/submission/test_pipeline_sample3.sub
Job script (n=1; 0.00Gb): /Users/mstolarczyk/Desktop/testing/looper/test/outputs/submission/test_pipeline_sample3.sub
Dry run, not submitted

Looper finished
Samples valid for job generation: 3 of 3
Commands submitted: 3 of 3
Jobs submitted: 3
Dry run. No jobs were actually submitted.
[mstolarczyk@MichalsMBP test](): c cfg.yml                     
pep_version: 2.0.0
sample_table: sample_table.csv
sample_modifiers:
  append:
    pipeline_interfaces: [sample_piface.yml]
looper:
  output_dir: $HOME/Desktop/testing/looper/test/outputs
[mstolarczyk@MichalsMBP test](): c sample_table.csv            
sample_name,attr1
sample1,val1
sample2,val2
sample3,val3
[mstolarczyk@MichalsMBP test](): c sample_piface.yml           
pipeline_name: test_pipeline
pipeline_type: sample
pre_submit:
  python_functions:
    - looper.write_sample_yaml
command_template: >
  echo '\nsample name: {sample.sample_name}\toutput directory: {looper.output_dir}'

kwcurrin commented 3 years ago

Apologies for the delay. This does appear to be a looper issue. I have looper v1.3.0 installed, but the script that sets up the environment variables for this project on our cluster also changes the Python and looper versions, so I was unaware that looper had been switched to v1.2.0 for this project. The issue occurs only on a dry run with looper v1.2.0. It does not occur on a regular run with v1.2.0, or on either a dry run or a regular run with v1.3.0. So it looks like looper fixed this in its most recent update. I will close this issue.
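
Since the root cause here was an environment script silently switching the active looper version, a quick check of what the current Python environment actually resolves can save time. The following is a minimal sketch using the standard library (Python 3.8+); the helper name `installed_version` is hypothetical, and it assumes the package is installed under the distribution name `looper`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it reports whatever the currently
    active Python environment sees, which may differ between shells.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumes the distribution is named "looper" in this environment
    print("looper:", installed_version("looper"))
```

Running this after sourcing the project's environment setup script shows which version the project will really use, rather than the one installed in the default environment.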