databio / pepatac

A modular, containerized pipeline for ATAC-seq data processing
http://pepatac.databio.org
BSD 2-Clause "Simplified" License

Difficulty integrating with refgenie or manually downloaded assets #207

Open gbloeb opened 3 years ago

gbloeb commented 3 years ago

I'm trying to run PEPATAC natively with refgenie and the hg38 assets installed, and I get this error:

 ~/pepatac/pipelines/pepatac.py \
> -G hg38 \
> -I /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk/fastq/HRCE-1-M-57Y-A_S1_L001_R1_001.fastq.gz \
> -I2 /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk/fastq/HRCE-1-M-57Y-A_S1_L001_R2_001.fastq.gz \
> -S HRCE1_L1_onlyNewinstall \
> -O /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk \
> -Q Paired 

pepatac.py: error: the following arguments are required: --genome-index, --chrom-sizes

Confirming that refgenie is behaving as expected:

refgenie list
                                  Local refgenie assets                                  
                   Server subscriptions: http://refgenomes.databio.org                   
┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ genome ┃ assets                                                                       ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hg38   │ fasta, bowtie2_index, refgene_anno, ensembl_gtf, ensembl_rb, feat_annotation │
│ rCRSd  │ fasta, bowtie2_index                                                         │
└────────┴──────────────────────────────────────────────────────────────────────────────┘
                  use refgenie list -g <genome> for more detailed view                  

When I manually add paths to the genome index and chrom sizes, I also get an error. The standard output is below, and the full log is attached.

~/pepatac/pipelines/pepatac.py \
> -G hg38 \
> -I /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk/fastq/HRCE-1-M-57Y-A_S1_L001_R1_001.fastq.gz \
> -I2 /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk/fastq/HRCE-1-M-57Y-A_S1_L001_R2_001.fastq.gz \
> -S HRCE1_L1_onlyNewinstall \
> -O /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk \
> -Q Paired \
> --genome-index /wynton/home/reiter/gloeb/refgenie/data/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4/bowtie2_index/ \
> --chrom-sizes /wynton/home/reiter/gloeb/refgenie/data/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4/hg38.chrom.sizes/ \
> 
### Pipeline run code and environment:

*              Command:  `/wynton/home/reiter/gloeb/pepatac/pipelines/pepatac.py -G hg38 -I /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk/fastq/HRCE-1-M-57Y-A_S1_L001_R1_001.fastq.gz -I2 /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk/fastq/HRCE-1-M-57Y-A_S1_L001_R2_001.fastq.gz -S HRCE1_L1_onlyNewinstall -O /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk -Q Paired --genome-index /wynton/home/reiter/gloeb/refgenie/data/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4/bowtie2_index/ --chrom-sizes /wynton/home/reiter/gloeb/refgenie/data/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4/hg38.chrom.sizes/`
*         Compute host:  dev1.wynton.ucsf.edu
*          Working dir:  /wynton/group/reiter/gabe/bulk_atac/211020_primaryTubule_bulk
*            Outfolder:  /wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk/HRCE1_L1_onlyNewinstall/
*  Pipeline started at:   (10-26 13:07:05) elapsed: 0.0 _TIME_

### Version log:

*       Python version:  3.7.4
*          Pypiper dir:  `/wynton/home/reiter/gloeb/.local/lib/python3.7/site-packages/pypiper`
*      Pypiper version:  0.12.1
*         Pipeline dir:  `/wynton/home/reiter/gloeb/pepatac/pipelines`
*     Pipeline version:  0.10.0
*        Pipeline hash:  06e7147fe61f043ded2a9039677c433fb23277fb
*      Pipeline branch:  * master
*        Pipeline date:  2021-09-28 16:03:05 -0400

### Arguments passed to pipeline:

*           `TSS_name`:  `None`
*            `aligner`:  `bowtie2`
*          `anno_name`:  `None`
*          `blacklist`:  `None`
*        `chrom_sizes`:  `/wynton/home/reiter/gloeb/refgenie/data/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4/hg38.chrom.sizes/`
*        `config_file`:  `pepatac.yaml`
*              `cores`:  `1`
*       `deduplicator`:  `samblaster`
*              `dirty`:  `False`
*             `extend`:  `250`
*       `force_follow`:  `False`
*     `frip_ref_peaks`:  `None`
*    `genome_assembly`:  `hg38`
*       `genome_index`:  `/wynton/home/reiter/gloeb/refgenie/data/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4/bowtie2_index/`
*        `genome_size`:  `2.7e9`
*              `input`:  `['/wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk/fastq/HRCE-1-M-57Y-A_S1_L001_R1_001.fastq.gz']`
*             `input2`:  `['/wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk/fastq/HRCE-1-M-57Y-A_S1_L001_R2_001.fastq.gz']`
*               `keep`:  `False`
*               `lite`:  `False`
*             `logdev`:  `False`
*                `mem`:  `4000`
*              `motif`:  `False`
*          `new_start`:  `False`
*            `no_fifo`:  `False`
*           `no_scale`:  `False`
*      `output_parent`:  `/wynton/home/reiter/gloeb/group/bulk_atac/211020_primaryTubule_bulk`
*         `paired_end`:  `True`
*        `peak_caller`:  `macs2`
*          `peak_type`:  `fixed`
* `prealignment_index`:  `[]`
* `prealignment_names`:  `[]`
*         `prioritize`:  `False`
*            `recover`:  `False`
*        `sample_name`:  `HRCE1_L1_onlyNewinstall`
*        `search_file`:  `None`
*             `silent`:  `False`
*   `single_or_paired`:  `Paired`
*             `skipqc`:  `False`
*                `sob`:  `False`
*           `testmode`:  `False`
*            `trimmer`:  `skewer`
*          `verbosity`:  `None`

----------------------------------------

Traceback (most recent call last):
  File "/wynton/home/reiter/gloeb/pepatac/pipelines/pepatac.py", line 2746, in <module>
    sys.exit(main())
  File "/wynton/home/reiter/gloeb/pepatac/pipelines/pepatac.py", line 671, in main
    message = "{}\t{}".format(asset, os.path.expandvars(res[asset]))
  File "/wynton/home/reiter/gloeb/miniconda3/lib/python3.7/posixpath.py", line 288, in expandvars
    path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType

### Pipeline failed at:  (10-26 13:07:05) elapsed: 0.0 _TIME_

Total time: 0:00:01
Failure reason: Pipeline failure. See details above.
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/wynton/home/reiter/gloeb/.local/lib/python3.7/site-packages/pypiper/manager.py", line 1799, in _exit_handler
    self.fail_pipeline(Exception("Pipeline failure. See details above."))
  File "/wynton/home/reiter/gloeb/.local/lib/python3.7/site-packages/pypiper/manager.py", line 1660, in fail_pipeline
    raise exc
Exception: Pipeline failure. See details above.

PEPATAC_log.txt
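
For reference, the `TypeError` in the traceback above can be reproduced outside the pipeline: `os.path.expandvars` raises it whenever it is handed `None` instead of a path. Several arguments in the log (for example `blacklist` and `TSS_name`) are `None`; whether those are the entries of `res` that trigger the failure is an assumption, not something the log confirms. The following is a minimal sketch, assuming `res` is a plain dict mapping asset names to paths. The helper `describe_assets` and the example asset table are hypothetical and are not taken from `pepatac.py`.

```python
import os


def describe_assets(res):
    """Return one 'name<TAB>path' line per asset, flagging unset paths.

    Hypothetical helper for illustration; not part of pepatac.py.
    """
    lines = []
    for asset, path in res.items():
        if path is None:
            # Passing None to os.path.expandvars raises a TypeError,
            # so unset assets are reported instead of expanded.
            lines.append("{}\t{}".format(asset, "<not set>"))
        else:
            lines.append("{}\t{}".format(asset, os.path.expandvars(path)))
    return lines


# Hypothetical asset table: one path supplied, one left unset.
res = {"chrom_sizes": "$HOME/hg38.chrom.sizes", "blacklist": None}

# Reproduce the failure in isolation.
try:
    os.path.expandvars(res["blacklist"])
except TypeError as err:
    print("reproduced:", err)

# The guarded version reports the unset asset instead of crashing.
for line in describe_assets(res):
    print(line)
```

A guard like this only makes the message formatting robust. It would not by itself explain why an asset path is unset when the assets appear in `refgenie list`.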

slarsen21 commented 3 years ago

I am seeing the same error, raised at line 671 of pepatac.py, with the mm10 genome pulled from refgenie.

louiseplots commented 3 years ago

Same here, running in Docker with mm10.