nf-core / chipseq

ChIP-seq peak-calling, QC and differential analysis pipeline.
https://nf-co.re/chipseq
MIT License

Error in ConsensusPeakSet process #134

Closed PabloLatorre closed 4 years ago

PabloLatorre commented 4 years ago

Hi there,

I am running the pipeline on some yeast ChIP-seq data and it is giving an error in the ConsensusPeakSet process. I am running it locally with Singularity:

nextflow run nf-core/chipseq --input design.csv --genome 'R64-1-1' -profile singularity -r 1.1.0 --save_reference --max_memory '30.GB' --max_cpus 5 -resume

The pipeline gives the following error:

Error executing process > 'ConsensusPeakSet'

Caused by:
  Process `ConsensusPeakSet` terminated with an error exit status (1)

Command executed:

  sort -k1,1 -k2,2n ab_0_R1_peaks.broadPeak ab_0_R2_peaks.broadPeak ab_0_R3_peaks.broadPeak ab_5_R1_peaks.broadPeak ab_5_R2_peaks.broadPeak ab_5_R3_peaks.broadPeak \
      | mergeBed -c 2,3,4,5,6,7,8,9 -o collapse,collapse,collapse,collapse,collapse,collapse,collapse,collapse > ab.consensus_peaks.txt

  macs2_merged_expand.py ab.consensus_peaks.txt \
      ab_0_R1,ab_0_R2,ab_0_R3,ab_5_R1,ab_5_R2,ab_5_R3 \
     ab.consensus_peaks.boolean.txt \
      --min_replicates 1 \

  awk -v FS='   ' -v OFS='  ' 'FNR > 1 { print $1, $2, $3, $4, "0", "+" }' ab.consensus_peaks.boolean.txt > ab.consensus_peaks.bed

  echo -e "GeneID   Chr Start   End Strand" > ab.consensus_peaks.saf
  awk -v FS='   ' -v OFS='  ' 'FNR > 1 { print $4, $1, $2, $3,  "+" }' ab.consensus_peaks.boolean.txt >> ab.consensus_peaks.saf

  plot_peak_intersect.r -i ab.consensus_peaks.boolean.intersect.txt -o ab.consensus_peaks.boolean.intersect.plot.pdf

  find * -type f -name "ab.consensus_peaks.bed" -exec echo -e "bwa/mergedLibrary/macs/broadPeak/consensus/ab/"{}"\t0,0,0" \; > ab.consensus_peaks.bed.igv.txt

Command exit status:
  1

Command output:
  (empty)

Command error:
  WARNING: Not mounting requested bind point (already mounted in container): /home/platorre
  Error: package or namespace load failed for ‘UpSetR’:
   .onLoad failed in loadNamespace() for 'pillar', details:
    call: loadNamespace(name)
    error: there is no package called ‘crayon’
  Execution halted

Any idea why this is happening?

Thanks,

Pablo

drpatelh commented 4 years ago

Hi @PabloLatorre. It looks like there is a clash between the R installed in the Singularity container and a version you have installed locally. Does ~/.Rprofile exist? If so, can you try renaming it to something else temporarily and re-running the pipeline?

Unfortunately, I'm not sure there is a way to force R to only look in the container for packages/libs. With Python this has been working: https://github.com/nf-core/chipseq/blob/21be3149542cdc84431e12d1e092359058aed32a/nextflow.config#L130
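Since the home directory is visible inside the container (see the bind-point warning in the log above), one way to look for host-side R files that could be picked up is to list the usual per-user locations. This is only a minimal sketch: the function name `list_user_r_files` is hypothetical, and the candidate list (`.Rprofile`, `.Renviron`, and an `R` directory holding a user package library) is an assumption covering common R defaults, not an exhaustive list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print per-user R files or directories found under a
# home directory. The candidate names are assumed common R defaults.
list_user_r_files() {
    local home_dir="${1:-$HOME}"
    local candidate
    for candidate in .Rprofile .Renviron R; do
        if [ -e "${home_dir}/${candidate}" ]; then
            printf '%s\n' "${home_dir}/${candidate}"
        fi
    done
}

# Usage: list_user_r_files "$HOME"
```

If the function prints a path such as `~/R`, packages installed there may be shadowing the ones shipped in the container, so temporarily renaming that directory could be worth trying as well.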

PabloLatorre commented 4 years ago

Hi @drpatelh, thanks for your quick reply. I don't have ~/.Rprofile. I tried using docker and the pipeline was completed successfully. I might need to run the pipeline in a cluster in the future with singularity. Let's see if there is no clash with R there.

Thanks,

Pablo

drpatelh commented 4 years ago

Great! No worries. Please feel free to send a message if you have any further problems.

Also, it may be worth joining the #chipseq channel on the nf-core Slack workspace: https://nf-co.re/join

rufusmorgan commented 3 years ago

Hi, I also have the same error. I am running on a cluster with Conda (as the cluster only works with Conda).

nextflow run nf-core/chipseq -r 1.1.0 --input design.csv --genome mm10 -profile conda --single_end

This is the error I receive:

Error executing process > 'ConsensusPeakSet (anti-V5)'

Caused by:
  Process `ConsensusPeakSet (anti-V5)` terminated with an error exit status (1)

Command executed:

  sort -k1,1 -k2,2n CDX_R1_peaks.broadPeak CDX_R2_peaks.broadPeak \
      | mergeBed -c 2,3,4,5,6,7,8,9 -o collapse,collapse,collapse,collapse,collapse,collapse,collapse,collapse > anti-V5.consensus_peaks.txt

  macs2_merged_expand.py anti-V5.consensus_peaks.txt \
      CDX_R1,CDX_R2 \
      anti-V5.consensus_peaks.boolean.txt \
      --min_replicates 1 \

  awk -v FS='   ' -v OFS='  ' 'FNR > 1 { print $1, $2, $3, $4, "0", "+" }' anti-V5.consensus_peaks.boolean.txt > anti-V5.consensus_peaks.bed

  echo -e "GeneID   Chr Start   End Strand" > anti-V5.consensus_peaks.saf
  awk -v FS='   ' -v OFS='  ' 'FNR > 1 { print $4, $1, $2, $3,  "+" }' anti-V5.consensus_peaks.boolean.txt >> anti-V5.consensus_peaks.saf

  plot_peak_intersect.r -i anti-V5.consensus_peaks.boolean.intersect.txt -o anti-V5.consensus_peaks.boolean.intersect.plot.pdf

  find * -type f -name "anti-V5.consensus_peaks.bed" -exec echo -e "bwa/mergedLibrary/macs/broadPeak/consensus/anti-V5/"{}"\t0,0,0" \; > anti-V5.consensus_peaks.bed.igv.txt

Command exit status:
  1

Command output:
  (empty)

Command error:
  Error in rowSums(Freqs[, 1:num_sets]) : 
    'x' must be an array of at least two dimensions
  Calls: upset -> Counter -> [ -> [.data.frame -> rowSums
  Execution halted

rufusmorgan commented 3 years ago

To add, I have already checked for ~/.Rprofile, which I don't have.

drpatelh commented 3 years ago

Hi @rufusmorgan ! It could be a bug in the plot_peak_intersect.r script. If you are familiar with R it would be great if you can debug this a little more. You can copy the files required for the plot_peak_intersect.r script somewhere and test it out within your Conda environment to try and identify the issue.
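As a starting point for that kind of debugging, the input of plot_peak_intersect.r can be summarised before R is involved at all. The sketch below is hedged: the function name `summarise_intersect` is hypothetical, and it assumes (this is not confirmed anywhere in the thread) that the `*.boolean.intersect.txt` file is tab-separated, with sample names joined by `&` in the first column and a count in the second.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise the input of plot_peak_intersect.r.
# Assumed format (unconfirmed): "<name1>&<name2>...<TAB><count>" per line.
summarise_intersect() {
    local intersect_file="$1"
    awk -F'\t' '
        NF > 0 {
            rows++
            # Split the first column into individual sample names.
            n = split($1, names, "&")
            for (i = 1; i <= n; i++) seen[names[i]] = 1
        }
        END {
            sets = 0
            for (name in seen) sets++
            printf "rows=%d sets=%d\n", rows, sets
        }
    ' "$intersect_file"
}

# Usage: summarise_intersect anti-V5.consensus_peaks.boolean.intersect.txt
```

The `rowSums` message in the log is what R reports when it is handed a plain vector instead of a matrix or data frame, which can happen when a single column is selected from a data frame. A very small `rows` or `sets` value would therefore be a reasonable first thing to look at, although that is a guess rather than a confirmed diagnosis.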

rufusmorgan commented 3 years ago

Thank you for the fast reply @drpatelh. Unfortunately, I am not massively familiar with R, so I don't think I would be able to debug the script. Sorry!

arteteco commented 2 years ago

I'm facing the same issue. Was this solved? I do not have a ~/.Rprofile, and I'm using docker as the profile.

versions and parameters

N E X T F L O W  ~  version 21.10.6  
Launching `nf-core/chipseq` [medip8] - revision: 0f487ed76d [1.2.1]
----------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/chipseq v1.2.1
----------------------------------------------------

Run Name            : medip8
Data Type           : Single-End
Design File         : design.csv
GenomeFile          : Not supplied   nomes/Mmusculus/mm10/GENCODE/NCBIM37.genome.faation.gtf Size   : mm
Min Consensus Reps  : 1
MACS2 Narrow Peaks  : Yes
Trim R1             : 3 bp
Trim R2             : 3 bp
Trim 3' R1          : 3 bp
Trim 3' R2          : 3 bp
NextSeq Trim        : 0 bp
Fragment Size       : 50 bp
Fingerprint Bins    : 500000
Save Genome Index   : Yes
Use DESeq2 vst Transform: Yes
Max Resources       : 256.GB memory, 80 cpus, 10d time per job
Container           : docker - nfcore/chipseq:1.2.1
Output Dir          : ./results
Launch Dir          : /DATA/SCRATCH/manuel/pezone/meDIPSeq
Working Dir         : /DATA/SCRATCH/manuel/pezone/meDIPSeq/work
Script Dir          : /home/manuel/.nextflow/assets/nf-core/chipseq
User                : manuel
Config Profile      : docker

Full error

Pipeline completed with errors-
Error executing process > 'CONSENSUS_PEAKS (STHdhQ111)'

Caused by:
  Process `CONSENSUS_PEAKS (STHdhQ111)` terminated with an error exit status (1)

Command executed:
  sort -T '.' -k1,1 -k2,2n STHdhQ111_R1_peaks.narrowPeak STHdhQ111_R2_peaks.narrowPeak \ollapse,collapse,collapse,collapse > STHdhQ111.consensus_pe

  macs2_merged_expand.py STHdhQ111.consensus_peaks.txt \
      STHdhQ111_R1,STHdhQ111_R2 \
      STHdhQ111.consensus_peaks.boolean.txt \
      --min_replicates 1 \
      --is_narrow_peak
  awk -v FS='   ' -v OFS='      ' 'FNR > 1 { print $1, $2, $3, $4, "0", "+" }' STHdhQ111.consensus_peaks.boolean.txt > STHdhQ111.consensus_peaks.be
  plot_peak_intersect.r -i STHdhQ111.consensus_peaks.boolean.intersect.txt -o STHdhQ111.consensus_peaks.boolean.intersect.plot.pdf

  find * -type f -name "STHdhQ111.consensus_peaks.bed" -exec echo -e "bwa/mergedLibrary/macs/narrowPeak/consensus/STHdhQ111/"{}"\t0,0,0" \; > STHdhQ111.consensus_peaks.bed.igv.txt

Command exit status:
  1

Command output:
  (empty)

Command error:
  WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
  Warning message:
  package ‘optparse’ was built under R version 3.6.3 
  Warning message:
  package ‘UpSetR’ was built under R version 3.6.3 
  Error in rowSums(Freqs[, 1:num_sets]) : 
    'x' must be an array of at least two dimensions
  Calls: upset -> Counter -> [ -> [.data.frame -> rowSums
  Execution halted

Work dir:
  /DATA/SCRATCH/manuel/pezone/meDIPSeq/work/e0/102370a0eb5b140900aea022b16cc6

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
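The tip above can be complemented by reading the files Nextflow leaves in the task work directory, such as `.command.sh` (the script), `.command.err` (captured stderr) and `.exitcode`. The following is a small sketch rather than an official tool; the function name `show_task_failure` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the recorded exit code and stderr of a Nextflow
# task, using the hidden files Nextflow writes into the task work directory.
show_task_failure() {
    local work_dir="$1"
    if [ ! -f "${work_dir}/.command.sh" ]; then
        echo "No .command.sh in ${work_dir}" >&2
        return 1
    fi
    echo "exit code: $(cat "${work_dir}/.exitcode" 2>/dev/null || echo unknown)"
    echo "--- stderr ---"
    cat "${work_dir}/.command.err" 2>/dev/null
    return 0
}

# Usage: show_task_failure /path/to/work/<hash>
```

Running `bash .command.run` from inside the same directory, as the tip suggests, then re-executes the task with its original environment.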

This is my nf-params.json

{
    "input": "design.csv",
    "single_end": true,
    "fragment_size": 50,
    "fasta": "\/DATA\/PUBLIC\/Genomes\/Mmusculus\/mm10\/GENCODE\/NCBIM37.genome.fa",
    "gtf": "\/DATA\/PUBLIC\/Genomes\/Mmusculus\/mm10\/GENCODE\/gencode.vM1.annotation.gtf",
    "bwa_index": "\/DATA\/PUBLIC\/Genomes\/Mmusculus\/mm10\/GENCODE\/BWA\/NCBIM37.genome.fa",
    "macs_gsize": "mm",
    "save_reference": true,
    "clip_r1": 3,
    "clip_r2": 3,
    "three_prime_clip_r1": 3,
    "three_prime_clip_r2": 3,
    "narrow_peak": true,
    "deseq2_vst": true,
    "max_cpus": 80,
    "max_memory": "256.GB",
    "name": "pezone 0.1"
}

And this is the full command issued

nextflow run nf-core/chipseq -r 1.2.1 -name medip -resume -profile docker -params-file nf-params.json

julianeedham commented 1 year ago

Hi, I am also having this issue running on Singularity. Any help? It seems to only be a problem when I include the min_reps_consensus parameter.

Core Nextflow options
  revision        : master
  runName         : thirsty_sanger
  containerEngine : singularity
  launchDir       : /rds/general/user/jn720/home/WORK/projects/JNCdx2ChipJuly23/scripts
  workDir         : /rds/general/user/jn720/home/WORK/projects/JNCdx2ChipJuly23/scripts/work
  projectDir      : /rds/general/user/jn720/home/.nextflow/assets/nf-core/chipseq
  userName        : jn720
  profile         : standard
  configFiles     : /rds/general/user/jn720/home/.nextflow/assets/nf-core/chipseq/nextflow.config, /apps/nextflow/22.04.4/bin/nextflow.config, /rds/general/user/jn720/home/configs/cx1.config

Input/output options
  input           : /rds/general/user/jn720/home/WORK/projects/JNCdx2ChipJuly23/data/rawdata/JNCdx2ChipJuly23_nextflow_template.csv
  read_length     : 75
  outdir          : /rds/general/user/jn720/home/WORK/projects/JNCdx2ChipJuly23_R17R18test/data/test/nextflow

Reference genome options
  genome          : mm10
  fasta           : s3://ngi-igenomes/igenomes/Mus_musculus/UCSC/mm10/Sequence/WholeGenomeFasta/genome.fa
  gtf             : s3://ngi-igenomes/igenomes/Mus_musculus/UCSC/mm10/Annotation/Genes/genes.gtf
  bwa_index       : s3://ngi-igenomes/igenomes/Mus_musculus/UCSC/mm10/Sequence/BWAIndex/version0.6.0/
  bowtie2_index   : s3://ngi-igenomes/igenomes/Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/
  star_index      : s3://ngi-igenomes/igenomes/Mus_musculus/UCSC/mm10/Sequence/STARIndex/
  macs_gsize      : 2406655830
  blacklist       : /rds/general/user/jn720/home/.nextflow/assets/nf-core/chipseq/assets/blacklists/v2.0/mm10-blacklist.v2.bed

Peak calling options
  narrow_peak        : true
  min_reps_consensus : 2

Institutional config options
  config_profile_description : Imperial College London - HPC
  config_profile_contact     : George Young (bioinformatics@lms.mrc.ac.uk)
  config_profile_url         : https://www.imperial.ac.uk/admin-services/ict/self-service/research-support/rcs/

Max job request options
  max_cpus        : 40
  max_memory      : 480 GB
  max_time        : 24d 20h 31m 24s

Error executing process > 'NFCORE_CHIPSEQ:CHIPSEQ:MACS2_CONSENSUS (CDX2)'

Caused by:
  Process `NFCORE_CHIPSEQ:CHIPSEQ:MACS2_CONSENSUS (CDX2)` terminated with an error exit status (1)

Command executed:

  sort -T '.' -k1,1 -k2,2n D3.5_NMP_R17_CDX2_peaks.narrowPeak D3.5_NMP_R18_CDX2_peaks.narrowPeak D3.5_NMP_R19_CDX2_peaks.narrowPeak D3_NMP_R17_CDX2_peaks.narrowPeak D3_NMP_R18_CDX2_peaks.narrowPeak D3_NMP_R19_CDX2_peaks.narrowPeak D4_SC_R17_CDX2_peaks.narrowPeak D4_SC_R18_CDX2_peaks.narrowPeak D4_SC_R19_CDX2_peaks.narrowPeak \
      | mergeBed -c 2,3,4,5,6,7,8,9,10 -o collapse,collapse,collapse,collapse,collapse,collapse,collapse,collapse,collapse > CDX2.consensus_peaks.txt

  macs2_merged_expand.py \
      CDX2.consensus_peaks.txt \
      D3.5_NMP_R17_CDX2,D3.5_NMP_R18_CDX2,D3.5_NMP_R19_CDX2,D3_NMP_R17_CDX2,D3_NMP_R18_CDX2,D3_NMP_R19_CDX2,D4_SC_R17_CDX2,D4_SC_R18_CDX2,D4_SC_R19_CDX2 \
      CDX2.consensus_peaks.boolean.txt \
      --min_replicates 2 \
      --is_narrow_peak

  awk -v FS=' ' -v OFS=' ' 'FNR > 1 { print $1, $2, $3, $4, "0", "+" }' CDX2.consensus_peaks.boolean.txt > CDX2.consensus_peaks.bed

  echo -e "GeneID Chr Start End Strand" > CDX2.consensus_peaks.saf
  awk -v FS=' ' -v OFS=' ' 'FNR > 1 { print $4, $1, $2, $3, "+" }' CDX2.consensus_peaks.boolean.txt >> CDX2.consensus_peaks.saf

  plot_peak_intersect.r -i CDX2.consensus_peaks.boolean.intersect.txt -o CDX2.consensus_peaks.boolean.intersect.plot.pdf

  echo "CDX2.consensus_peaks.bed CDX2/CDX2.consensus_peaks.bed" > CDX2.consensus_peaks.antibody.txt

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_CHIPSEQ:CHIPSEQ:MACS2_CONSENSUS":
      python: $(python --version | sed 's/Python //g')
      r-base: $(echo $(R --version 2>&1) | sed 's/^.*R version //; s/ .*$//')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Error in read.table(opt$input_file, sep = "\t", header = FALSE) : 
    no lines available in input
  Execution halted

Work dir:
  /rds/general/user/jn720/home/WORK/projects/JNCdx2ChipJuly23/scripts/work/b1/9921f40298d773e3c1dc74ed40d476

tulsi92 commented 1 week ago

Were you ever able to solve this issue? I am getting the exact same error when using the min_reps_consensus parameter.
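The `read.table` message quoted in this thread ("no lines available in input") is what R reports when the input file contains no lines. One plausible but unconfirmed reading is that no merged peak met the replicate threshold, leaving `CDX2.consensus_peaks.boolean.intersect.txt` empty. When reproducing the step by hand, a guard like the one below skips the plot in that case. It is a sketch of a workaround, not the pipeline's own fix, and the function name `run_if_nonempty` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical guard: run a command only when its input file has content.
# `-s` is the POSIX test for "file exists and has a size greater than zero".
run_if_nonempty() {
    local input_file="$1"
    shift
    if [ -s "$input_file" ]; then
        "$@"
    else
        echo "Skipping: ${input_file} is missing or empty" >&2
        return 0
    fi
}

# Usage (file names taken from the log in this thread):
# run_if_nonempty CDX2.consensus_peaks.boolean.intersect.txt \
#     plot_peak_intersect.r \
#         -i CDX2.consensus_peaks.boolean.intersect.txt \
#         -o CDX2.consensus_peaks.boolean.intersect.plot.pdf
```

Checking the size of the intersect file in the work directory is also a quick way to tell whether the problem lies in the plotting script or in the step that produces its input.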