kundajelab / atac_dnase_pipelines

ATAC-seq and DNase-seq processing pipeline
BSD 3-Clause "New" or "Revised" License

ataqc failed: 'filename'.trim_gc.txt does not exist #89

Closed jonasungerback closed 6 years ago

jonasungerback commented 6 years ago

Hi, first and foremost, thank you so much for this pipeline. Standardization is much needed in the ATAC-seq world. However, I am encountering an issue in the QC step that I am too much of a beginner to track down. I installed the pipeline with the --recursive option. As long as I pass the -no_xcor option, the pipeline starts and runs up to the ataqc steps (without that option it fails, and I am not sure why), but there I hit the error in the title. The relevant part of the error message looks like this:

I am also attaching the zipped report.

Thank you so much in advance! Best, Jonas

Num: 52
ID: task.ataqc.ataqc_rep1.line_109.id_113
Name: ataqc rep1
Thread: thread_226
PID: 47197
OK: false
Exit Code: 1
Retries:
State: ERROR
Dep.: ERROR
Cpus:
Mem:
Start: 2018-01-05 09:36:39
End: 2018-01-05 09:36:39
Elapsed: 00:00:00
Timeout: 00:00:-1
Wall Timeout: 100 days

Input files:
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.fastq.gz
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.bam
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1/NALM6_index_1.trim.align.log
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1/NALM6_index_1.trim.nodup.pbc.qc
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1/NALM6_index_1.trim.dup.qc
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.bam
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/signal/macs2/rep1/NALM6_index_1.trim.nodup.tn5.pf.pval.signal.bigwig
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/peak/macs2/idr/pseudo_reps/rep1/NALM6_ATAC_ENCODE_pipeline_rep1-pr.IDR0.1.filt.narrowPeak.gz

Output files:
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1/NALM6_index_1.trim_qc.html
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1/NALM6_index_1.trim_qc.txt

Dependencies:

# SYS command. line 111
if [[ -f $(which conda) && $(conda env list | grep bds_atac | wc -l) != "0" ]]; then source activate bds_atac; sleep 5; fi; export PATH=/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/.:/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/modules:/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/utils:${PATH}:/bin:/usr/bin:/usr/local/bin:${HOME}/.bds; set -o pipefail; STARTTIME=$(date +%s)

# SYS command. line 114
export _JAVA_OPTIONS="-Xms256M -Xmx20G -XX:ParallelGCThreads=1"

# SYS command. line 116
if [ "${TMPDIR}" != "" ] && [ -d "${TMPDIR}" ]; then \
    export _JAVA_OPTIONS="${_JAVA_OPTIONS} -Djava.io.tmpdir=${TMPDIR}"; \
fi

# SYS command. line 119
cd /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1

# SYS command. line 122
if [ -f "$(which picard)" ]; then export PICARDROOT="$(dirname $(which picard))/../share/picard"*; fi

# SYS command. line 124
/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/ataqc/run_ataqc.py \
    --workdir /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1 \
    --outdir /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1 \
    --outprefix NALM6_index_1.trim \
    --genome hg19 \
    --chromsizes /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/hg19.chrom.sizes \
    --ref /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/male.hg19.fa \
    --tss /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/hg19_gencode_tss_unique.bed.gz \
    --dnase /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/reg2map_honeybadger2_dnase_all_p10_ucsc.bed.gz \
    --blacklist /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/wgEncodeDacMapabilityConsensusExcludable.bed.gz \
    --prom /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/reg2map_honeybadger2_dnase_prom_p2.bed.gz \
    --enh /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/reg2map_honeybadger2_dnase_enh_p2.bed.gz \
    --pbc /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1/NALM6_index_1.trim.nodup.pbc.qc \
    --fastq1 /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.fastq.gz \
    --alignedbam /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.bam \
    --alignmentlog /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1/NALM6_index_1.trim.align.log \
    --coordsortbam /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.bam \
    --duplog /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1/NALM6_index_1.trim.dup.qc \
    --finalbam /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.bam \
    --finalbed /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz \
    --bigwig /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/signal/macs2/rep1/NALM6_index_1.trim.nodup.tn5.pf.pval.signal.bigwig \
    --peaks /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/peak/macs2/idr/pseudo_reps/rep1/NALM6_ATAC_ENCODE_pipeline_rep1-pr.IDR0.1.filt.narrowPeak.gz --naive_overlap_peaks /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/peak/macs2/overlap/optimal_set/NALM6_ATAC_ENCODE_pipeline_ppr.naive_overlap.filt.narrowPeak.gz --idr_peaks /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/peak/macs2/idr/optimal_set/NALM6_ATAC_ENCODE_pipeline_ppr.IDR0.1.filt.narrowPeak.gz \
    --reg2map /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/dnase_avgs_reg2map_p10_merged_named.pvals.gz --reg2map_bed /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/reg2map_honeybadger2_dnase_all_p10_ucsc.bed.gz --meta /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/eid_to_mnemonic.txt

# SYS command. line 147
rm -f test.log test.png

# SYS command. line 149
TASKTIME=$[$(date +%s)-${STARTTIME}]; echo "Task has finished (${TASKTIME} seconds)."; sleep 0

--------------------Stdout--------------------
bedtools sort -i /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/reg2map_honeybadger2_dnase_all_p10_ucsc.bed.gz | bedtools merge -i stdin | bedtools intersect -u -nonamecheck -a /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz -b stdin | wc -l
zcat -f /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz | wc -l
bedtools sort -i /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/wgEncodeDacMapabilityConsensusExcludable.bed.gz | bedtools merge -i stdin | bedtools intersect -u -nonamecheck -a /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz -b stdin | wc -l
zcat -f /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz | wc -l
bedtools sort -i /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/reg2map_honeybadger2_dnase_prom_p2.bed.gz | bedtools merge -i stdin | bedtools intersect -u -nonamecheck -a /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz -b stdin | wc -l
zcat -f /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz | wc -l
bedtools sort -i /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/bds_pipeline_genome_data/hg19/ataqc/reg2map_honeybadger2_dnase_enh_p2.bed.gz | bedtools merge -i stdin | bedtools intersect -u -nonamecheck -a /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz -b stdin | wc -l
zcat -f /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz | wc -l
bedtools sort -i /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/peak/macs2/idr/pseudo_reps/rep1/NALM6_ATAC_ENCODE_pipeline_rep1-pr.IDR0.1.filt.narrowPeak.gz | bedtools merge -i stdin | bedtools intersect -u -nonamecheck -a /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz -b stdin | wc -l
zcat -f /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep1/NALM6_index_1.trim.nodup.tn5.tagAlign.gz | wc -l

--------------------Stderr--------------------
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/opt/anaconda3/envs/bds_atac/lib/python2.7/site-packages/pandas/io/parsers.py", line 818, in __init__
    self._make_engine(self.engine)
  File "/opt/anaconda3/envs/bds_atac/lib/python2.7/site-packages/pandas/io/parsers.py", line 1049, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/opt/anaconda3/envs/bds_atac/lib/python2.7/site-packages/pandas/io/parsers.py", line 1695, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 402, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 718, in pandas._libs.parsers.TextReader._setup_parser_source
IOError: File /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1/NALM6_index_1.trim_gc.txt does not exist

atac.bds.20180104_113531_124.report.html.zip
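The traceback above ends with pandas failing to open `<outdir>/<outprefix>_gc.txt`, where `--outdir` and `--outprefix` are the values passed to `run_ataqc.py`. A minimal sketch of a pre-flight check that reports such missing files before the QC step is re-run might look like the following. `missing_ataqc_inputs` is a hypothetical helper, not part of the pipeline, and the default suffix list is an assumption covering only the one file named in the error.

```python
from pathlib import Path


def missing_ataqc_inputs(outdir, outprefix, suffixes=("_gc.txt",)):
    """Return expected files of the form <outdir>/<outprefix><suffix> that do not exist.

    Hypothetical helper: the path pattern mirrors the one in the traceback
    (.../qc/rep1/NALM6_index_1.trim_gc.txt); the suffix list is an assumption.
    """
    outdir = Path(outdir)
    expected = [outdir / f"{outprefix}{suffix}" for suffix in suffixes]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Values taken from the --outdir and --outprefix arguments in the log.
    missing = missing_ataqc_inputs(
        "/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/qc/rep1",
        "NALM6_index_1.trim",
    )
    for path in missing:
        print("missing:", path)
```

An empty result only says the file is present; it does not say which earlier step was supposed to write it.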

jonasungerback commented 6 years ago

Maybe something is wrong with my installation, but I have already re-installed it along with its dependencies. I am also getting an error when I use the -no_dup_removal option, whether I start from a FASTQ file or a raw BAM file. Any idea how I can get it to run with the -no_dup_removal option activated?

My full command was:

atac.bds -species hg19 -enable_idr -auto_detect_adapter -no_dup_removal -rm_chr_from_tag mito -multimapping 4 -nth 8 -se -fastq1 file1.fastq.gz -fastq2 file2.fastq.gz -fastq3 file3.fastq.gz

`shuf: '/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep2/NALM6_index_3_trimmed.filt.tn5.tagAlign.gz': end of file
Task failed:
Program & line : '/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/modules/postalign_bed.bds', line 152
Task Name : 'spr rep2'
Task ID : 'atac.bds.20180109_114003_949_parallel_45/task.postalign_bed.spr_rep2.line_152.id_21'
Task PID : '32873'
Task hint : 'if [[ -f $(which conda) && $(conda env list | grep bds_atac | wc -l) != "0" ]]; then source activate bds_atac; sleep 5; fi; export PATH=/mnt/data/bio'
Task resources : 'cpus: -1 mem: -1.0 B wall-timeout: 8640000'
State : 'ERROR'
Dependency state : 'ERROR'
Retries available : '1'
Input files : '[/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep2/NALM6_index_3_trimmed.filt.tn5.tagAlign.gz]'
Output files : '[/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/pseudo_reps/rep2/pr1/NALM6_index_3_trimmed.filt.tn5.pr1.tagAlign.gz, /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/pseudo_reps/rep2/pr2/NALM6_index_3_trimmed.filt.tn5.pr2.tagAlign.gz]'
Script file : '/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/atac.bds.20180109_114003_949_parallel_45/task.postalign_bed.spr_rep2.line_152.id_21.sh'
Exit status : '1'
StdErr (10 lines) :
shuf: '/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep2/NALM6_index_3_trimmed.filt.tn5.tagAlign.gz': end of file

Fatal error: /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/atac.bds, line 662, pos 4. Task/s failed`
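The log does not say why `shuf` reported "end of file" for the tagAlign input. One possibility worth ruling out is that the file is missing, empty, or a truncated gzip stream. The sketch below classifies a gzip file along those lines; `check_gzip_text` is a hypothetical helper written for illustration, and it is an assumption (not something the log confirms) that the state of this file is related to the failure.

```python
import gzip
import os
import zlib


def check_gzip_text(path):
    """Return (status, line_count) for a gzip-compressed text file.

    status is one of 'missing', 'empty', 'truncated', 'invalid', or 'ok'.
    line_count is the number of lines read before any error occurred.
    """
    if not os.path.isfile(path):
        return "missing", 0
    lines = 0
    try:
        with gzip.open(path, "rt") as handle:
            for _ in handle:
                lines += 1
    except EOFError:
        # The compressed stream ended before its end-of-stream marker.
        return "truncated", lines
    except (OSError, zlib.error):
        # Not a gzip file, or the compressed data is corrupt.
        return "invalid", lines
    return ("ok" if lines else "empty"), lines


if __name__ == "__main__":
    # Path copied from the error message above.
    tag_align = (
        "/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/"
        "align/rep2/NALM6_index_3_trimmed.filt.tn5.tagAlign.gz"
    )
    print(check_gzip_text(tag_align))
```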

leepc12 commented 6 years ago

Please run the following for more debugging info:

$ OUT_DIR=/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out
$ ls -l $OUT_DIR/align/rep*
$ ls -l $OUT_DIR/qc/rep*

Thanks,

Jin
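The same listing can also be collected with a short script, which may be convenient when comparing several output directories. This is only a sketch: `list_rep_dirs` is a hypothetical helper, and it reports just file names and sizes in bytes rather than the full `ls -l` output.

```python
import glob
import os


def list_rep_dirs(out_dir, subdir):
    """Return {replicate_dir: [(filename, size_in_bytes), ...]} for <out_dir>/<subdir>/rep*."""
    listing = {}
    for rep_dir in sorted(glob.glob(os.path.join(out_dir, subdir, "rep*"))):
        if not os.path.isdir(rep_dir):
            continue
        entries = []
        for name in sorted(os.listdir(rep_dir)):
            entries.append((name, os.path.getsize(os.path.join(rep_dir, name))))
        listing[rep_dir] = entries
    return listing


if __name__ == "__main__":
    # Same OUT_DIR as in the shell commands above.
    out_dir = "/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out"
    for subdir in ("align", "qc"):
        for rep_dir, entries in list_rep_dirs(out_dir, subdir).items():
            print(rep_dir + ":")
            for name, size in entries:
                print(f"  {size:>14}  {name}")
```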



jonasungerback commented 6 years ago

Thanks. I have now done that for both a run that fails because of -no_dup_removal and a successful run without that option. It is clear that files are missing in the first place, but I am not sure why:

FAILED DUE TO -no_dup_removal:

bds /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/atac.bds -species hg19 -no_xcor -enable_idr -auto_detect_adapter -out_dir out_NALM6_no_dup_rem_se -no_dup_removal -rm_chr_from_tag mito -multimapping 4 -nth 8 -se -fastq1 NALM6_index_1.fastq.gz -fastq2 NALM6_index_3.fastq.gz -fastq3 NALM6_index_5.fastq.gz

$ ls -l $OUT_DIR/align/rep*

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//align/rep1:
total 20G
-rw-rw-r-- 1 jonas sigvardsson 12G Jan 10 21:55 NALM6_index_1.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 10 22:12 NALM6_index_1.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 4.7G Jan 10 15:01 NALM6_index_1.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 2.7G Jan 11 01:08 NALM6_index_1.trim.filt.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 01:08 NALM6_index_1.trim.filt.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 161M Jan 11 01:16 NALM6_index_1.trim.filt.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 161M Jan 11 01:18 NALM6_index_1.trim.filt.tn5.tagAlign.gz

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//align/rep2:
total 23G
-rw-rw-r-- 1 jonas sigvardsson 16G Jan 10 22:37 NALM6_index_3.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.6M Jan 10 22:44 NALM6_index_3.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 6.7G Jan 10 15:19 NALM6_index_3.trim.fastq.gz

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//align/rep3:
total 24G
-rw-rw-r-- 1 jonas sigvardsson 15G Jan 10 21:59 NALM6_index_5.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.5M Jan 10 22:18 NALM6_index_5.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 5.9G Jan 10 15:13 NALM6_index_5.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 3.2G Jan 11 01:06 NALM6_index_5.trim.filt.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 01:06 NALM6_index_5.trim.filt.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 177M Jan 11 01:15 NALM6_index_5.trim.filt.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 178M Jan 11 01:17 NALM6_index_5.trim.filt.tn5.tagAlign.gz

$ ls -l $OUT_DIR/qc/rep*

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//qc/rep1:
total 16K
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:17 NALM6_index_1.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 224 Jan 10 20:46 NALM6_index_1.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:23 NALM6_index_1.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:01 NALM6_index_1.trim.read_length.txt

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//qc/rep2:
total 16K
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:18 NALM6_index_3.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 225 Jan 10 20:59 NALM6_index_3.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:59 NALM6_index_3.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:20 NALM6_index_3.trim.read_length.txt

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//qc/rep3:
total 16K
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:18 NALM6_index_5.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 225 Jan 10 20:35 NALM6_index_5.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:32 NALM6_index_5.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:13 NALM6_index_5.trim.read_length.txt

SUCCESSFUL COMMAND:

bds /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/atac.bds -species hg19 -no_xcor -enable_idr -auto_detect_adapter -out_dir out_NALM6_standard_se -rm_chr_from_tag mito -multimapping 4 -nth 8 -se -fastq1 NALM6_index_1.fastq.gz -fastq2 NALM6_index_3.fastq.gz -fastq3 NALM6_index_5.fastq.gz

$ ls -l $OUT_DIR/align/rep*

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/align/rep1:
total 18G
-rw-rw-r-- 1 jonas sigvardsson 12G Jan 10 22:02 NALM6_index_1.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 10 22:17 NALM6_index_1.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 4.7G Jan 10 15:01 NALM6_index_1.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 1022M Jan 11 01:38 NALM6_index_1.trim.nodup.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 01:38 NALM6_index_1.trim.nodup.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 114M Jan 11 01:47 NALM6_index_1.trim.nodup.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 114M Jan 11 01:48 NALM6_index_1.trim.nodup.tn5.tagAlign.gz

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/align/rep2:
total 24G
-rw-rw-r-- 1 jonas sigvardsson 16G Jan 10 22:33 NALM6_index_3.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.6M Jan 10 22:40 NALM6_index_3.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 6.7G Jan 10 15:20 NALM6_index_3.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 997M Jan 11 10:45 NALM6_index_3.trim.nodup.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 10:45 NALM6_index_3.trim.nodup.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 113M Jan 11 10:56 NALM6_index_3.trim.nodup.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 112M Jan 11 10:57 NALM6_index_3.trim.nodup.tn5.tagAlign.gz

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/align/rep3:
total 22G
-rw-rw-r-- 1 jonas sigvardsson 15G Jan 10 21:54 NALM6_index_5.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.5M Jan 10 22:17 NALM6_index_5.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 5.9G Jan 10 15:14 NALM6_index_5.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 1.1G Jan 11 01:38 NALM6_index_5.trim.nodup.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 01:38 NALM6_index_5.trim.nodup.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 121M Jan 11 01:49 NALM6_index_5.trim.nodup.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 121M Jan 11 01:50 NALM6_index_5.trim.nodup.tn5.tagAlign.gz

$ ls -l $OUT_DIR/qc/rep*

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/qc/rep1:
total 47M
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:19 NALM6_index_1.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 224 Jan 10 20:51 NALM6_index_1.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 1.4K Jan 11 01:32 NALM6_index_1.trim.dup.qc
-rw-rw-r-- 1 jonas sigvardsson 2.3M Jan 11 15:52 NALM6_index_1.trim.dupmark.ataqc.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:28 NALM6_index_1.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 383 Jan 11 01:38 NALM6_index_1.trim.nodup.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 62 Jan 11 01:44 NALM6_index_1.trim.nodup.pbc.qc
-rw-rw-r-- 1 jonas sigvardsson 766 Jan 11 12:28 NALM6_index_1.trim.picardcomplexity.qc
-rw-rw-r-- 1 jonas sigvardsson 462K Jan 11 14:33 NALM6_index_1.trim.preseq.dat
-rw-rw-r-- 1 jonas sigvardsson 47K Jan 11 14:33 NALM6_index_1.trim.preseq.log
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:01 NALM6_index_1.trim.read_length.txt
-rw-rw-r-- 1 jonas sigvardsson 44M Jan 11 16:10 NALM6_index_1.trim.signal
-rw-rw-r-- 1 jonas sigvardsson 4.9K Jan 11 12:16 NALM6_index_1.trim_gc.txt
-rw-rw-r-- 1 jonas sigvardsson 8.8K Jan 11 12:16 NALM6_index_1.trim_gcPlot.pdf
-rw-rw-r-- 1 jonas sigvardsson 1.2K Jan 11 12:16 NALM6_index_1.trim_gcSummary.txt
-rw-rw-r-- 1 jonas sigvardsson 228K Jan 11 16:07 NALM6_index_1.trim_large_tss-enrich.png
-rw-rw-r-- 1 jonas sigvardsson 658K Jan 11 16:10 NALM6_index_1.trim_qc.html
-rw-rw-r-- 1 jonas sigvardsson 1.8K Jan 11 16:10 NALM6_index_1.trim_qc.txt
-rw-rw-r-- 1 jonas sigvardsson 43K Jan 11 16:06 NALM6_index_1.trim_tss-enrich.png

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/qc/rep2:
total 48M
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:19 NALM6_index_3.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 225 Jan 10 20:56 NALM6_index_3.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 1.4K Jan 11 10:39 NALM6_index_3.trim.dup.qc
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 11 17:17 NALM6_index_3.trim.dupmark.ataqc.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:56 NALM6_index_3.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 383 Jan 11 10:45 NALM6_index_3.trim.nodup.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 61 Jan 11 10:54 NALM6_index_3.trim.nodup.pbc.qc
-rw-rw-r-- 1 jonas sigvardsson 766 Jan 11 12:32 NALM6_index_3.trim.picardcomplexity.qc
-rw-rw-r-- 1 jonas sigvardsson 459K Jan 11 15:28 NALM6_index_3.trim.preseq.dat
-rw-rw-r-- 1 jonas sigvardsson 62K Jan 11 15:28 NALM6_index_3.trim.preseq.log
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:21 NALM6_index_3.trim.read_length.txt
-rw-rw-r-- 1 jonas sigvardsson 44M Jan 11 17:39 NALM6_index_3.trim.signal
-rw-rw-r-- 1 jonas sigvardsson 4.9K Jan 11 12:16 NALM6_index_3.trim_gc.txt
-rw-rw-r-- 1 jonas sigvardsson 8.8K Jan 11 12:16 NALM6_index_3.trim_gcPlot.pdf
-rw-rw-r-- 1 jonas sigvardsson 1.2K Jan 11 12:16 NALM6_index_3.trim_gcSummary.txt
-rw-rw-r-- 1 jonas sigvardsson 225K Jan 11 17:36 NALM6_index_3.trim_large_tss-enrich.png
-rw-rw-r-- 1 jonas sigvardsson 647K Jan 11 17:40 NALM6_index_3.trim_qc.html
-rw-rw-r-- 1 jonas sigvardsson 1.8K Jan 11 17:40 NALM6_index_3.trim_qc.txt
-rw-rw-r-- 1 jonas sigvardsson 43K Jan 11 17:36 NALM6_index_3.trim_tss-enrich.png

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/qc/rep3:
total 48M
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:19 NALM6_index_5.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 225 Jan 10 20:30 NALM6_index_5.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 1.4K Jan 11 01:32 NALM6_index_5.trim.dup.qc
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 11 16:44 NALM6_index_5.trim.dupmark.ataqc.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:31 NALM6_index_5.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 383 Jan 11 01:39 NALM6_index_5.trim.nodup.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 62 Jan 11 01:46 NALM6_index_5.trim.nodup.pbc.qc
-rw-rw-r-- 1 jonas sigvardsson 766 Jan 11 12:31 NALM6_index_5.trim.picardcomplexity.qc
-rw-rw-r-- 1 jonas sigvardsson 460K Jan 11 15:06 NALM6_index_5.trim.preseq.dat
-rw-rw-r-- 1 jonas sigvardsson 57K Jan 11 15:06 NALM6_index_5.trim.preseq.log
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:14 NALM6_index_5.trim.read_length.txt
-rw-rw-r-- 1 jonas sigvardsson 44M Jan 11 17:04 NALM6_index_5.trim.signal
-rw-rw-r-- 1 jonas sigvardsson 4.9K Jan 11 12:16 NALM6_index_5.trim_gc.txt
-rw-rw-r-- 1 jonas sigvardsson 8.8K Jan 11 12:16 NALM6_index_5.trim_gcPlot.pdf
-rw-rw-r-- 1 jonas sigvardsson 1.2K Jan 11 12:16 NALM6_index_5.trim_gcSummary.txt
-rw-rw-r-- 1 jonas sigvardsson 227K Jan 11 17:01 NALM6_index_5.trim_large_tss-enrich.png
-rw-rw-r-- 1 jonas sigvardsson 652K Jan 11 17:05 NALM6_index_5.trim_qc.html
-rw-rw-r-- 1 jonas sigvardsson 1.8K Jan 11 17:05 NALM6_index_5.trim_qc.txt
-rw-rw-r-- 1 jonas sigvardsson 43K Jan 11 17:01 NALM6_index_5.trim_tss-enrich.png
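To make the gap in the failed run easier to spot, the align/ listing of the -no_dup_removal run can be checked per replicate for a set of expected file suffixes. This is a sketch only: `missing_suffixes` is a hypothetical helper, the file names are copied from the listing above, and the choice of required suffixes is an assumption based on what rep1 and rep3 contain.

```python
def missing_suffixes(files_by_rep, required_suffixes):
    """Return {replicate: [suffixes with no matching file]} for incomplete replicates."""
    report = {}
    for rep, names in files_by_rep.items():
        absent = [
            suffix
            for suffix in required_suffixes
            if not any(name.endswith(suffix) for name in names)
        ]
        if absent:
            report[rep] = absent
    return report


# File names copied from the align/ listing of the -no_dup_removal run.
no_dup_removal_run = {
    "rep1": [
        "NALM6_index_1.trim.bam",
        "NALM6_index_1.trim.bam.bai",
        "NALM6_index_1.trim.fastq.gz",
        "NALM6_index_1.trim.filt.bam",
        "NALM6_index_1.trim.filt.bam.bai",
        "NALM6_index_1.trim.filt.tagAlign.gz",
        "NALM6_index_1.trim.filt.tn5.tagAlign.gz",
    ],
    "rep2": [
        "NALM6_index_3.trim.bam",
        "NALM6_index_3.trim.bam.bai",
        "NALM6_index_3.trim.fastq.gz",
    ],
    "rep3": [
        "NALM6_index_5.trim.bam",
        "NALM6_index_5.trim.bam.bai",
        "NALM6_index_5.trim.fastq.gz",
        "NALM6_index_5.trim.filt.bam",
        "NALM6_index_5.trim.filt.bam.bai",
        "NALM6_index_5.trim.filt.tagAlign.gz",
        "NALM6_index_5.trim.filt.tn5.tagAlign.gz",
    ],
}

# Assumed set of suffixes every replicate should have at this stage.
required = (".trim.bam", ".filt.bam", ".filt.tn5.tagAlign.gz")
result = missing_suffixes(no_dup_removal_run, required)
print(result)  # {'rep2': ['.filt.bam', '.filt.tn5.tagAlign.gz']}
```

Only rep2 is reported, which matches the listing: its filtered BAM and tagAlign files were never written in that run.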

leepc12 commented 6 years ago

I wanted to look at 'NALM6_index_3_trimmed.filt.tn5.tagAlign.gz' in your directory '/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out/align/rep2/'.

But that file doesn't seem to exist. Did you list the files for a different sample?

Please give me full debugging information about one specific sample.

Jin

On Thu, Jan 11, 2018 at 11:36 PM, Jonas notifications@github.com wrote:

Thanks, now I have done that both for one command that fails due to the -no_dup_removal and one successful run when it is not there. Clear is that files are missing in the first place but I am not sure why:

FAILED DUE TO -no_dup_removal:

bds /mnt/data/bioinfo_tools_and_refs/bioinfotools/kundajelab ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/atac.bds -species hg19 -no_xcor -enable_idr -auto_detect_adapter -out_dir out_NALM6_no_dup_rem_se -no_dup_removal -rm_chr_from_tag mito -multimapping 4 -nth 8 -se -fastq1 NALM6_index_1.fastq.gz -fastq2 NALM6_index_3.fastq.gz -fastq3 NALM6_index_5.fastq.gz

ls -l $OUT_DIR/align/rep*

/mnt/data/common/jonas/atacseq/NALM6_ATACENCODE pipeline/out_NALM6_no_dup_rem_se//align/rep1: total 20G -rw-rw-r-- 1 jonas sigvardsson 12G Jan 10 21:55 NALM6_index_1.trim.bam -rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 10 22:12 NALM6_index_1.trim.bam.bai -rw-rw-r-- 1 jonas sigvardsson 4.7G Jan 10 15:01 NALM6_index_1.trim.fastq.gz -rw-rw-r-- 1 jonas sigvardsson 2.7G Jan 11 01:08 NALM6_index_1.trim.filt.bam -rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 01:08 NALM6_index_1.trim.filt.bam.bai -rw-rw-r-- 1 jonas sigvardsson 161M Jan 11 01:16 NALM6_index_1.trim.filt. tagAlign.gz -rw-rw-r-- 1 jonas sigvardsson 161M Jan 11 01:18 NALM6_index_1.trim.filt.tn5.tagAlign.gz

/mnt/data/common/jonas/atacseq/NALM6_ATACENCODE pipeline/out_NALM6_no_dup_rem_se//align/rep2: total 23G -rw-rw-r-- 1 jonas sigvardsson 16G Jan 10 22:37 NALM6_index_3.trim.bam -rw-rw-r-- 1 jonas sigvardsson 2.6M Jan 10 22:44 NALM6_index_3.trim.bam.bai -rw-rw-r-- 1 jonas sigvardsson 6.7G Jan 10 15:19 NALM6_index_3.trim.fastq.gz

/mnt/data/common/jonas/atacseq/NALM6_ATACENCODE pipeline/out_NALM6_no_dup_rem_se//align/rep3: total 24G -rw-rw-r-- 1 jonas sigvardsson 15G Jan 10 21:59 NALM6_index_5.trim.bam -rw-rw-r-- 1 jonas sigvardsson 2.5M Jan 10 22:18 NALM6_index_5.trim.bam.bai -rw-rw-r-- 1 jonas sigvardsson 5.9G Jan 10 15:13 NALM6_index_5.trim.fastq.gz -rw-rw-r-- 1 jonas sigvardsson 3.2G Jan 11 01:06 NALM6_index_5.trim.filt.bam -rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 01:06 NALM6_index_5.trim.filt.bam.bai -rw-rw-r-- 1 jonas sigvardsson 177M Jan 11 01:15 NALM6_index_5.trim.filt. tagAlign.gz -rw-rw-r-- 1 jonas sigvardsson 178M Jan 11 01:17 NALM6_index_5.trim.filt.tn5.tagAlign.gz

$ ls -l $OUT_DIR/qc/rep*

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//qc/rep1:
total 16K
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:17 NALM6_index_1.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 224 Jan 10 20:46 NALM6_index_1.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:23 NALM6_index_1.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:01 NALM6_index_1.trim.read_length.txt

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//qc/rep2:
total 16K
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:18 NALM6_index_3.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 225 Jan 10 20:59 NALM6_index_3.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:59 NALM6_index_3.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:20 NALM6_index_3.trim.read_length.txt

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//qc/rep3:
total 16K
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:18 NALM6_index_5.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 225 Jan 10 20:35 NALM6_index_5.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:32 NALM6_index_5.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:13 NALM6_index_5.trim.read_length.txt

SUCCESSFUL COMMAND:

bds /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/atac.bds -species hg19 -no_xcor -enable_idr -auto_detect_adapter -out_dir out_NALM6_standard_se -rm_chr_from_tag mito -multimapping 4 -nth 8 -se -fastq1 NALM6_index_1.fastq.gz -fastq2 NALM6_index_3.fastq.gz -fastq3 NALM6_index_5.fastq.gz

ls -l $OUT_DIR/align/rep*

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/align/rep1:
total 18G
-rw-rw-r-- 1 jonas sigvardsson 12G Jan 10 22:02 NALM6_index_1.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 10 22:17 NALM6_index_1.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 4.7G Jan 10 15:01 NALM6_index_1.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 1022M Jan 11 01:38 NALM6_index_1.trim.nodup.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 01:38 NALM6_index_1.trim.nodup.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 114M Jan 11 01:47 NALM6_index_1.trim.nodup.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 114M Jan 11 01:48 NALM6_index_1.trim.nodup.tn5.tagAlign.gz

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/align/rep2:
total 24G
-rw-rw-r-- 1 jonas sigvardsson 16G Jan 10 22:33 NALM6_index_3.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.6M Jan 10 22:40 NALM6_index_3.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 6.7G Jan 10 15:20 NALM6_index_3.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 997M Jan 11 10:45 NALM6_index_3.trim.nodup.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 10:45 NALM6_index_3.trim.nodup.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 113M Jan 11 10:56 NALM6_index_3.trim.nodup.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 112M Jan 11 10:57 NALM6_index_3.trim.nodup.tn5.tagAlign.gz

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/align/rep3:
total 22G
-rw-rw-r-- 1 jonas sigvardsson 15G Jan 10 21:54 NALM6_index_5.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.5M Jan 10 22:17 NALM6_index_5.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 5.9G Jan 10 15:14 NALM6_index_5.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 1.1G Jan 11 01:38 NALM6_index_5.trim.nodup.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 01:38 NALM6_index_5.trim.nodup.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 121M Jan 11 01:49 NALM6_index_5.trim.nodup.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 121M Jan 11 01:50 NALM6_index_5.trim.nodup.tn5.tagAlign.gz

ls -l $OUT_DIR/qc/rep*

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/qc/rep1:
total 47M
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:19 NALM6_index_1.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 224 Jan 10 20:51 NALM6_index_1.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 1.4K Jan 11 01:32 NALM6_index_1.trim.dup.qc
-rw-rw-r-- 1 jonas sigvardsson 2.3M Jan 11 15:52 NALM6_index_1.trim.dupmark.ataqc.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:28 NALM6_index_1.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 383 Jan 11 01:38 NALM6_index_1.trim.nodup.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 62 Jan 11 01:44 NALM6_index_1.trim.nodup.pbc.qc
-rw-rw-r-- 1 jonas sigvardsson 766 Jan 11 12:28 NALM6_index_1.trim.picardcomplexity.qc
-rw-rw-r-- 1 jonas sigvardsson 462K Jan 11 14:33 NALM6_index_1.trim.preseq.dat
-rw-rw-r-- 1 jonas sigvardsson 47K Jan 11 14:33 NALM6_index_1.trim.preseq.log
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:01 NALM6_index_1.trim.read_length.txt
-rw-rw-r-- 1 jonas sigvardsson 44M Jan 11 16:10 NALM6_index_1.trim.signal
-rw-rw-r-- 1 jonas sigvardsson 4.9K Jan 11 12:16 NALM6_index_1.trim_gc.txt
-rw-rw-r-- 1 jonas sigvardsson 8.8K Jan 11 12:16 NALM6_index_1.trim_gcPlot.pdf
-rw-rw-r-- 1 jonas sigvardsson 1.2K Jan 11 12:16 NALM6_index_1.trim_gcSummary.txt
-rw-rw-r-- 1 jonas sigvardsson 228K Jan 11 16:07 NALM6_index_1.trim_large_tss-enrich.png
-rw-rw-r-- 1 jonas sigvardsson 658K Jan 11 16:10 NALM6_index_1.trim_qc.html
-rw-rw-r-- 1 jonas sigvardsson 1.8K Jan 11 16:10 NALM6_index_1.trim_qc.txt
-rw-rw-r-- 1 jonas sigvardsson 43K Jan 11 16:06 NALM6_index_1.trim_tss-enrich.png

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/qc/rep2:
total 48M
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:19 NALM6_index_3.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 225 Jan 10 20:56 NALM6_index_3.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 1.4K Jan 11 10:39 NALM6_index_3.trim.dup.qc
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 11 17:17 NALM6_index_3.trim.dupmark.ataqc.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:56 NALM6_index_3.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 383 Jan 11 10:45 NALM6_index_3.trim.nodup.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 61 Jan 11 10:54 NALM6_index_3.trim.nodup.pbc.qc
-rw-rw-r-- 1 jonas sigvardsson 766 Jan 11 12:32 NALM6_index_3.trim.picardcomplexity.qc
-rw-rw-r-- 1 jonas sigvardsson 459K Jan 11 15:28 NALM6_index_3.trim.preseq.dat
-rw-rw-r-- 1 jonas sigvardsson 62K Jan 11 15:28 NALM6_index_3.trim.preseq.log
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:21 NALM6_index_3.trim.read_length.txt
-rw-rw-r-- 1 jonas sigvardsson 44M Jan 11 17:39 NALM6_index_3.trim.signal
-rw-rw-r-- 1 jonas sigvardsson 4.9K Jan 11 12:16 NALM6_index_3.trim_gc.txt
-rw-rw-r-- 1 jonas sigvardsson 8.8K Jan 11 12:16 NALM6_index_3.trim_gcPlot.pdf
-rw-rw-r-- 1 jonas sigvardsson 1.2K Jan 11 12:16 NALM6_index_3.trim_gcSummary.txt
-rw-rw-r-- 1 jonas sigvardsson 225K Jan 11 17:36 NALM6_index_3.trim_large_tss-enrich.png
-rw-rw-r-- 1 jonas sigvardsson 647K Jan 11 17:40 NALM6_index_3.trim_qc.html
-rw-rw-r-- 1 jonas sigvardsson 1.8K Jan 11 17:40 NALM6_index_3.trim_qc.txt
-rw-rw-r-- 1 jonas sigvardsson 43K Jan 11 17:36 NALM6_index_3.trim_tss-enrich.png

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_standard_se/qc/rep3:
total 48M
-rw-rw-r-- 1 jonas sigvardsson 407 Jan 10 14:19 NALM6_index_5.adapter.txt
-rw-rw-r-- 1 jonas sigvardsson 225 Jan 10 20:30 NALM6_index_5.trim.align.log
-rw-rw-r-- 1 jonas sigvardsson 1.4K Jan 11 01:32 NALM6_index_5.trim.dup.qc
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 11 16:44 NALM6_index_5.trim.dupmark.ataqc.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 392 Jan 10 22:31 NALM6_index_5.trim.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 383 Jan 11 01:39 NALM6_index_5.trim.nodup.flagstat.qc
-rw-rw-r-- 1 jonas sigvardsson 62 Jan 11 01:46 NALM6_index_5.trim.nodup.pbc.qc
-rw-rw-r-- 1 jonas sigvardsson 766 Jan 11 12:31 NALM6_index_5.trim.picardcomplexity.qc
-rw-rw-r-- 1 jonas sigvardsson 460K Jan 11 15:06 NALM6_index_5.trim.preseq.dat
-rw-rw-r-- 1 jonas sigvardsson 57K Jan 11 15:06 NALM6_index_5.trim.preseq.log
-rw-rw-r-- 1 jonas sigvardsson 3 Jan 10 15:14 NALM6_index_5.trim.read_length.txt
-rw-rw-r-- 1 jonas sigvardsson 44M Jan 11 17:04 NALM6_index_5.trim.signal
-rw-rw-r-- 1 jonas sigvardsson 4.9K Jan 11 12:16 NALM6_index_5.trim_gc.txt
-rw-rw-r-- 1 jonas sigvardsson 8.8K Jan 11 12:16 NALM6_index_5.trim_gcPlot.pdf
-rw-rw-r-- 1 jonas sigvardsson 1.2K Jan 11 12:16 NALM6_index_5.trim_gcSummary.txt
-rw-rw-r-- 1 jonas sigvardsson 227K Jan 11 17:01 NALM6_index_5.trim_large_tss-enrich.png
-rw-rw-r-- 1 jonas sigvardsson 652K Jan 11 17:05 NALM6_index_5.trim_qc.html
-rw-rw-r-- 1 jonas sigvardsson 1.8K Jan 11 17:05 NALM6_index_5.trim_qc.txt
-rw-rw-r-- 1 jonas sigvardsson 43K Jan 11 17:01 NALM6_index_5.trim_tss-enrich.png


jonasungerback commented 6 years ago

Thanks, I attach it here. Seems like deduplication fails even though I specifically gave the option not to deduplicate but there may be more going on that I do not get.

Jonas atac.bds.20180110_141743_834.report.html.gz

leepc12 commented 6 years ago

1) Error in deduplication: sambamba-sort: Unable to write to stream. This means that you don't have enough memory or space in $TMP, $TMPDIR or /tmp.

2) Error in spr: again out of memory or disk space? Please run the following for more debugging info.

$ ls -l /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1

$ free -h
$ df -h
$ df -h $TMP
$ df -h $TMPDIR
$ df -h /tmp
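
The checks above can also be scripted. The following is a minimal sketch, not part of the pipeline: `free_kb` and `report_tmp_space` are hypothetical helper names. It skips variables that are unset, because an unquoted `df -h $TMP` with an empty `$TMP` lists every filesystem rather than the temp directory.

```shell
# Sketch only: helper names are hypothetical, not part of atac_dnase_pipelines.

# Print the free space (in 1024-byte blocks) of the filesystem holding a directory.
# Returns non-zero if the argument is empty or not a directory.
free_kb() {
  [ -n "$1" ] && [ -d "$1" ] || return 1
  # -P forces one output line per filesystem, -k reports 1024-byte blocks;
  # column 4 of the second line is the available space.
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Report free space for $TMP, $TMPDIR and /tmp, flagging unset variables.
report_tmp_space() {
  for entry in "TMP=${TMP:-}" "TMPDIR=${TMPDIR:-}" "/tmp=/tmp"; do
    name=${entry%%=*}
    dir=${entry#*=}
    if kb=$(free_kb "$dir"); then
      echo "$name ($dir): $kb KB free"
    else
      echo "$name: not set or not a directory"
    fi
  done
}
```

Calling `report_tmp_space` prints one line per location, which makes an unset `$TMP` obvious instead of hiding it behind a full `df` listing.
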
leepc12 commented 6 years ago

The pipeline needs at least 20G of memory. If you are working on a laptop/desktop, you will need to add the parameter -nth 1 to disable parallelization and multi-threading.

jonasungerback commented 6 years ago

Yes, I understood that from previous threads, and I ran into that problem myself, but then Ubuntu issued a warning. That was, however, before I changed the tmpdir, and this problem was only discovered afterwards. I have not set a TMP variable, though.

free -h
              total        used        free      shared  buff/cache   available
Mem:           251G        2.7G         41G        110M        207G        247G
Swap:           10G          0B         10G

df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            126G     0  126G   0% /dev
tmpfs            26G   18M   26G   1% /run
/dev/sda2       213G   39G  174G  19% /
tmpfs           126G   46M  126G   1% /dev/shm
tmpfs           5.0M   12K  5.0M   1% /run/lock
tmpfs           126G     0  126G   0% /sys/fs/cgroup
/dev/sda1       922M  487M  372M  57% /boot
/dev/sdb2       7.3T  143G  7.2T   2% /home
/dev/sdb1        22T  2.3T   20T  11% /mnt/data
tmpfs            26G   40K   26G   1% /run/user/1002

df -h $TMP
Filesystem      Size  Used Avail Use% Mounted on
udev            126G     0  126G   0% /dev
tmpfs            26G   18M   26G   1% /run
/dev/sda2       213G   39G  174G  19% /
tmpfs           126G   49M  126G   1% /dev/shm
tmpfs           5.0M   12K  5.0M   1% /run/lock
tmpfs           126G     0  126G   0% /sys/fs/cgroup
/dev/sda1       922M  487M  372M  57% /boot
/dev/sdb2       7.3T  143G  7.2T   2% /home
/dev/sdb1        22T  2.3T   20T  11% /mnt/data
tmpfs            26G   40K   26G   1% /run/user/1002

df -h $TMPDIR
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1        22T  2.3T   20T  11% /mnt/data

df -h /tmp
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       213G   39G  174G  19% /

Is it $TMP itself that is the problem perhaps?

Jonas

leepc12 commented 6 years ago

Can you run the following?

$ ls -l /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1
leepc12 commented 6 years ago

Yeah, that sambamba error usually occurs when you don't have enough space in $TMP, $TMPDIR or /tmp. If you run multiple pipelines or jobs at the same time, your temp directory can quickly fill up.

jonasungerback commented 6 years ago

Here it comes but if it is a memory problem, why does it happen specifically with the -no_dup_removal option activated? That makes little sense to me.

ls -l /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1
total 20G
-rw-rw-r-- 1 jonas sigvardsson 12G Jan 10 21:55 NALM6_index_1.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 10 22:12 NALM6_index_1.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 4.7G Jan 10 15:01 NALM6_index_1.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 2.7G Jan 11 01:08 NALM6_index_1.trim.filt.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 11 01:08 NALM6_index_1.trim.filt.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 161M Jan 11 01:16 NALM6_index_1.trim.filt.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 161M Jan 11 01:18 NALM6_index_1.trim.filt.tn5.tagAlign.gz

leepc12 commented 6 years ago

-no_dup_removal was actually added for our lab's internal use, so we cannot guarantee that the pipeline works with it for your sample. If you tried it because you got an error in the deduping step, please remove it. The error in the deduping step is clearly a disk space (or memory) problem. Please see https://github.com/biod/sambamba/issues/218

Please set $TMP and $TMPDIR in your ~/.bashrc to a directory on fast, local, large storage and try again. About the second error (spr), I am not sure. Can you remove all *.tagAlign.gz files in the output directory and try again? Please run only one pipeline, and do so when your cluster is idle.

$ cd [WORK_DIR]
$ find -name '*.tagAlign.gz' -delete
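
As a sketch of the `~/.bashrc` change suggested above: the helper name `use_scratch_tmp` and the path in the usage comment are assumptions, so pick a directory on a large, fast, local disk.

```shell
# Sketch for ~/.bashrc. `use_scratch_tmp` is a hypothetical helper, and the
# path in the usage comment below is only a placeholder.
use_scratch_tmp() {
  # Create the directory if needed, then point both variables at it.
  mkdir -p "$1" || return 1
  TMP=$1
  TMPDIR=$1
  export TMP TMPDIR
}

# Example (placeholder path):
# use_scratch_tmp /path/to/big/local/disk/tmp
```

Setting both variables matters here because the thread's `df` output shows `$TMPDIR` pointing at /mnt/data while `$TMP` was empty.
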
jonasungerback commented 6 years ago

Thanks, I'll try that and let you know how it worked out. It is very much appreciated that you take this time to help out.

Jonas

jonasungerback commented 6 years ago

I understand that it may be a bit tricky to get -no_dup_removal to work if it was specifically designed for your lab but I think it would be a great feature to have, especially when running single-end data and this is where it seems to break down with that option. It seems to work well with paired-end data and the option enabled however. I have set both $TMP and $TMPDIR so I am fairly sure memory isn't the problem but when I run the command with the following option.

Anyway, I ran the $ find -name '*.tagAlign.gz' -delete and tried to re-run and got the same error:

`shuf: ‘/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz’: end of file
Task failed:
Program & line : '/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/modules/postalign_bed.bds', line 152
Task Name : 'spr rep1'
Task ID : 'atac.bds.20180126_084255_226_parallel_44/task.postalign_bed.spr_rep1.line_152.id_16'
Task PID : '41253'
Task hint : 'if [[ -f $(which conda) && $(conda env list | grep bds_atac | wc -l) != "0" ]]; then source activate bds_atac; sleep 5; fi; export PATH=/mnt/data/bio'
Task resources : 'cpus: -1 mem: -1.0 B wall-timeout: 8640000'
State : 'ERROR'
Dependency state : 'ERROR'
Retries available : '1'
Input files : '[/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz]'
Output files : '[/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr1/NALM6_index_1.trim.filt.tn5.pr1.tagAlign.gz, /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr2/NALM6_index_1.trim.filt.tn5.pr2.tagAlign.gz]'
Script file : '/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/atac.bds.20180126_084255_226_parallel_44/task.postalign_bed.spr_rep1.line_152.id_16.sh'
Exit status : '1'
StdErr (10 lines) :
shuf: ‘/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz’: end of file

Fatal error: /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/atac.bds, line 662, pos 4. Task/s failed.`

It looks to me like the file starts to be generated but is then terminated prematurely, which I assume is where the EOF error comes from. I would expect the tagAlign files to be 2-3 times this size when the duplicates are not removed.

ls -l rep1/
total 23G
-rw-rw-r-- 1 jonas sigvardsson 12G Jan 25 23:25 NALM6_index_1.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 25 23:29 NALM6_index_1.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 2.8G Jan 26 08:34 NALM6_index_1.trim.dupmark.bam
-rw-rw-r-- 1 jonas sigvardsson 4.7G Jan 25 17:27 NALM6_index_1.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 2.7G Jan 26 01:52 NALM6_index_1.trim.filt.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 26 01:52 NALM6_index_1.trim.filt.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 161M Jan 26 08:49 NALM6_index_1.trim.filt.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 161M Jan 26 08:52 NALM6_index_1.trim.filt.tn5.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 26 08:40 NALM6_index_1.trim.nodup.bam.bai

leepc12 commented 6 years ago

Let's check if that tagAlign file is good.

$ OUT_DIR=/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se
$ TA=$OUT_DIR/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz
$ zcat $TA | wc -l
$ zcat $TA | head
$ zcat $TA | tail
$ tail -n10000 $OUT_DIR/qc/*/*.qc
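
These checks can be bundled into one function. The sketch below assumes a standard six-column tagAlign file (chrom, start, end, name, score, strand), and `check_tagalign` is a hypothetical name. It tests gzip integrity first, so a truncated archive is reported before any record is inspected.

```shell
# Hypothetical helper: verify that a gzipped tagAlign file is intact and
# that every record has six whitespace-separated fields.
check_tagalign() {
  # gzip -t exits non-zero on a truncated or corrupt archive.
  gzip -t "$1" 2>/dev/null || { echo "corrupt or truncated gzip: $1" >&2; return 1; }
  # Count records and flag any line that does not have exactly six fields.
  gzip -dc "$1" | awk '
    NF != 6 { bad++ }
    END { printf "%d lines, %d malformed\n", NR, bad + 0; exit (bad > 0) }'
}
```

For example, `check_tagalign "$TA"` prints the record count and returns non-zero if the archive is damaged or any record is malformed.
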
jonasungerback commented 6 years ago

Here is one of the samples in the triplicates, but the others look very similar.

`jonas@atlas /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1 $ zcat $TA | wc -l
69938309

jonas@atlas /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1 $ zcat $TA | head
chr1 10456 10526 N 1000 +
chr1 11409 11446 N 1000 +
chr1 15185 15250 N 1000 -
chr1 26256 26290 N 1000 +
chr1 40876 40947 N 1000 +
chr1 40884 40955 N 1000 +
chr1 40930 41001 N 1000 -
chr1 41047 41116 N 1000 -
chr1 41110 41182 N 1000 +
chr1 41381 41426 N 1000 +

jonas@atlas /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1 $ zcat $TA | tail
chrM 16539 16571 N 1000 +
chrM 16539 16571 N 1000 +
chrM 16539 16571 N 1000 +
chrM 16539 16569 N 1000 +
chrM 16539 16571 N 1000 +
chrM 16539 16571 N 1000 +
chrM 16539 16571 N 1000 +
chrM 16539 16571 N 1000 +
chrM 16539 16571 N 1000 +
chrM 16539 16571 N 1000 +

jonas@atlas /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1 $ tail -n10000 $OUT_DIR/qc/*/*.qc
tail: cannot open '/qc/*/*.qc' for reading: No such file or directory

jonas@atlas /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1 $ OUT_DIR=/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/

jonas@atlas /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1 $ tail -n10000 $OUT_DIR/qc/*/*.qc
==> /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//qc/rep1/NALM6_index_1.trim.flagstat.qc <==
310113958 + 0 in total (QC-passed reads + QC-failed reads)
191843245 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
307957552 + 0 mapped (99.30%:N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A:N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A:N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

==> /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//qc/rep2/NALM6_index_3.trim.flagstat.qc <==
433469132 + 0 in total (QC-passed reads + QC-failed reads)
270786919 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
431225345 + 0 mapped (99.48%:N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A:N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A:N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

==> /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se//qc/rep3/NALM6_index_5.trim.flagstat.qc <==
392661156 + 0 in total (QC-passed reads + QC-failed reads)
244931921 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
390512774 + 0 mapped (99.45%:N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A:N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A:N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5) `

jonasungerback commented 6 years ago

I know this is outside the scope of this thread, but I have a more mundane and hopefully easier question. I have been playing around a bit with different -mapq_thresh scores. Even if I set -mapq_thresh 10, I still get a QC report based on a value of 30, and it seems like the bam file is filtered on this value as well. Am I doing something wrong when giving it this option? Here is an example of that command:
bds /mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/atac.bds -species mm10 -enable_idr -no_xcor -auto_detect_adapter -mapq_thresh 10 -multimapping 4 -rm_chr_from_tag mito -nth 24 -fastq1_1 R11.fastq.gz -fastq1_2 R12.fastq.gz -fastq2_1 R21.fastq.gz -fastq2_2 R22.fastq.gz

leepc12 commented 6 years ago

For multimapping reads, mapq_thresh is fixed at 30. Sorry, I will add this to the README. For the EOF error, can you extract the actual command line for the shuf command? You can get it from the log file where you found the shuf "end of file" error message for task 'spr rep1' that you posted above.

Thanks,

Jin


akundaje commented 6 years ago

Jin,

If you provide a MAPQ threshold parameter, you should use the value the user provides for filtering and QC.

Anshul.


jonasungerback commented 6 years ago

`# SYS command. line 154

if [[ -f $(which conda) && $(conda env list | grep bds_atac | wc -l) != "0" ]]; then source activate bds_atac; sleep 5; fi; export PATH=/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/.:/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/modules:/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/utils:${PATH}:/bin:/usr/bin:/usr/local/bin:${HOME}/.bds; set -o pipefail; STARTTIME=$(date +%s)

# SYS command. line 157

nlines=$( zcat /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz | wc -l )

# SYS command. line 158

nlines=$(( (nlines + 1) / 2 ))

# SYS command. line 162

zcat /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz | shuf --random-source=/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz | split -d -l $((nlines)) - /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr1/NALM6_index_1.trim.filt.tn5.

# SYS command. line 165

gzip -nc /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr1/NALM6_index_1.trim.filt.tn5.00 > /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr1/NALM6_index_1.trim.filt.tn5.pr1.tagAlign.gz

# SYS command. line 166

rm -f /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr1/NALM6_index_1.trim.filt.tn5.00

# SYS command. line 167

gzip -nc /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr1/NALM6_index_1.trim.filt.tn5.01 > /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr2/NALM6_index_1.trim.filt.tn5.pr2.tagAlign.gz

# SYS command. line 168

rm -f /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr1/NALM6_index_1.trim.filt.tn5.01

# SYS command. line 170

TASKTIME=$[$(date +%s)-${STARTTIME}]; echo "Task has finished (${TASKTIME} seconds)."; sleep 0`

Input files:

  /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz

Output files:

  /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr1/NALM6_index_1.trim.filt.tn5.pr1.tagAlign.gz
  /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr2/NALM6_index_1.trim.filt.tn5.pr2.tagAlign.gz

leepc12 commented 6 years ago

@akundaje Sorry, we already discussed this issue in the ataqc channel on Slack. @jonasungerback Let me correct this: when multimapping>0, mapq_thresh is ignored (not fixed at 30).
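
To catch this combination before launching a long run, a small pre-flight check could look like the following sketch. The function name is hypothetical, and the rule it encodes is only the one stated above (`-mapq_thresh` is ignored when `-multimapping` is greater than 0); it does not inspect the pipeline itself.

```shell
# Hypothetical pre-flight check for an atac.bds argument list.
# Returns 1 (and warns) if -mapq_thresh is combined with -multimapping > 0.
check_mapq_flags() {
  mapq=""
  multi=0
  # Scan the arguments; the value following each flag of interest is recorded.
  while [ $# -gt 0 ]; do
    case "$1" in
      -mapq_thresh)  mapq=${2:-} ;;
      -multimapping) multi=${2:-0} ;;
    esac
    shift
  done
  if [ -n "$mapq" ] && [ "$multi" -gt 0 ] 2>/dev/null; then
    echo "warning: -mapq_thresh $mapq is ignored because -multimapping is $multi" >&2
    return 1
  fi
  return 0
}
```

For instance, `check_mapq_flags -species mm10 -mapq_thresh 10 -multimapping 4` warns and returns 1, while the same call without `-multimapping 4` returns 0.
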

leepc12 commented 6 years ago

Please run the following and post outputs here.

if [[ -f $(which conda) && $(conda env list | grep bds_atac | wc -l) != "0" ]]; then source activate bds_atac; sleep 5; fi; export PATH=/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/.:/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/modules:/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/utils:${PATH}:/bin:/usr/bin:/usr/local/bin:${HOME}/.bds; set -o pipefail; STARTTIME=$(date +%s)

nlines=$( zcat /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz | wc -l )

nlines=$(( (nlines + 1) / 2 ))

zcat /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz | shuf --random-source=/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz | split -d -l $((nlines)) - /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps/rep1/pr1/NALM6_index_1.trim.filt.tn5.
jonasungerback commented 6 years ago

Thanks, then I know that those two cannot be combined.

What comes up when I put the above commands in is the following:

shuf: ‘/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz’: end of file
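
One possible reading of this message, not confirmed anywhere in this thread: GNU `shuf` reports `end of file` on the file named by `--random-source` when that file runs out of bytes before the shuffle is complete. In the command above the random source is the 161M compressed tagAlign file itself, while the input is roughly 70 million lines. The sketch below reproduces that behaviour on a tiny scale; `shuf_with_bytes` is a hypothetical helper, and zero bytes are used only to keep the demonstration deterministic.

```shell
# Hypothetical demo: shuffle N lines with a random source of a given size.
# Returns non-zero when shuf exhausts the random source.
shuf_with_bytes() {
  nlines=$1
  nbytes=$2
  src=$(mktemp) || return 1
  # Build a random source containing exactly $nbytes bytes.
  head -c "$nbytes" /dev/zero > "$src"
  seq 1 "$nlines" | shuf --random-source="$src" > /dev/null 2>&1
  status=$?
  rm -f "$src"
  return "$status"
}

# shuf_with_bytes 1000 10      # too few bytes: shuf fails with "end of file"
# shuf_with_bytes 1000 100000  # enough bytes: shuf succeeds
```

If this is indeed the cause, a random source that cannot run dry (for example a seeded, endless byte stream) would avoid the error while keeping the split reproducible.
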

leepc12 commented 6 years ago

Please run the following:

if [[ -f $(which conda) && $(conda env list | grep bds_atac | wc -l) != "0" ]]; then source activate bds_atac; sleep 5; fi; export PATH=/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/.:/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/modules:/mnt/data/bioinfo_tools_and_refs/bioinfo_tools/kundajelab_ENCODE_atac_dnase_pipeline/atac_dnase_pipelines/utils:${PATH}:/bin:/usr/bin:/usr/local/bin:${HOME}/.bds; set -o pipefail; STARTTIME=$(date +%s)

nlines=$( zcat /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz | wc -l )

nlines=$(( (nlines + 1) / 2 ))

echo $nlines

ls -l /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz

ls -l /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/*

jonasungerback commented 6 years ago

Again, thank you for your effort. What are we looking for? Maybe I can be of more assistance. Here's the output from the commands:

echo $nlines
34969155

ls -l /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 161M Jan 30 12:41 /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz

(bds_atac) jonas@atlas ~ $ ls -l /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/*
/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/pseudo_reps:
total 0
drwxrwxr-x 4 jonas sigvardsson 40 Jan 30 12:41 rep1
drwxrwxr-x 4 jonas sigvardsson 40 Jan 30 14:06 rep2
drwxrwxr-x 4 jonas sigvardsson 40 Jan 30 13:34 rep3

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1:
total 20G
-rw-rw-r-- 1 jonas sigvardsson 12G Jan 30 11:25 NALM6_index_1.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.4M Jan 30 11:29 NALM6_index_1.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 4.7G Jan 30 08:32 NALM6_index_1.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 2.7G Jan 30 12:31 NALM6_index_1.trim.filt.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 30 12:31 NALM6_index_1.trim.filt.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 161M Jan 30 12:39 NALM6_index_1.trim.filt.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 161M Jan 30 12:41 NALM6_index_1.trim.filt.tn5.tagAlign.gz

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep2:
total 27G
-rw-rw-r-- 1 jonas sigvardsson 16G Jan 30 12:31 NALM6_index_3.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.6M Jan 30 12:37 NALM6_index_3.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 6.7G Jan 30 08:49 NALM6_index_3.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 3.6G Jan 30 13:53 NALM6_index_3.trim.filt.bam
-rw-rw-r-- 1 jonas sigvardsson 6.3M Jan 30 13:53 NALM6_index_3.trim.filt.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 170M Jan 30 14:03 NALM6_index_3.trim.filt.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 171M Jan 30 14:06 NALM6_index_3.trim.filt.tn5.tagAlign.gz

/mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep3:
total 24G
-rw-rw-r-- 1 jonas sigvardsson 15G Jan 30 12:06 NALM6_index_5.trim.bam
-rw-rw-r-- 1 jonas sigvardsson 2.5M Jan 30 12:12 NALM6_index_5.trim.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 5.9G Jan 30 08:42 NALM6_index_5.trim.fastq.gz
-rw-rw-r-- 1 jonas sigvardsson 3.2G Jan 30 13:22 NALM6_index_5.trim.filt.bam
-rw-rw-r-- 1 jonas sigvardsson 6.2M Jan 30 13:22 NALM6_index_5.trim.filt.bam.bai
-rw-rw-r-- 1 jonas sigvardsson 177M Jan 30 13:31 NALM6_index_5.trim.filt.tagAlign.gz
-rw-rw-r-- 1 jonas sigvardsson 178M Jan 30 13:34 NALM6_index_5.trim.filt.tn5.tagAlign.gz

leepc12 commented 6 years ago

This is weird: the end of file error in shuf occurs only when the file used for --random-source is too short (< several bytes). Can you send me /mnt/data/common/jonas/atacseq/NALM6_ATAC_ENCODE_pipeline/out_NALM6_no_dup_rem_se/align/rep1/NALM6_index_1.trim.filt.tn5.tagAlign.gz? You can upload it to your Google Drive or Dropbox and share it with us.
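
The end of file behaviour can be reproduced on toy data. This is a minimal sketch, assuming GNU coreutils `shuf`; the temp directory and the `tiny_source` file are hypothetical and unrelated to the pipeline's files. It gives `shuf` a two-byte file as its random source and 1000 lines to permute, so the source runs out before the permutation is finished. More generally, any finite file can be exhausted if the input needs more random bytes than the file holds. The thread does not confirm whether that is what happened here.

```shell
#!/usr/bin/env bash
# Reproduce shuf's "end of file" error with a deliberately short random source.
set -u
export LC_ALL=C                      # keep the error message in English

workdir=$(mktemp -d)
printf 'ab' > "$workdir/tiny_source" # only two bytes of "randomness"

# shuf reads bytes from --random-source while permuting its input;
# two bytes are not enough for 1000 lines, so it stops with an error.
if seq 1 1000 | shuf --random-source="$workdir/tiny_source" \
        > /dev/null 2> "$workdir/err.txt"; then
    status=0
else
    status=$?
fi

echo "shuf exit status: $status"
cat "$workdir/err.txt"
```

The message written to `err.txt` has the same form as the one reported above, with the path of the short file in place of the tagAlign path.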

jonasungerback commented 6 years ago

I apologize for the late answer; it's been a busy week. I have created a folder on Dropbox and uploaded the files.

leepc12 commented 6 years ago

@jonasungerback sorry for the long delay. Please get the latest pipeline and try again with the -no_random_source flag.

jonasungerback commented 6 years ago

Thank you! Sadly the same problem persists.

leepc12 commented 6 years ago

@jonasungerback please check whether your modules/postalign_bed.bds contains the no_random_source variable.
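
One way to run this check from the shell is sketched below. It assumes you are in the root of your pipeline checkout. The helper name `has_no_random_source` is made up for this example and is not part of the pipeline.

```shell
#!/usr/bin/env bash
# has_no_random_source FILE: succeed if FILE mentions the no_random_source variable.
# (Hypothetical helper; it only searches for the literal string.)
has_no_random_source() {
    grep -q 'no_random_source' "$1"
}

# Path is relative to the pipeline checkout; adjust if you run this from elsewhere.
bds_file="modules/postalign_bed.bds"

if [ -f "$bds_file" ] && has_no_random_source "$bds_file"; then
    echo "found: this checkout appears to include the no_random_source change"
else
    echo "not found: try 'git pull' in the pipeline directory, then check again"
fi
```

If the variable is missing, the local copy is probably older than the fix, so the -no_random_source flag would have no effect.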

jonasungerback commented 6 years ago

It is not there. Is that good or bad?

leepc12 commented 6 years ago

@jonasungerback did you git pull the latest fix commit? Did you run with -no_random_source?

jonasungerback commented 6 years ago

Now I have, and it looks like it is working. I will run some more thorough tests, but thank you so very much for all your kind help.