databio / pepatac

A modular, containerized pipeline for ATAC-seq data processing
http://pepatac.databio.org
BSD 2-Clause "Simplified" License

running PEPATAC example shows float(pm.get_stat("Raw_reads")) error, float() argument must be a string or a real number, not 'NoneType' #258

Closed yitaoyitaoyitao closed 5 months ago

yitaoyitaoyitao commented 7 months ago

I got an error when running PEPATAC example:

File "/home/xiutao/software/pepatac/pipelines/pepatac.py", line 1101, in check_alignment_genome
    rr = float(pm.get_stat("Raw_reads"))
TypeError: float() argument must be a string or a real number, not 'NoneType'

PEPATAC_log.txt
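The traceback reduces to `float(None)`: `pm.get_stat("Raw_reads")` returned `None`, which suggests no `Raw_reads` result was available when `check_alignment_genome` ran. Below is a minimal sketch of that failure and of a defensive conversion. The `get_stat` function and `_stats` dictionary are hypothetical stand-ins written for illustration, not pypiper's actual implementation, and the stored value is made up.

```python
from typing import Optional

# Hypothetical stand-in for a pipeline's stats store; NOT pypiper's real code.
# The value below is illustrative only.
_stats = {"Trimmed_reads": "100"}


def get_stat(key: str) -> Optional[str]:
    """Return the stored value, or None when the stat was never reported."""
    return _stats.get(key)


def stat_as_float(key: str) -> Optional[float]:
    """Convert a stat to float; return None instead of raising when it is missing."""
    value = get_stat(key)
    return None if value is None else float(value)


# Same shape as the failing line: float() is handed None.
try:
    float(get_stat("Raw_reads"))
    error_message = ""
except TypeError as err:
    error_message = str(err)

print(error_message)
print(stat_as_float("Raw_reads"), stat_as_float("Trimmed_reads"))
```

A guard like `stat_as_float` only hides the symptom; the underlying question is still why `Raw_reads` was never recorded.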

donaldcampbelljr commented 7 months ago

I believe the bug related to this issue has been fixed in the pre-release version of pypiper: https://github.com/databio/pypiper/releases/tag/v0.14.0a1

Herais commented 6 months ago

I think I still get the Raw_reads error when using pypiper v0.14.0a1.

Version log:

Arguments passed to pipeline:

Initialized Pipestat Object:


['/wynton/home/alexanian/veexu/env_py3.10b/pepatac/pipelines', '/wynton/home/alexanian/veexu/Python-3.10.13/lib/python310.zip', '/wynton/home/alexanian/veexu/Python-3.10.13/lib/python3.10', '/wynton/home/alexanian/veexu/Python-3.10.13/lib/python3.10/lib-dynload', '/wynton/home/alexanian/veexu/env_py3.10b/lib/python3.10/site-packages', '/wynton/home/alexanian/veexu/env_py3.10b/lib/python3.10/site-packages/piper-0.13.2-py3.10.egg', '/wynton/home/alexanian/veexu/env_py3.10b/lib/python3.10/site-packages/piper-0.14.0a1-py3.10.egg', '/wynton/home/alexanian/veexu/env_py3.10b/lib/python3.10/site-packages/pipestat-0.6.0a8-py3.10.egg']
Using default schema: /wynton/home/alexanian/veexu/env_py3.10b/pepatac/pipelines/pipestat_output_schema.yaml
Local input file: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/00_fastq_raw/AZ_ATAC_sample6_S6_R1_001.fastq.gz
Local input file: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/00_fastq_raw/AZ_ATAC_sample6_S6_R2_001.fastq.gz
These results exist for 'DEFAULT_SAMPLE_NAME': File_mb
These results exist for 'DEFAULT_SAMPLE_NAME': Read_type
Result successfully reported? False
These results exist for 'DEFAULT_SAMPLE_NAME': Genome
Result successfully reported? False

Merge/link and fastq conversion: (12-27 16:39:04) elapsed: 0.0 TIME

Number of input file sets: 2
Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1.fastq.gz

ln -sf /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/00_fastq_raw/AZ_ATAC_sample6_S6_R1_001.fastq.gz /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1.fastq.gz (4173566)


Command completed. Elapsed time: 0:00:00. Running peak memory: 0.009GB.
PID: 4173566; Command: ln; Return code: 0; Memory used: 0.009GB

Local input file: '/wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1.fastq.gz'
Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R2.fastq.gz

ln -sf /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/00_fastq_raw/AZ_ATAC_sample6_S6_R2_001.fastq.gz /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R2.fastq.gz (4173567)


Command completed. Elapsed time: 0:00:00. Running peak memory: 0.009GB.
PID: 4173567; Command: ln; Return code: 0; Memory used: 0.009GB

Local input file: '/wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R2.fastq.gz'
Found .fastq.gz file
Found .fq.gz file; no conversion necessary
Found .fastq.gz file
Found .fq.gz file; no conversion necessary
Target exists: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1.fastq.gz
Target exists: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R2.fastq.gz

Adapter trimming: (12-27 16:39:05) elapsed: 0.0 TIME

Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1_trim.fastq

skewer -f sanger -t 1 -m pe -x /wynton/home/alexanian/veexu/env_py3.10b/pepatac/tools/NexteraPE-PE.fa --quiet -o /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38 /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1.fastq.gz /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R2.fastq.gz (4173576)

.--. .-.
: .--': :.-.
`. `. : `'.' .--. .-..-..-. .--. .--.
_`, :: . `.' '_.': `; `; :' '_.': ..'
`.__.':_;:_;`.__.'`.__.__.'`.__.':_;
skewer v0.2.2 [April 4, 2016]
Parameters used:
-- 3' end adapter sequences in file (-x):   /wynton/home/alexanian/veexu/env_py3.10b/pepatac/tools/NexteraPE-PE.fa
A:  AGATGTGTATAAGAGACAG
B:  AGATGTGTATAAGAGACAG
C:  TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG
D:  CTGTCTCTTATACACATCTGACGCTGCCGACGA
E:  GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG
F:  CTGTCTCTTATACACATCTCCGAGCCCACGAGA
-- maximum error ratio allowed (-r):    0.100
-- maximum indel error ratio allowed (-d):  0.030
-- minimum read length allowed after trimming (-l): 18
-- file format (-f):        Sanger/Illumina 1.8+ FASTQ 
Wed Dec 27 16:39:05 2023 >> started

Wed Dec 27 17:28:46 2023 >> done (2980.977s)
26901947 read pairs processed; of these:
570 ( 0.00%) short read pairs filtered out after trimming by size control
1 ( 0.00%) empty read pairs filtered out after trimming by size control
26901376 (100.00%) read pairs available; of these:
14926038 (55.48%) trimmed read pairs available after processing
11975338 (44.52%) untrimmed read pairs available after processing
log has been saved to "/wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38-trimmed.log".

Command completed. Elapsed time: 0:49:41. Running peak memory: 0.009GB.
PID: 4173576; Command: skewer; Return code: 0; Memory used: 0.007GB

mv /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38-trimmed-pair1.fastq /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1_trim.fastq (10687)


Command completed. Elapsed time: 0:00:00. Running peak memory: 0.009GB.
PID: 10687; Command: mv; Return code: 0; Memory used: 0.009GB

mv /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38-trimmed-pair2.fastq /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R2_trim.fastq (10689)


Command completed. Elapsed time: 0:00:00. Running peak memory: 0.009GB.
PID: 10689; Command: mv; Return code: 0; Memory used: 0.009GB

Evaluating read trimming

Trimmed_reads 53802752 RES

pipestat_modified_time 2023-12-27 17:29:21 RES

Missing stat 'Raw_reads'
Can't calculate trim loss rate without raw read result.

Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1_trim_fastqc.html

fastqc --noextract --outdir /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/fastqc /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1_trim.fastq (10903)

null
Started analysis of run_11_pepatac_hg38_R1_trim.fastq
Approx 5% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 10% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 15% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 20% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 25% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 30% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 35% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 40% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 45% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 50% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 55% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 60% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 65% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 70% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 75% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 80% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 85% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 90% complete for run_11_pepatac_hg38_R1_trim.fastq
Approx 95% complete for run_11_pepatac_hg38_R1_trim.fastq
Analysis complete for run_11_pepatac_hg38_R1_trim.fastq

Command completed. Elapsed time: 0:02:47. Running peak memory: 0.504GB.
PID: 10903; Command: fastqc; Return code: 0; Memory used: 0.504GB

FastQC report r1 ../run_11_pepatac_hg38_R1_trim_fastqc.html FastQC report r1 None PEPATAC RES

pipestat_modified_time 2023-12-27 17:32:08 RES

Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R2_trim_fastqc.html

fastqc --noextract --outdir /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/fastqc /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R2_trim.fastq (11617)

null
Started analysis of run_11_pepatac_hg38_R2_trim.fastq
Approx 5% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 10% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 15% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 20% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 25% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 30% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 35% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 40% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 45% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 50% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 55% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 60% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 65% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 70% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 75% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 80% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 85% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 90% complete for run_11_pepatac_hg38_R2_trim.fastq
Approx 95% complete for run_11_pepatac_hg38_R2_trim.fastq
Analysis complete for run_11_pepatac_hg38_R2_trim.fastq

Command completed. Elapsed time: 0:03:14. Running peak memory: 0.514GB.
PID: 11617; Command: fastqc; Return code: 0; Memory used: 0.514GB

FastQC report r2 ../run_11_pepatac_hg38_R2_trim_fastqc.html FastQC report r2 None PEPATAC RES

pipestat_modified_time 2023-12-27 17:35:23 RES

Prealignments (12-27 17:35:23) elapsed: 3378.0 TIME

Map to rCRSd (12-27 17:35:23) elapsed: 0.0 TIME

Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/prealignments/rCRSd_bt2

mkfifo /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/prealignments/rCRSd_bt2 (15655)


Command completed. Elapsed time: 0:00:00. Running peak memory: 0.514GB.
PID: 15655; Command: mkfifo; Return code: 0; Memory used: 0.001GB

Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_bt_aln_summary.log,/wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R2.fq.gz

perl /wynton/home/alexanian/veexu/env_py3.10b/pepatac/tools/filter_paired_fq.pl /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/prealignments/rCRSd_bt2 /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1_trim.fastq /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R2_trim.fastq /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R1.fq /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R2.fq (15656)


Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_bt_aln_summary.log,/wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R2.fq.gz

(bowtie2 -p 1 -k 1 -D 20 -R 3 -N 1 -L 20 -i S,1,0.50 -x /wynton/home/alexanian/veexu/env_py3.10b/genome_folder/alias/rCRSd/bowtie2_index/default/rCRSd --rg-id /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38 -U /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_R1_trim.fastq --un /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/prealignments/rCRSd_bt2 > /dev/null) 2>/wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_bt_aln_summary.log (15657)

not gzipping output

Command completed. Elapsed time: 0:13:17. Running peak memory: 0.514GB.
PID: 15657; Command: bowtie2; Return code: 0; Memory used: 0.061GB

grep 'aligned exactly 1 time' /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_bt_aln_summary.log | awk '{print $1}'

Aligned_reads_rCRSd 8063222.0 RES

pipestat_modified_time 2023-12-27 17:48:41 RES

Alignment_rate_rCRSd 14.99 RES

pipestat_modified_time 2023-12-27 17:48:41 RES

Compress all unmapped read files (12-27 17:48:41) elapsed: 798.0 TIME

4031611 reads skipped
0 reads lost
Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R1.fq.gz

gzip -f /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R1.fq (19889)


Command completed. Elapsed time: 0:10:34. Running peak memory: 0.514GB.
PID: 19889; Command: gzip; Return code: 0; Memory used: 0.002GB

Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R2.fq.gz

gzip -f /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R2.fq (25396)


Command completed. Elapsed time: 0:11:17. Running peak memory: 0.514GB.
PID: 25396; Command: gzip; Return code: 0; Memory used: 0.002GB

Map to genome (12-27 18:10:40) elapsed: 1319.0 TIME

Target to produce: /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_sort_dedup.bam

bowtie2 -p 1 --very-sensitive -X 2000 --rg-id /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38 -x /wynton/home/alexanian/veexu/env_py3.10b/genome_folder/alias/hg38/bowtie2_index/default/hg38 -1 /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R1.fq.gz -2 /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_rCRSd_unmap_R2.fq.gz | samtools view -bS - -@ 1 | samtools sort - -@ 1 -T /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38/aligned_hg38/tmp0g2ho1rj -o /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_temp.bam (29705,29733,29736)

22869765 reads; of these:
22869765 (100.00%) were paired; of these:
1001231 (4.38%) aligned concordantly 0 times
18247424 (79.79%) aligned concordantly exactly 1 time
3621110 (15.83%) aligned concordantly >1 times
----
1001231 pairs aligned concordantly 0 times; of these:
8258 (0.82%) aligned discordantly 1 time
----
992973 pairs aligned 0 times concordantly or discordantly; of these:
1985946 mates make up the pairs; of these:
1595590 (80.34%) aligned 0 times
252703 (12.72%) aligned exactly 1 time
137653 (6.93%) aligned >1 times
96.51% overall alignment rate
[bam_sort_core] merging from 22 files and 1 in-memory blocks...

Command completed. Elapsed time: 6:45:54. Running peak memory: 3.523GB.
PID: 29705; Command: bowtie2; Return code: 0; Memory used: 3.523GB
PID: 29733; Command: samtools; Return code: 0; Memory used: 0.01GB
PID: 29736; Command: samtools; Return code: 0; Memory used: 0.913GB

samtools view -b -q 10 -@ 1 -U /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_fail_qc.bam -f 2 /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_temp.bam > /wynton/scratch/veexu/az_ATAC_hLung_20231115/run_11_pepatac_hg38_sort.bam (236463)


Command completed. Elapsed time: 0:06:56. Running peak memory: 3.523GB.
PID: 236463; Command: samtools; Return code: 0; Memory used: 0.013GB

Missing stat 'Raw_reads'
Traceback (most recent call last):
  File "/wynton/home/alexanian/veexu/env_py3.10b/pepatac/pipelines/pepatac.py", line 2788, in <module>
    sys.exit(main())
  File "/wynton/home/alexanian/veexu/env_py3.10b/pepatac/pipelines/pepatac.py", line 1114, in main
    pm.run([cmd, cmd2], rmdup_bam, follow=check_alignment_genome)
  File "/wynton/home/alexanian/veexu/env_py3.10b/lib/python3.10/site-packages/pypiper/manager.py", line 1090, in run
    call_follow()
  File "/wynton/home/alexanian/veexu/env_py3.10b/lib/python3.10/site-packages/pypiper/manager.py", line 944, in call_follow
    follow()
  File "/wynton/home/alexanian/veexu/env_py3.10b/pepatac/pipelines/pepatac.py", line 1103, in check_alignment_genome
    rr = float(pm.get_stat("Raw_reads"))
TypeError: float() argument must be a string or a real number, not 'NoneType'
Child process 15656 (perl) was already terminated.
Starting cleanup: 3 files; 3 conditional files for cleanup

Cleaning up flagged intermediate files. . .

Conditional flag found: []

These conditional files were left in place:

Pipeline failed at: (12-28 01:05:07) elapsed: 24867.0 TIME

Total time: 8:26:03
Failure reason: Pipeline failure. See details above.
Exception ignored in atexit callback: <bound method PipelineManager._exit_handler of <pypiper.manager.PipelineManager object at 0x1467b8e49930>>
Traceback (most recent call last):
  File "/wynton/home/alexanian/veexu/env_py3.10b/lib/python3.10/site-packages/pypiper/manager.py", line 2218, in _exit_handler
    self.fail_pipeline(Exception("Pipeline failure. See details above."))
  File "/wynton/home/alexanian/veexu/env_py3.10b/lib/python3.10/site-packages/pypiper/manager.py", line 2062, in fail_pipeline
    raise exc
Exception: Pipeline failure. See details above.
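One detail in the log above may be worth checking: the `sys.path` list printed near the top contains two `piper` eggs, 0.13.2 and 0.14.0a1, with the older one listed first. Whether that caused the failure is not established in this thread. The sketch below only shows one way to spot such duplicates; the `duplicate_eggs` helper and its regular expression are assumptions written for this example, and the sample paths are copied from the log.

```python
import re
from collections import defaultdict

# Matches a path component such as "piper-0.14.0a1-py3.10.egg" at the end of a path.
EGG_RE = re.compile(r"(?:^|/)([A-Za-z0-9_.]+)-(\d[^-/]*)-py[\d.]+\.egg$")


def duplicate_eggs(paths):
    """Return {distribution: [versions]} for eggs that appear more than once."""
    found = defaultdict(list)
    for path in paths:
        match = EGG_RE.search(path)
        if match:
            found[match.group(1)].append(match.group(2))
    return {name: versions for name, versions in found.items() if len(versions) > 1}


# The three egg entries from the sys.path list in the log.
site = "/wynton/home/alexanian/veexu/env_py3.10b/lib/python3.10/site-packages"
sample_paths = [
    site,
    site + "/piper-0.13.2-py3.10.egg",
    site + "/piper-0.14.0a1-py3.10.egg",
    site + "/pipestat-0.6.0a8-py3.10.egg",
]

duplicates = duplicate_eggs(sample_paths)
print(duplicates)
```

In a live environment the same function can be called as `duplicate_eggs(sys.path)` after `import sys`.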

donaldcampbelljr commented 6 months ago

Hello @Herais,

Thank you for letting us know the alpha release is not working for you.

We released new versions of PEPATAC (v0.11.0), Pypiper (0.14.0), Pipestat (0.6.0) and Looper (1.6.0) on Dec 22nd. I recommend doing a reinstall of those specific packages and trying again with the newest PEPATAC release.
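After reinstalling, a quick check of what is actually installed can confirm the upgrade took effect. The sketch below uses the standard-library `importlib.metadata`. The distribution names are assumptions: `piper` is taken from the egg name in the log above (the package is imported as `pypiper`), while `pipestat` and `looper` are assumed to match their project names. PEPATAC itself is left out because in this thread it is run from a cloned directory rather than as an installed package. The comparison is an exact string match, not a semantic version comparison.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions named in the comment above; distribution names are assumptions.
RECOMMENDED = {"piper": "0.14.0", "pipestat": "0.6.0", "looper": "1.6.0"}


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


def mismatches(installed, recommended):
    """Return {name: (installed, recommended)} where the two strings differ."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in recommended.items()
        if installed.get(name) != wanted
    }


if __name__ == "__main__":
    current = installed_versions(RECOMMENDED)
    for name, (have, want) in mismatches(current, RECOMMENDED).items():
        print(f"{name}: installed {have}, recommended {want}")
```

An empty report means every listed distribution matches the recommended version string exactly.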

Herais commented 6 months ago

Just to report that the pipeline worked great (running pepatac in native mode) after the suggested upgrades. Thanks!

donaldcampbelljr commented 5 months ago

Excellent. I will go ahead and close this issue.