zjshi / gt-pro

MIT License

Error writing output xxx_gtpro__20190723_881species.tsv.gz cl 1067 #42

Closed yejunbin closed 2 years ago

yejunbin commented 2 years ago

The error occurs right after the line "[Stats] 1852819 snps, ...., for xx.fq.gz":

"Error writing output xxx_gtpro__20190723_881species.tsv.gz cl 1067"

zjshi commented 2 years ago

Thanks for using GT-Pro! Could you please give a few more details about your run?

yejunbin commented 2 years ago

> Thanks for using GT-Pro! Could you please give a few more details about your run?

The run command is:

sample="ADBB"

/Disk1/github/gt-pro/GT_Pro genotype -d /Disk2/database/GT-pro/20190723_881species -o ${sample}_gt_pro \
        data/${sample}/${sample}.R1.clear.fastq.gz \
        data/${sample}/${sample}.R2.clear.fastq.gz  

The log:

gt_pro /Disk2/database/GT-pro/20190723_881species 28 no_overwrite
1642347102634: [Info] Starting to load DB: /Disk2/database/GT-pro/20190723_881species
1642347102634: [Info] MMAPPING /Disk2/database/GT-pro/20190723_881species_optimized_db_snps.bin
1642347102737: [Info] MMAPPING /Disk2/database/GT-pro/20190723_881species_optimized_db_kmer_index.bin
1642347103076: [Info] Using -l 32 -m 36 as optimal for system RAM
1642347103076: [Info] MMAPPING /Disk2/database/GT-pro/20190723_881species_optimized_db_mmer_bloom_36.bin
1642347103321: [Info] MMAPPING /Disk2/database/GT-pro/20190723_881species_optimized_db_lmer_index_32.bin
1642347104626: [Info] Done with init for optimized DB with 2856121626 kmers. That took 1 seconds.
1642347104761: [Info] Waiting for all readers to quiesce
1642347107396: [Progress] 1.01 million reads scanned after 2 seconds, and 0 files output.
1642347108599: [Progress] 2.03 million reads scanned after 3 seconds, and 0 files output.
1642347109894: [Progress] 3.17 million reads scanned after 5 seconds, and 0 files output.
1642347111184: [Progress] 4.24 million reads scanned after 6 seconds, and 0 files output.
1642347112662: [Progress] 5.26 million reads scanned after 8 seconds, and 0 files output.
1642347113908: [Progress] 6.29 million reads scanned after 9 seconds, and 0 files output.
1642347115251: [Progress] 7.31 million reads scanned after 10 seconds, and 0 files output.
1642347116372: [Progress] 8.33 million reads scanned after 11 seconds, and 0 files output.
1642347117515: [Progress] 9.35 million reads scanned after 12 seconds, and 0 files output.
1642347118709: [Progress] 10.38 million reads scanned after 14 seconds, and 0 files output.
1642347119837: [Progress] 11.4 million reads scanned after 15 seconds, and 0 files output.
1642347121145: [Progress] 12.42 million reads scanned after 16 seconds, and 0 files output.
........................................................................................
1642347235712: [Progress] 104.9 million reads scanned after 131 seconds, and 0 files output.
1642347237107: [Progress] 105.9 million reads scanned after 132 seconds, and 0 files output.
1642347238325: [Progress] 106.93 million reads scanned after 133 seconds, and 0 files output.
1642347239984: [Progress] 107.97 million reads scanned after 135 seconds, and 0 files output.
1642347241856: [Progress] 108.99 million reads scanned after 137 seconds, and 0 files output.
1642347243354: [Progress] 110.01 million reads scanned after 138 seconds, and 0 files output.
1642347243712: [Done] searching is completed for the 56356227 reads input from data/ADBB/ADBB.R2.clear.fastq.gz
1642347245166: [Progress] 111.04 million reads scanned after 140 seconds, and 0 files output.
1642347246367: [Stats] 1419993 snps, 56356227 reads, 30.86 hits/snp, for data/ADBB/ADBB.R2.clear.fastq.gz
1642347246373: [ERROR] Error writing output ADBB_gt_pro.tsv.gz cl 1067
1642347247479: [Progress] 112.71 million reads scanned after 142 seconds, and 1 files output.
1642347247479: [Done] searching is completed for the 56356227 reads input from data/ADBB/ADBB.R1.clear.fastq.gz
1642347249807: [Stats] 1466355 snps, 56356227 reads, 32.43 hits/snp, for data/ADBB/ADBB.R1.clear.fastq.gz
1642347249818: [ERROR] Error writing output ADBB_gt_pro.tsv.gz cl 1067
1642347249865: 112.71 million reads were scanned after 145 seconds
Failed for ALL 2 input files.


Some of my samples do complete successfully.

yejunbin commented 2 years ago

Some of the samples that hit the error above produce results after rerunning, but not all of them; many still fail.

zjshi commented 2 years ago

It could be due to a dependency issue related to handling the compressed format. Would you please try running the program on uncompressed copies of the problematic input files? In your example, gunzip both ADBB.R1.clear.fastq.gz and ADBB.R2.clear.fastq.gz first, then run GT_Pro on ADBB.R1.clear.fastq and ADBB.R2.clear.fastq. Please let me know if the error persists. Thanks!
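
The gunzip-then-rerun step could be sketched as a small shell helper. This is only an illustration under stated assumptions: `rerun_uncompressed` is a hypothetical function name (not part of GT-Pro), `GT_PRO` and `GT_PRO_DB` are assumed override variables, and the paths and the `genotype -d ... -o ...` invocation are copied from the command posted earlier in this thread.

```shell
# Hypothetical helper (not part of GT-Pro): decompress both mates of a
# sample, then rerun the genotype command from this thread on plain FASTQ.
rerun_uncompressed() {
    sample="$1"
    # Defaults are the reporter's paths; override via GT_PRO / GT_PRO_DB.
    gt_pro="${GT_PRO:-/Disk1/github/gt-pro/GT_Pro}"
    db="${GT_PRO_DB:-/Disk2/database/GT-pro/20190723_881species}"

    for mate in R1 R2; do
        gz="data/${sample}/${sample}.${mate}.clear.fastq.gz"
        if [ ! -f "$gz" ]; then
            echo "missing input: $gz" >&2
            return 1
        fi
        # gunzip -c streams to stdout, so the original .gz file is kept.
        gunzip -c "$gz" > "${gz%.gz}" || return 1
    done

    # Same invocation as in the report, but on the uncompressed files.
    "$gt_pro" genotype -d "$db" -o "${sample}_gt_pro" \
        "data/${sample}/${sample}.R1.clear.fastq" \
        "data/${sample}/${sample}.R2.clear.fastq"
}
```

For the sample in this report, the call would be `rerun_uncompressed ADBB`. Keeping the original `.gz` files makes it easy to compare the two runs if the error turns out to be unrelated to compression.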

zjshi commented 2 years ago

Please see this issue for a possible solution: https://github.com/zjshi/gt-pro/issues/44