Nextomics / NextDenovo

Fast and accurate de novo assembler for long reads
GNU General Public License v3.0

Empty result when using PacBio HiFi reads #111

Closed Axolotl233 closed 3 years ago

Axolotl233 commented 3 years ago

Hi,

Recently I tested NextDenovo v2.4.0 with Fragaria x ananassa HiFi reads from the NCBI SRA database (https://www.ncbi.nlm.nih.gov/sra/SRR11606867). The program ended with an empty result, but no error was reported to STDERR or the log file.

I checked the work directory and found that the output files cns.filt.dovt.ovl and cns.filt.dovt.ovl.bl in 02.cns_align.sh.work/cns_align* were empty, so I think this is the primary reason for the empty final result.
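For anyone who wants to repeat this check, here is a minimal sketch of how zero-byte overlap files could be listed. The function name `find_empty_files` and the default glob pattern are my own assumptions for illustration; this is not part of NextDenovo.

```python
from pathlib import Path


def find_empty_files(workdir, pattern="cns.filt.dovt.ovl*"):
    """Return a sorted list of zero-byte files under workdir that match pattern.

    The default pattern is an assumption based on the file names mentioned
    in this thread; adjust it to whatever your run produces.
    """
    root = Path(workdir)
    return sorted(
        p for p in root.rglob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )
```

Calling `find_empty_files("./01.assembly")` would then list every matching empty file under the work directory configured below, which makes it easier to see which subjob produced nothing.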

Also, I completed an assembly with hifiasm using the same data, so I don't think the error is caused by an incorrect read type.

My config file is shown below and the log file is attached. Please tell me what's wrong and how I can solve it. Thank you!

[General]
job_type = local
job_prefix = nextDenovo
task = all # 'all', 'correct', 'assemble'
rewrite = yes # yes/no

deltmp = yes
rerun = 10
parallel_jobs = 4
input_type = raw
read_type = hifi
input_fofn = ./input.fofn
workdir = ./01.assembly

[correct_option]
read_cutoff = 5k
genome_size = 830m
seed_depth = 30
pa_correction = 8
sort_options = -m 5g -t 10
minimap2_options_raw =  -t 10
correction_options = -p 10

[assemble_option]
minimap2_options_cns =  -t 10
nextgraph_options = -a 1

pid2184948.log.txt

moold commented 3 years ago

Hi, could you paste the content of the file /data/01/user112/project/z.other/AssembleTest/Hifi_test/nextdenovo/01.assembly/02.cns_align/02.cns_align.sh.work/cns_align00/nextDenovo.sh.e here? I wonder why its result is empty.

Axolotl233 commented 3 years ago

Hi, the file you asked for is attached. Please check it.

nextDenovo.sh.e.txt

moold commented 3 years ago

It seems the input file does not contain any valid sequences, so could you check the output and log files for the subjob /data/01/user112/project/z.other/AssembleTest/Hifi_test/nextdenovo/01.assembly/02.cns_align/01.split_seed.sh.work/split_seed0/nextDenovo.sh?

Axolotl233 commented 3 years ago

My input is in FASTQ format, and the output of /data/01/user112/project/z.other/AssembleTest/Hifi_test/nextdenovo/01.assembly/02.cns_align/01.split_seed.sh.work/split_seed0/nextDenovo.sh looks like this:

>8 1546896 1.000000 pid=9D5:@.RKSKZZe6if@t\ZW_b9f\IbYM\Il,f\Xz_EV9kUgIMR1tiA\shSKMU^O1wm\UA+)=]V9kfeTB-GYWO8WGFPVW+]UBUDbUUuYZhOzNsgcd[W$^D@5AQ[iXfHohREM5;YQV;yNMSMfNQ> @SRR11606867.1.1730 1730 length=21991ATGTTTGAGGAATATACAGCATGATCTAGTTAATCATTGCCCGATGTCACATTTGAAGACATGTTAAAGAGCATTGACATACTAAGTTATTGAAACTTCTATGATTAGTAAAAATCAGGAAGAGCCTTATGATT

It seems abnormal. I will convert the reads to FASTA and rerun; I hope this solves the problem.

Axolotl233 commented 3 years ago

Hi, when I converted the original reads from FASTQ to FASTA, the program ran normally and produced results. So I think this is a small bug in the input-splitting step: HiFi reads in FASTQ format are not recognized correctly because they do not go through the correction step.
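For reference, the FASTQ to FASTA workaround can be sketched roughly as follows. This is only an illustration under stated assumptions: it expects plain four-line FASTQ records (no wrapped sequence lines), and the helper names `open_text` and `fastq_to_fasta` are hypothetical, not anything shipped with NextDenovo. Any standard conversion tool should work just as well.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def fastq_to_fasta(fastq_path, fasta_path):
    """Convert four-line FASTQ records to FASTA and return the record count.

    Assumes each record is exactly: @header, sequence, +, quality.
    """
    count = 0
    with open_text(fastq_path) as fin, open(fasta_path, "w") as fout:
        while True:
            header = fin.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate blank lines between records
            seq = fin.readline().rstrip("\n")
            plus = fin.readline()
            qual = fin.readline().rstrip("\n")
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record near read {count + 1}")
            if len(seq) != len(qual):
                raise ValueError(f"sequence/quality length mismatch in read {count + 1}")
            # Drop the leading '@' and write the header in FASTA style.
            fout.write(">" + header[1:].rstrip("\n") + "\n" + seq + "\n")
            count += 1
    return count
```

After conversion, input.fofn would point at the FASTA file instead of the original FASTQ file.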

moold commented 3 years ago

Yes, I will fix it in the next release.