ndierckx / NOVOPlasty

NOVOPlasty - The organelle assembler and heteroplasmy caller

Issue with input reads. #154

Closed athulmenon closed 3 years ago

athulmenon commented 3 years ago

Hi,

Thanks for this great tool.

I was running a mitochondrial assembly with NOVOPlasty. I first tried paired reads and got the error asking me to contact you. I then ran the same FASTQ files after interleaving them. Now the error says "Combined files not supported, please use Forward and reverse reads". I have tried raw reads, trimmed reads, and adaptor-only trimmed reads, but I get the same error with all of these datasets. I have run NOVOPlasty with another set of data and it worked like a charm!

Below are the headers of my interleaved fastq file.

@E00592:284:HCK7VCCX2:2:1101:9404:1502 1:N:0:NTCACGAT+NGANCTCG NTCACGAT
@E00592:284:HCK7VCCX2:2:1101:9404:1502 2:N:0:NTCACGAT+NGANCTCG NTCACGAT
@E00592:284:HCK7VCCX2:2:1101:9465:1502 1:N:0:NTCACGAT+NGANCTCG NTCACGAT
@E00592:284:HCK7VCCX2:2:1101:9465:1502 2:N:0:NTCACGAT+NGANCTCG NTCACGAT

My config file looks like this.

Project:

Project name = F_eq
Type = mito
Genome Range = 20000-50000
K-mer = 39
Max memory =
Extended log =
Save assembled reads =
Seed Input = MtDNA1.fasta
Reference sequence =
Variance detection =
Chloroplast sequence =

Dataset 1:

Read Length = 151
Insert size = 300
Platform = illumina
Single/Paired = PE
Combined reads = interleaved.fastq
Forward reads =
Reverse reads =

Heteroplasmy:

MAF =
HP exclude list =
PCR-free =

Optional:

Insert size auto = yes
Insert Range =
Insert Range strict =
Use Quality Scores =

Please let me know how to fix this. Thanks in advance. Athul

ndierckx commented 3 years ago

Hi,

So your combined read file always has the two reads of a pair next to each other? I will have a look at whether I can make NOVOPlasty recognise this format, because I have never encountered it before. How come your reads are ordered like that?

In the meantime, you could already do a quick run with the option "Single/Paired = SE".

athulmenon commented 3 years ago

Hi,

Thank you for the suggestion. It ran with the SE option. I used the BBMap tool to interleave read 1 and read 2 into this format.

ndierckx commented 3 years ago

If SE didn't work, I would advise separating the reads into two files. If you know a bioinformatician who can do it for you, it only takes a tiny script.
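
For anyone landing here with the same problem, the "tiny script" could look roughly like the Python sketch below. It is not part of NOVOPlasty; the function names and the output file names are made up for illustration. It assumes plain four-line FASTQ records (no wrapped sequences) and strict interleaving, where every read 1 is directly followed by its mate with the same read name, as in the headers posted above. It raises an error on unpaired reads rather than guessing.

```python
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as lists of four lines (header, sequence, '+', quality)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) < 4 or not record[0].startswith("@"):
            raise ValueError("malformed FASTQ record: %r" % record[:1])
        yield record


def read_id(header):
    """Return the read name without the '1:N:0:...' comment or a '/1', '/2' suffix."""
    name = header.split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def split_interleaved(src, fwd, rev):
    """Write alternating records from src to fwd and rev; return the pair count."""
    records = read_fastq(src)
    pairs = 0
    for first in records:
        second = next(records, None)  # the mate must be the very next record
        if second is None or read_id(first[0]) != read_id(second[0]):
            raise ValueError("no mate directly after " + first[0].strip())
        fwd.writelines(first)
        rev.writelines(second)
        pairs += 1
    return pairs


# Example use (file names are placeholders):
# with open("interleaved.fastq") as src, \
#         open("reads_1.fastq", "w") as fwd, \
#         open("reads_2.fastq", "w") as rev:
#     print(split_interleaved(src, fwd, rev), "pairs written")
```

The two output files can then presumably be entered under "Forward reads" and "Reverse reads" in the config, with "Combined reads" left empty. A file that has unpaired reads appended at the end would need those reads removed or written to a separate file first.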

travc commented 2 years ago

@ndierckx Supporting interleaved reads like that (with unpaired reads at the end) would be a very nice feature. That format really has a lot of benefits and is pretty standard. bwa even takes it as an input.