Mkchouk opened 7 years ago
As the README.md says, you first need to convert your input FastA file(s) to a dazzler database using fasta2DB or fasta2DAM from DAZZ_DB. Use something like
fasta2DAM reads.dam reads.fasta
DBsplit -s256 -x1000 reads.dam
Then run daligner on this database to produce alignments in the LAS format. Try
HPC.daligner reads.dam | bash
You need to have the daligner directory in your PATH for this to work.
Afterwards you can run
src/daccord reads.las reads.dam >reads_daccord.fasta
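As a rough sketch, the database and alignment steps above could be wrapped in a small shell script that first checks that the tools are on the PATH. The function names `missing_tools` and `run_pipeline` are made up for illustration; only the commands and options quoted in this thread are used, and the script assumes nothing else about the tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every named tool that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical wrapper: build the database and compute the alignments.
#   $1 = input FastA file, $2 = prefix for the database (e.g. "reads")
run_pipeline() {
  local fasta=$1 prefix=$2 missing
  missing=$(missing_tools fasta2DAM DBsplit HPC.daligner daligner)
  if [ -n "$missing" ]; then
    echo "not on PATH: $missing" >&2
    return 1
  fi
  fasta2DAM "$prefix.dam" "$fasta" || return 1        # FastA -> dazzler database
  DBsplit -s256 -x1000 "$prefix.dam" || return 1     # split into blocks
  HPC.daligner "$prefix.dam" | bash || return 1      # produce the LAS file(s)
}

# Example call (not executed here):
#   run_pipeline reads.fasta reads
```

Once the LAS file(s) exist, daccord is run on them as shown above.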
Thank you for your response. So I must install DAZZ_DB (the Dazzler database tools) and DALIGNER to produce the LAS alignment files? Thank you.
Yes, you need DAZZ_DB and DALIGNER.
Running HPC.daligner reads.dam | bash generated three .las files.
Do I run daccord on every file to obtain the corrected reads?
src/daccord reads.1.las reads.dam >reads_daccord.1.fasta
src/daccord reads.2.las reads.dam >reads_daccord.2.fasta
src/daccord reads.3.las reads.dam >reads_daccord.3.fasta
Thank you for your response.
Yes. Each of the LAS files contains alignments for a block of reads. daccord will output corrected reads for the reads it finds in these files.
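With more than a few blocks, typing one command per LAS file gets tedious. A minimal sketch, assuming the files follow the reads.N.las naming shown above and contain no whitespace in their names: the hypothetical function `daccord_commands` prints one daccord command per block, which can then be piped to bash in the same way as the HPC.daligner output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one daccord command per LAS block file.
#   $1 = database (e.g. reads.dam), remaining arguments = LAS files
daccord_commands() {
  local db=$1 las stem block prefix
  shift
  for las in "$@"; do
    stem=${las%.las}       # reads.1.las -> reads.1
    block=${stem##*.}      # reads.1     -> 1
    prefix=${stem%.*}      # reads.1     -> reads
    printf 'src/daccord %s %s >%s_daccord.%s.fasta\n' \
      "$las" "$db" "$prefix" "$block"
  done
}

# Example: inspect the commands first, then run them.
#   daccord_commands reads.dam reads.*.las
#   daccord_commands reads.dam reads.*.las | bash
```

Printing the commands rather than executing them directly makes it easy to check the output file names before anything is overwritten.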
Is it possible to clarify the tuning algorithm? There are many mathematical equations, and since I am not a mathematician I have a hard time understanding the algorithm in general. Thank you.
The program should be just fine with its default settings. If you are looking at a repetitive genome, you may consider setting the -D parameter to twice the average sequencing depth, i.e. use -D60 for a sequencing depth of 30. This will only load the (up to) 60 "best" alignments for each read. You may also consider running
computeintrinsicqv -d30 reads.db reads.las
lasfilteralignments reads.db reads.las
which will create reads_filtered.las. Here the -d switch needs to be set to the average sequencing depth. Afterwards pass reads_filtered.las to daccord.
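To make the arithmetic concrete, here is a small sketch that derives both switches from a single average depth value and prints the resulting commands. The function name `filtered_correction_commands` and the output file name reads_daccord.fasta are chosen for illustration. Combining the filtering step with -D in one run, and placing -D before the file arguments, are assumptions; the two suggestions above can also be used separately.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the filtering and correction commands
# for a given average sequencing depth ($1).
filtered_correction_commands() {
  local depth=$1
  local max_alignments=$((2 * depth))   # -D = twice the average depth
  printf 'computeintrinsicqv -d%s reads.db reads.las\n' "$depth"
  printf 'lasfilteralignments reads.db reads.las\n'
  # Assumption: -D is given before the positional arguments.
  printf 'src/daccord -D%s reads_filtered.las reads.db >reads_daccord.fasta\n' \
    "$max_alignments"
}

# Example for a sequencing depth of 30:
#   filtered_correction_commands 30
```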
No, that is not my question. My question is about the daccord algorithm itself: how does daccord perform long-read error correction? Thanks.
Is there any particular detail you're interested in? Any particular issues with understanding the paper?
Thank you for your reply. Yes, I did not understand how daccord corrects long reads. I want to understand the daccord algorithm. Can you explain the algorithm in detail, please? Thank you.
Any response, please? Thanks.
Is the software only for PacBio data?
daligner: Block test.2 contains reads < 14bp long ! Run DBsplit
What's wrong?
daccord should work for any kind of data that loosely follows its error model of randomly occurring errors.
Hello, can you show us how to correct long reads using daccord? Do I just run
./src/daccord reads.las reads.dam
? And how do I generate reads.las and reads.dam? Thanks.