Open ZeweiSong opened 4 years ago
Hi Zewei,
It's great to hear from you! The -s option concerns shearing the reference sequences during database generation: it controls the internal database chunk size. It should have no effect on alignments against that database, but it may affect the database's size as well as alignment speed. Smaller shears result in better de-duplication, but in the absence of known small duplicated regions in the input sequences, it may be better to set the shear higher (e.g. 1000-4000).
Cheerio, Gabe
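To make the trade-off between shear size and de-duplication concrete, here is a toy sketch in Python. It is not BURST's actual algorithm or data layout: the functions shear and unique_chunk_bases, the fixed non-overlapping chunking, and the two example sequences are all assumptions made purely for illustration.

```python
def shear(seq, size):
    """Split seq into consecutive, non-overlapping chunks of at most `size` bases."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def unique_chunk_bases(seqs, size):
    """Total bases stored if identical chunks are kept only once (toy model)."""
    unique = {chunk for s in seqs for chunk in shear(s, size)}
    return sum(len(chunk) for chunk in unique)


# Two hypothetical references that share their first 8 bases.
refs = ["ACGTACGT" + "TTTTGGGG", "ACGTACGT" + "CCCCAAAA"]

small = unique_chunk_bases(refs, 8)   # shared 8-base chunk is stored once
large = unique_chunk_bases(refs, 16)  # each reference is one distinct chunk

print(small, large)
```

In this toy model the smaller shear stores fewer bases because the shared region collapses into a single chunk, while the larger shear sees two entirely different chunks. If the inputs had no short shared regions, the smaller shear would only add more chunks to track, which is in the spirit of the suggestion to use a higher value in that case.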
On Tue, Aug 11, 2020 at 9:11 AM Zewei Song notifications@github.com wrote:
I was wondering what -s does to the sequence. Does it shear the input into pieces of the specified length? If so, how does burst deal with the gaps between the sheared pieces?
In the example, -s is used either alone or as -s 1. It is a bit confusing to me.
Zewei
@GabeAl To follow up on @ZeweiSong's question, there is currently a line in the Readme:
- Run
burst -r MyDB.fasta -d DNA 320 -o MyDB.edx -a MyDB.acx -s 1 -i 0.97
to generate a database and accelerator.
where the option -s 1 is used. Is this a typo? Or is using -s 1 in fact recommended in some situations?