medvedevgroup / SibeliaZ

A fast whole-genome aligner based on de Bruijn graphs
http://medvedevgroup.com/

Is it suitable for very large genomes? #39

Open AlisaGU opened 2 years ago

AlisaGU commented 2 years ago

Hi, I have a large genome assembly with 46,139,523,234 bases in 20,131 contigs. Here is a summary of the ten longest contigs (name and length in bases):

ptg000004l      169819904
ptg000441l      158822330
ptg000279l      109104046
ptg000035l      107045360
ptg000669l      100503328
ptg000533l      90735505
ptg000066l      87495606
ptg000800l      85918877
ptg000855l      82319672
ptg000667l      80863498
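
A summary like the one above can be produced with a few lines of Python. This is a minimal sketch, assuming the assembly is a plain (uncompressed) FASTA file; the function names `contig_lengths` and `longest_contigs` are hypothetical and not part of SibeliaZ.

```python
def contig_lengths(handle):
    """Return {contig_name: length_in_bases} for a FASTA text stream."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence lines may be wrapped, so add up every line of the record.
            lengths[name] += len(line)
    return lengths


def longest_contigs(handle, n=10):
    """Return the n longest contigs as (name, length) pairs, longest first."""
    lengths = contig_lengths(handle)
    return sorted(lengths.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

For example, `with open("assembly.fa") as fh: print(longest_contigs(fh))` would list the ten longest contigs, where `assembly.fa` is a placeholder file name.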

I want to do a pairwise genome alignment with it and am wondering whether SibeliaZ can handle a genome this large.

iminkin commented 2 years ago

Hi,

I think SibeliaZ should be able to handle it. However, it does not produce pairwise alignments; it computes a multiple alignment instead.

debjit20504 commented 1 year ago

Hi @iminkin @dpryan79, I'm facing a similar issue while running a multiple sequence alignment of 8000 E. coli genomes with SibeliaZ (these are strains, and each of them consists of multiple contigs).

The bash script I wrote fails with an error when it tries to open more than 5000 files. The system then terminates the script, so I cannot run the multiple sequence alignment on all 8000 genomes.
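
The exact error is not shown in this thread, so the following is only a guess at a workaround. It assumes the failure comes from having too many files open at once and that a single merged FASTA file is acceptable input for the workflow. The sketch opens each genome file one at a time and prefixes every header with the file name so that sequence IDs stay unique. The function name `merge_fasta` and the `|` separator are hypothetical choices.

```python
from pathlib import Path


def merge_fasta(paths, out_path):
    """Concatenate FASTA files into out_path, keeping one input open at a time.

    Returns the number of sequence records written.
    """
    n_records = 0
    with open(out_path, "w") as out:
        for path in paths:
            # Assumption: file stems are unique across inputs, so they can
            # serve as a per-genome prefix for the sequence IDs.
            prefix = Path(path).stem
            with open(path) as src:  # closed again before the next file opens
                for line in src:
                    if line.startswith(">"):
                        out.write(f">{prefix}|{line[1:].rstrip()}\n")
                        n_records += 1
                    elif line.strip():
                        out.write(line.rstrip() + "\n")
    return n_records
```

Because only the output file and one input file are open at any moment, the number of genomes no longer affects the number of open file descriptors.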

I'm using SibeliaZ v1.2.4.

Please let me know if you would like to see my bash script or the exact error message. Thanks.