hsinnan75 / GSAlign

GSAlign: an ultra-fast sequence alignment algorithm for intra-species genome comparison
MIT License

Performance comparison between GSAlign and Minimap2 #9

Closed by bellstwohearted 3 years ago

bellstwohearted commented 3 years ago

I am running some tests based on your article. According to Table 2, for the SimHG-1X dataset, GSAlign takes about 11 minutes to finish the alignment while Minimap2 takes about 37 minutes.

In my tests, GSAlign does take about 11 minutes to finish, but when I run the same dataset with Minimap2:

(base) root@5pbbd:~/GSAlign/dataset/hg38-1# minimap2 -a hg38.fa hg38-1.mut > alignment.sam
[M::mm_idx_gen::82.569*1.76] collected minimizers
[M::mm_idx_gen::102.864*1.99] sorted minimizers
[M::main::102.864*1.99] loaded/built the index for 25 target sequence(s)
[M::mm_mapopt_update::106.137*1.96] mid_occ = 704
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 25
[M::mm_idx_stat::107.642*1.95] distinct minimizers: 100128857 (38.78% are singletons); average occurrences: 5.526; average spacing: 5.581; total length: 3088286401

The command has been running for a long time and has not finished yet. I cannot say exactly how long it will take, but it will clearly be much longer than 37 minutes. Am I missing something?

hsinnan75 commented 3 years ago

It looks like you didn't set the number of threads when you ran minimap2. I used 8 threads for all selected tools. Please add "-t 8" when you run minimap2.

bellstwohearted commented 3 years ago

Thank you for the suggestion. I tried it, but the run time is still very long.

[M::main] Version: 2.17-r974-dirty
[M::main] CMD: /root/minimap2/minimap2 -t 8 -a hg38.fa hg38-1.mut
[M::main] Real time: 26746.863 sec; CPU: 75881.807 sec; Peak RSS: 120.938 GB
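
To put the summary line above in perspective, the timings can be converted to hours and to a CPU/real ratio. This is only a minimal sketch: the variable names are illustrative, and reading CPU time divided by real time as "average number of busy cores" is a rough approximation, not something minimap2 reports directly.

```shell
# Log line copied verbatim from the minimap2 output above.
line='[M::main] Real time: 26746.863 sec; CPU: 75881.807 sec; Peak RSS: 120.938 GB'

# Pull out the two timings by their labels ("Real time:" and "CPU:").
real=$(printf '%s\n' "$line" | sed -E 's/.*Real time: ([0-9.]+) sec.*/\1/')
cpu=$(printf '%s\n' "$line" | sed -E 's/.*CPU: ([0-9.]+) sec.*/\1/')

# Wall-clock hours, plus CPU/real as a rough estimate of average core usage.
summary=$(awk -v r="$real" -v c="$cpu" \
    'BEGIN { printf "%.2f hours wall-clock, %.2fx average CPU utilization", r/3600, c/r }')
echo "$summary"
```

With "-t 8" the run still took several hours of wall-clock time, far beyond the roughly 37 minutes reported in the article, and the CPU/real ratio is well below 8.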

hsinnan75 commented 3 years ago

Oh, sorry. I forgot to mention that I also used "-ax asm10" when running minimap2. You can find the arguments I used in the supplementary data.
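
Putting the two suggestions from this thread together, the invocation would look roughly like the sketch below. The file names and the output name alignment.sam are taken from the command earlier in the thread; the exact argument list used for the article is in the supplementary data, so treat this combination as an assumption rather than the authoritative command.

```shell
# Reference and query file names as used earlier in this thread.
ref=hg38.fa
query=hg38-1.mut
threads=8

# -t sets the thread count, -a requests SAM output, and
# -x asm10 selects one of minimap2's assembly-to-reference presets.
cmd="minimap2 -t $threads -ax asm10 $ref $query"
echo "$cmd > alignment.sam"

# Run the alignment only if minimap2 is on PATH and both inputs exist;
# otherwise the script just prints the command as a dry run.
if command -v minimap2 >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$query" ]; then
    $cmd > alignment.sam
fi
```

The "-ax asm10" form replaces the bare "-a" from the original command, so the SAM output is kept while the preset changes the alignment parameters.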

bellstwohearted commented 3 years ago

Thanks~ It works as expected.