TimoLassmann / kalign

A fast multiple sequence alignment program.
GNU General Public License v3.0

--type removed from kalign3 #38

Open Rohit-Satyam opened 1 year ago

Rohit-Satyam commented 1 year ago

Dear Developer,

Now that the --type parameter has been removed, do I need to set the DNA parameters given in the documentation manually?

dna : default DNA parameters
5 match score
-4 mismatch score
-8 gap open penalty
-6 gap extension penalty
0 terminal gap extension penalty

Or should I go with the defaults?

--gpo              : Gap open penalty. [5.5]
--gpe              : Gap extension penalty. [2.0]
--tgpe             : Terminal gap extension penalty. [1.0]

If so, how do I set the match and mismatch scores?

Rohit-Satyam commented 1 year ago

I used conda to install kalign, and although the package was updated an hour ago, the installed version is still old:

Kalign (3.3.2)

Copyright (C) 2006,2019,2020,2021 Timo Lassmann

This program comes with ABSOLUTELY NO WARRANTY; for details type:
`kalign -showw'.
This is free software, and you are welcome to redistribute it
under certain conditions; consult the COPYING file for details.

Please cite:
  Lassmann, Timo.
  "Kalign 3: multiple sequence alignment of large data sets."
  Bioinformatics (2019) 
  https://doi.org/10.1093/bioinformatics/btz795

Rohit-Satyam commented 1 year ago

I have compiled kalign locally for the time being, until the conda package gets updated.

Rohit-Satyam commented 1 year ago

I see that kalign's log information is also written to the output file:

Kalign (3.3.5)

Copyright (C) 2006,2019,2020,2021 Timo Lassmann

This program comes with ABSOLUTELY NO WARRANTY; for details type:
`kalign -showw'.
This is free software, and you are welcome to redistribute it
under certain conditions; consult the COPYING file for details.

Please cite:
  Lassmann, Timo.
  "Kalign 3: multiple sequence alignment of large data sets."
  Bioinformatics (2019) 
  https://doi.org/10.1093/bioinformatics/btz795

[2023-05-12 20:33:07] :     LOG : Detected DNA sequences.
[2023-05-12 20:33:07] :     LOG : Read 899 sequences from results/drep/drep.fasta.
[2023-05-12 20:33:07] :     LOG : CPU Time: 0.22u 00:00:00.22 Elapsed: 00:00:01.00
[2023-05-12 20:33:07] :     LOG : Calculating pairwise distances
[2023-05-12 20:33:08] :     LOG : CPU Time: 6.23u 00:00:06.22 Elapsed: 00:00:01.00
[2023-05-12 20:33:08] :     LOG : Building guide tree.
[2023-05-12 20:33:19] :     LOG : CPU Time: 10.69u 00:00:10.68 Elapsed: 00:00:11.00
[2023-05-12 20:33:19] :     LOG : Aligning
[2023-05-12 20:36:49] :     LOG : CPU Time: 557.09u 00:09:17.08 Elapsed: 00:03:30.00
>AY593798.1
------------------------------------------------------------
------------------------------------------------TTGAAAGGGG-G
-C---G--CTA-G-GGTCTCA-CCCCTAG---CACGCC---A--ACGACAGCTCCT-GCA
T-TGCACTCCAC-ACTTACGTCTG-TGCAC-ATGCGGGAA-CCGCTGGACTATC-GTTCA

This isn't expected, right?
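
As a stopgap, the banner and LOG lines could be stripped from the file afterwards. The following is only a minimal sketch, not part of kalign: it assumes the output is FASTA, that everything before the first line starting with ">" is banner or log text, and that no log line itself starts with ">". The function names and file paths are hypothetical.

```python
from typing import Iterable, Iterator


def strip_preamble(lines: Iterable[str]) -> Iterator[str]:
    """Yield lines from the first FASTA header (a line starting with '>') onward.

    Assumption: any banner or LOG text sits before the first record.
    """
    seen_header = False
    for line in lines:
        if not seen_header and line.startswith(">"):
            seen_header = True
        if seen_header:
            yield line


def clean_alignment_file(src_path: str, dst_path: str) -> int:
    """Copy src_path to dst_path without the preamble; return lines written."""
    written = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in strip_preamble(src):
            dst.write(line)
            written += 1
    return written
```

For example, `clean_alignment_file("aligned.fasta", "aligned.clean.fasta")` (placeholder paths) would write only the alignment records to the second file. If the input contains no ">" line at all, nothing is written.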