veg / hyphy-analyses

HyPhy standalone analyses
MIT License
38 stars 17 forks source link

Calculate global kappa #56

Open sophie-03 opened 1 week ago

sophie-03 commented 1 week ago

I am trying to obtain the global kappa value, I have found that I can do this in the command line using the interactive tool by selecting the following:

(5) Basic Analyses

(1) Analyse codon data with a variery of standard models using given tree.

(1):[Universal] Universal code. (Genebank transl_table=1).

GY94

(2):[Global] Model parameters are shared by all branches, branch lengths are estimated independently.

(2):[Proportional to input tree] Branch lengths are proportional to those in input tree

However, I have many alignments that I would like to calculate the global kappa for, therefore it is impractical to use the interactive tool for all of these. Is there a way I could automate this process? Or a different method I could use to obtain this value (and the global dn/ds value)?

For my analysis I only need the global kappa and global dn/ds, I have looked through the different analysis options here but as I only want these two values, I can't find the correct method to use.

Thanks in advance.

spond commented 1 week ago

Dear @sophie-03,

I would suggest something like

hyphy acd Universal /Users/sergei/Development/hyphy/tests/data/bglobin.nex MG94CUSTOMCF3X4 Global 010010 Y Estimate | awk '/^AC=/ {split($0, arr, "="); print "kappa="arr[2];};/^R/ {split($0, arr, "="); print "dN/dS="arr[2];} > bglobin.txt

The resulting .txt file will have

kappa=0.4921782859335548;
dN/dS=0.2705202781207747;

Change /Users/sergei/Development/hyphy/tests/data/bglobin.nex to the actual alignment, and the resulting '>' to wherever you need it to go.

I selected the models I would recommend (MG94CUSTOMCF3X4 and 010010, to specify the HKY85 component) but you can use GY94 as well.

Best, Sergei

sophie-03 commented 1 week ago

Hi Sergei,

Thank you so much for your quick response. I have tried the line of code you suggested but am getting some errors. This is the code I used: hyphy acd Universal /data4/smatthews/modelling_coevolution/branch_specific_omega/gene_alignments/10005_NT_AL.fasta MG94CUSTOMCF3X4 Global 010010 Y Estimate | awk '/^AC=/ {split($0, arr, "="); print "kappa="arr[2];};/^R/ {split($0, arr, "="); print "dN/dS="arr[2];}' > 10005.txt

and this is the error I am getting: Error:'/data4/smatthews/modelling_coevolution/branch_specific_omega/Y' could not be opened for reading by fscanf. Path stack: /home/smatthews/.conda/envs/model/share/hyphy/ /home/smatthews/.conda/envs/model/share/hyphy/TemplateBatchFiles/ in call to fscanf(PROMPT_FOR_FILE,"RawREWIND, ",treeString, ); '/data4/smatthews/modelling_coevolution/branch_specific_omega/Y' could not be opened for reading by fscanf. Path stack: /home/smatthews/.conda/envs/model/share/hyphy/ /home/smatthews/.conda/envs/model/share/hyphy/TemplateBatchFiles/ in call to fscanf(PROMPT_FOR_FILE,"RawREWIND, ",treeString, );

I also have the phylogenetic trees for my alignments, is there a way to also use this as an input?

spond commented 6 days ago

Dear @sophie-03,

Right, I assumed the tree was in the file. Replace 'Y' in the command line with the path to your tree string.

Best, Sergei