Closed by evandiego83 6 months ago
Dear @evandiego83,
1) If you specify more than 2 groups, contrast-fel runs
1.1 All pairwise tests
1.2 An "omnibus test", i.e. a test with β1 = β2 = ... = βN as the null vs. the alternative where all β are estimated independently.
For example, using the data I attach in the example, you can run
hyphy contrast-fel --alignment COX3hyPhy.fasta --tree Tree-annotated.nwk --code Invertebrate-mtDNA --branch-set birds --branch-set mammals --branch-set Leucocytozoon --branch-set Haemoproteidae
which will report the following
### **7** tests will be performed at each site
...
| Codon | alpha | beta | substitutions | test |LRT p-value|Permutation p-value|
|:------:|:--------------:|:----------------------------:|:----------------------------:|:--------------------------------------:|:---------:|:-----------------:|
| 6 | 1.549 | 0.000 - 0.875 | 3, 3, 5, 5 | overall | 0.0147 | 0.5000 |
| 6 | 1.549 | 0.590 : 0.584 | 3, 3 | Haemoproteidae vs birds | 0.0395 | 0.5000 |
| 6 | 1.549 | 0.584 : 0.095 | 3, 5 | birds vs Leucocytozoon | 0.0052 | 0.5000 |
| 6 | 1.549 | 0.000 : 0.095 | 5, 5 | mammals vs Leucocytozoon | 0.0478 | 0.5000 |
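The count of 7 tests in the output above follows from the 4 branch sets: 6 pairwise comparisons plus the single omnibus ("overall") test. A minimal Python sketch of that bookkeeping is below. It is not HyPhy code; the function name `contrast_tests` is hypothetical, the order of the labels is illustrative only, and the assumption that exactly 2 groups yield a single pairwise test with no separate omnibus test is inferred from the description above.

```python
from itertools import combinations


def contrast_tests(groups):
    """List the per-site tests implied by a set of branch groups.

    Hypothetical helper, not part of HyPhy: every unordered pair of
    groups gets one pairwise test, and with more than 2 groups one
    omnibus ("overall") test is added.
    """
    pairwise = [f"{a} vs {b}" for a, b in combinations(groups, 2)]
    # Assumption: the omnibus test is only distinct from the pairwise
    # test when there are more than 2 groups.
    omnibus = ["overall"] if len(groups) > 2 else []
    return pairwise + omnibus


groups = ["birds", "mammals", "Leucocytozoon", "Haemoproteidae"]
tests = contrast_tests(groups)
print(len(tests))  # 6 pairwise + 1 overall
```

In general, N groups give N(N-1)/2 pairwise tests plus one omnibus test, so the number of tests per site grows quickly as groups are added.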
2) "Internal" and "Terminal" branches are built-in sets.
You can just do something like (using this dataset)
hyphy contrast-fel --alignment tests/data/HIVvif.nex --branch-set "Internal branches" --branch-set "Terminal branches"
...
### Improving branch lengths, nucleotide substitution biases, and global dN/dS ratios under a full codon model
* Log(L) = -3487.10
* non-synonymous/synonymous rate ratio for *internal* = 0.5103
* non-synonymous/synonymous rate ratio for *leaf* = 0.8196
### For partition 1 these sites are significant at p <=0.05
| Codon | alpha | beta | substitutions | test |LRT p-value|Permutation p-value|
|:------:|:--------------:|:----------------------------:|:----------------------------:|:--------------------------------------:|:---------:|:-----------------:|
| 31 | 1.346 | 6.081 : 0.000 | 7, 1 | leaf vs internal | 0.0257 | 1.0000 |
| 33 | 0.772 | 1.279 : 13.980 | 3, 6 | leaf vs internal | 0.0053 | 1.0000 |
| 92 | 1.860 | 6.041 : 0.000 | 10, 1 | leaf vs internal | 0.0359 | 1.0000 |
| 109 | 0.000 | 6.744 : 0.000 | 4, 0 | leaf vs internal | 0.0382 | 1.0000 |
| 192 | 1.282 | 0.000 : 2.773 | 0, 1 | leaf vs internal | 0.0166 | 0.3333 |
### ** Found _5_ sites with different _leaf vs internal_ dN/dS at p <= 0.05**
### False discovery rate correction
There are no sites where the overall p-value passes the False Discovery Rate threshold of 0.2
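To see why sites with LRT p-values below 0.05 can still fail a False Discovery Rate threshold of 0.2, here is a minimal sketch of a Benjamini-Hochberg style adjustment in Python. This is an assumption for illustration: the thread does not say which FDR procedure contrast-fel applies, the function name `benjamini_hochberg` is hypothetical, and the p-values in the example are made up rather than taken from the run above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    Illustrative only; assumed, not confirmed, to resemble the FDR
    correction that contrast-fel reports.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Made-up per-site p-values, one per tested site.
toy_p = [0.01, 0.04, 0.03, 0.5]
toy_q = benjamini_hochberg(toy_p)
passing = [i for i, q in enumerate(toy_q) if q <= 0.2]
print(toy_q, passing)
```

Under this assumption, each p-value is scaled by the number of sites tested divided by its rank, so in an alignment with a couple of hundred codons even a p-value near 0.005 can end up with a q-value above 0.2.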
Best, Sergei
Thanks @spond for this quick response. This seems to be running now with those commands!
However, if only 1 branch is unlabeled and every remaining branch is assigned to one of the genotypes, does having a single background branch affect the analysis? Would it reduce statistical power? Hope that makes sense.
Many thanks! Evan
Dear @evandiego83,
Unlabeled branches (for contrast-fel) are "nuisance"; they won't contribute/detract much from the analyses. Power comes from having more branches/data in the test groups.
Best, Sergei
Thanks!
Hello Sergei and hyphy colleagues,
Thank you for such a great piece of software and for continually improving it.
I wish to compare the selective pressures between four different viral genotypes and want to use contrast-fel to test for differences among the branches of all 4 groups. Is contrast-fel restricted to comparing two groups, or is there a way to automatically run the pairwise comparisons of all four groups in contrast-fel? If so, how would this be done?
Also, what would be the best way to estimate dN/dS for internal vs. external branches? Would it be best to select each of these as test and reference branches and run FEL, or to use some other method?
Many thanks in advance.
Evan