m-orton / Evolutionary-Rates-Analysis-Pipeline

The purpose of this repository is to develop software pipelines in R that can perform large scale phylogenetics comparisons of various taxa found on the Barcode of Life Database (BOLD) API.
GNU General Public License v3.0

Model test #34

Closed: jmay29 closed this issue 7 years ago

jmay29 commented 7 years ago

I was wondering whether the function modelTest from the phangorn package is worth exploring for model selection in some parts of the pipeline. I have used it previously in a script to compare different models of DNA evolution for a phyDat object (a multiple sequence alignment, in our case) before building an ML tree. Here is a link to the documentation:

https://www.rdocumentation.org/packages/phangorn/versions/2.0.4/topics/modelTest

That said, I just ran modelTest in RStudio on my data and it crashed ( 👎 ), so I will experiment a bit more with it!

sadamowi commented 7 years ago

Hi Jacqueline,

I think that, in principle, this is a good idea, but in practice I fear many of our datasets would be too large. If you would like to try this, then I think it would have to come at the step after selecting a centroid sequence per BIN. Also, we would want to use the same model for all runs so that we can directly compare results across taxa. One potential option would be to run this on multiple groups (subsetting the very large groups first) and then apply the most common best model to all groups.
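The "most common best model" step can be sketched as a simple tally. This is only a minimal, language-neutral illustration written in Python (the pipeline itself is in R), and it assumes the best model for each group has already been obtained separately, for example from modelTest output. The group names, the model labels, and the helper name `most_common_best_model` are all hypothetical.

```python
from collections import Counter


def most_common_best_model(best_by_group):
    """Return (model, count) for the model chosen most often across groups.

    best_by_group maps a group name to the label of its best-fitting model.
    Ties are broken alphabetically by model label so the result is repeatable.
    """
    if not best_by_group:
        raise ValueError("no groups supplied")
    counts = Counter(best_by_group.values())
    # Sort by descending count, then by model label, and take the first entry.
    model, count = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
    return model, count


# Hypothetical per-group results (made-up groups and model labels).
best_by_group = {
    "group_A": "GTR+G+I",
    "group_B": "HKY+G",
    "group_C": "GTR+G+I",
    "group_D": "GTR+G+I",
}

consensus_model, n_groups = most_common_best_model(best_by_group)
print(f"Use {consensus_model} for all groups (best in {n_groups} of {len(best_by_group)})")
```

The alphabetical tie-break is an arbitrary choice made only for reproducibility; in practice a tie would probably call for a closer look at the fit statistics of the tied models.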

However, if you can't solve the crash issue, then I'd suggest that this be left aside from the current project and potentially be a component that you consider incorporating for your MSc.

Best wishes, Sally

jmay29 commented 7 years ago

Will do - I will keep on testing it out in the code and let you know if any progress is made! :)

sadamowi commented 7 years ago

I'd like to suggest closing this issue. Jacqueline, perhaps you might make a separate note of this as something to consider for your own work. Certainly, this is something you would want to consider for your phylogenetic pipeline.

Best wishes, Sally

jmay29 commented 7 years ago

Sure thing! Thanks Sally