mwpennell / geiger-v2

A suite of methods and models for studying evolutionary radiations

dtt with discrete characters error #31

Closed IanGBrennan closed 4 years ago

IanGBrennan commented 7 years ago

Hey there,

functions 'dtt' and 'disparity' seem to both throw the same error when presented with discrete characters using the 'num.states' index.

version: geiger 2.0.6
error: Error in apply(data, 2, f) : dim(X) must have a positive length

steps to reproduce:

tmp=get(data(geospiza))
td=treedata(tmp$phy, tmp$dat)
geo=list(phy=td$phy, dat=td$data)
gb=round(geo$dat[,5]) ## create discrete data
names(gb)=rownames(geo$dat)
dtt(tmp$phy, gb, index="num.states")

appreciate any help you can lend here. Cheers.
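
For context, the error message itself can be reproduced in base R without geiger. The sketch below assumes the failure comes from calling apply() over the columns of a bare named vector, which has no dim attribute; the toy values and the helper name count_states are hypothetical and are not taken from the package source. Whether passing a one-column matrix avoids the error in geiger 2.0.6 is not confirmed anywhere in this thread.

```r
# Toy named vector of discrete states (hypothetical values, not the geospiza data).
states <- c(sp_a = 1, sp_b = 2, sp_c = 2, sp_d = 3)

# Hypothetical stand-in for a "number of states" index: count distinct values.
count_states <- function(x) length(unique(x))

# A bare vector has no dim attribute, so apply() over columns stops with an error.
msg <- tryCatch(
  apply(states, 2, count_states),
  error = function(e) conditionMessage(e)
)
print(msg)

# The same values stored as a one-column matrix do have dimensions,
# so apply() over columns succeeds.
states_mat <- matrix(states, ncol = 1,
                     dimnames = list(names(states), "trait"))
n_states <- apply(states_mat, 2, count_states)
print(n_states)
```

If this is indeed the cause, supplying the character as a one-column matrix with taxon names as row names may be worth trying as a workaround on older versions.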

Suissajacob commented 4 years ago

Has this been addressed? I am receiving the same error using similar code.

lukejharmon commented 4 years ago

I've spent a little time with this issue this morning, and have the following to report. First, the function now works for the example given, as far as I can tell. Second, I have disabled both simulations and MDI for discrete characters. This is because the algorithm such a procedure would require, along with its consequences and interpretation, has not yet been invented or investigated. Let me explain a bit:

For continuous characters, we can simulate data under multivariate Brownian motion to get the null distribution of dtt plots, and then use that to calculate MDI. Critically, as long as we get the trait correlations right, we should be fine - the overall scaling of the rates is not important. And, we have plenty of experience fitting multivariate Brownian motion to data.
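
One way to see why the overall scaling should not matter is that a dtt plot tracks relative disparity, that is, subclade disparity divided by the disparity of the whole group. The base R sketch below illustrates this under stated assumptions: it uses the mean squared pairwise Euclidean distance as a stand-in for an "avg.sq"-style index, random toy data rather than real traits, and a hypothetical subclade made of the first four taxa. It is not the package's own code.

```r
# Mean squared pairwise Euclidean distance among rows (taxa);
# an assumed stand-in for an "avg.sq"-style disparity index.
avg_sq_disparity <- function(x) {
  x <- as.matrix(x)
  mean(as.numeric(dist(x))^2)
}

set.seed(1)
traits <- matrix(rnorm(20), ncol = 2)  # toy data: 10 taxa, 2 continuous traits
clade  <- 1:4                          # hypothetical subclade (rows 1 to 4)

# Relative disparity: subclade disparity divided by whole-group disparity.
relative_disparity <- function(x) {
  avg_sq_disparity(x[clade, , drop = FALSE]) / avg_sq_disparity(x)
}

r_original <- relative_disparity(traits)
r_rescaled <- relative_disparity(traits * 10)  # same traits, 10 times the scale

# The two ratios agree up to floating-point error.
same_ratio <- isTRUE(all.equal(r_original, r_rescaled))
print(same_ratio)
```

Multiplying every trait by a constant multiplies both the numerator and the denominator by the same factor, so the ratio is unchanged. Changing the correlations among traits, by contrast, would alter the pairwise distances in ways that do not cancel.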

For discrete characters, the problem is much more difficult. First, we need to specify a model for all of the characters, and the choice of model is not entirely clear. Second, the null distribution of the dtt will, I think, depend heavily on both the rate of evolution and the relationships among characters, which makes this problem quite complex.

In the original paper, we applied dtt to continuous characters. We now support plotting a dtt for discrete characters, but I think enough remains unresolved that the full method (simulations and MDI) should be considered something that could be implemented with a lot of work, and can be left to the future. Or if someone wants to do it, I can help advise!