lamho86 / phylolm

GNU General Public License v2.0

Taxon-specific measurement error #20

Open wrshoemaker opened 5 years ago

wrshoemaker commented 5 years ago

Hello,

I'm a big fan of the package and have enjoyed using it for a number of projects.

I have a question about the optional measurement_error argument that can be passed to phylolm. The documentation suggests that the variance of the measurement error is estimated and then included in the full covariance matrix. Is it possible to supply a vector containing the measurement error variance for each taxon, so that the full covariance matrix includes taxon-specific measurement errors?

Best, Will

lamho86 commented 5 years ago

Hello Will,

The current implementation assumes that the variance of the measurement error is the same for all species, so this single quantity can be estimated from the data.

You want to supply a vector of variances of the measurement error. Does that mean you assume that they are known?

wrshoemaker commented 5 years ago

Hi Lam,

Thank you for responding. Basically, my species-level observations represent the mean of means, so for each species-level observation I have a pooled variance term that takes into account both the variation among biological replicates and the technical replicate variation within each biological replicate. My understanding is that I can supply a data frame containing the biological replicates for each species, set measurement_error = TRUE, and phylolm will estimate the error term. But I would also like to take into account the technical replicate variation for each biological replicate.

My thought is that I could pass the pooled variances as a vector. This may not be optimal, but it would allow me to take into account measurement error.

lamho86 commented 5 years ago

If I understand correctly, you would like to fit a model where the variance of species i is:

sigma^2 v_{ii} + sigma^2_err + sigma^2_rep_i

where sigma^2 v_{ii} is the variance of the model (to be estimated), sigma^2_err is the variance of the measurement errors (to be estimated), and sigma^2_rep_i is the variance of the replicates for species i (supplied by you).

If it is correct, I can add this into phylolm. It shouldn't take too long.
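
As a rough illustration of the variance model described above, the following sketch builds the full covariance matrix sigma^2 V + sigma^2_err I + diag(sigma^2_rep) in plain Python. It is not phylolm's code: the function name full_covariance, the toy matrix V, and all numeric values are hypothetical, and in a real fit sigma^2 and sigma^2_err would be estimated rather than fixed by hand.

```python
def full_covariance(V, sigma2, sigma2_err, sigma2_rep):
    """Return sigma2 * V + sigma2_err * I + diag(sigma2_rep) as nested lists.

    V          : n x n phylogenetic covariance matrix (list of lists)
    sigma2     : model variance (estimated in practice; fixed here)
    sigma2_err : shared measurement-error variance (estimated in practice)
    sigma2_rep : known replicate variance for each species (user-supplied)
    """
    n = len(V)
    if any(len(row) != n for row in V):
        raise ValueError("V must be square")
    if len(sigma2_rep) != n:
        raise ValueError("need one replicate variance per species")
    # Scale the phylogenetic covariance by the model variance.
    full = [[sigma2 * V[i][j] for j in range(n)] for i in range(n)]
    # Only the diagonal picks up the two error terms.
    for i in range(n):
        full[i][i] += sigma2_err + sigma2_rep[i]
    return full


# Hypothetical toy example with three species.
V = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
sigma2_rep = [0.10, 0.25, 0.05]
cov = full_covariance(V, sigma2=2.0, sigma2_err=0.3, sigma2_rep=sigma2_rep)
```

The off-diagonal entries are untouched by either error term, so the species-specific variances only change how much weight each species' own observation carries.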

wrshoemaker commented 5 years ago

Apologies for the delay in getting back to you.

Yes, sigma^2 v_{ii} + sigma^2_err + sigma^2_rep_i is a good representation. Thank you for incorporating it.