ericgoolsby / Rphylopars

Phylogenetic Comparative Tools for Missing Data and Within-Species Variation

Singular system on large data-set with a lot of overlap #59

Closed: evangorstein closed this issue 4 weeks ago

evangorstein commented 1 month ago

I was hoping to use phylopars() to fit a two-dimensional Brownian motion model along a tree with ~5000 tips.

A plot of the two-dimensional species "traits" is shown below: [image] And the actual tree: [image] (Colors in the two plots correspond.)

When I fit the model, I get the following messages:

warning: solve(): system is singular; attempting approx solution

warning: solve(): system is singular; attempting approx solution
Error: Not a matrix.

However, if I sample the tips of the tree (with ape::keep.tip()) to create a thinned-out tree with only 750 tips, then phylopars() runs without error on this smaller dataset. The downsampled tree and trait scatterplot are visualized below: [image] [image]

Do you know what causes the matrix singularity, and the resulting error, on the larger dataset? One possibility I checked was whether the larger tree has any pairs of sister external branches that are both of (near) zero length, but the error remains even after I remove such pairs of short sister external branches from the large tree.

ericgoolsby commented 1 month ago

The singular matrix can result from any tips with zero or near-zero branch lengths. You could check for that with something like: sort(tree$edge.length[which(tree$edge[,2]<=length(tree$tip.label))]). The relative scaling matters too, so maybe check what the shortest tip branch length is divided by the tree height.
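The check described above could be sketched roughly as follows. This is only an illustration on a simulated toy tree (`ape::rcoal()` stands in for the real ~5000-tip tree), and taking "tree height" to mean the largest root-to-tip distance is an assumption, not something stated in the thread.

```r
library(ape)

# Toy tree for illustration only; substitute your own `phylo` object.
set.seed(1)
tree <- rcoal(20)

n_tip <- length(tree$tip.label)

# Rows of tree$edge whose child node is a tip (ape numbers tips 1..n_tip).
is_terminal <- tree$edge[, 2] <= n_tip
tip_lengths <- tree$edge.length[is_terminal]
names(tip_lengths) <- tree$tip.label[tree$edge[is_terminal, 2]]

# Tree height, taken here as the largest root-to-tip distance (assumption).
tree_height <- max(node.depth.edgelength(tree)[seq_len(n_tip)])

# Terminal branch lengths as a fraction of tree height, shortest first.
rel_lengths <- sort(tip_lengths / tree_height)
head(rel_lengths, 10)
```

Values at or very close to zero at the top of this list would point to the tips most likely to be involved in the singular system.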

Other possibilities: do you have any species with more than one observation, or missing data?

evangorstein commented 1 month ago

Ahh, getting rid of the short external branches did the trick! Thanks so much!
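The thread does not show how the short external branches were removed. One possible approach, sketched here under assumptions, is to drop tips one at a time until no terminal branch is shorter than a chosen fraction of the tree height. The helper `drop_short_tips()` and the threshold `rel_tol = 1e-6` are hypothetical and are not part of ape or Rphylopars.

```r
library(ape)

# Hypothetical helper: repeatedly drop the tip with the shortest terminal
# branch until every terminal branch is at least rel_tol * tree height.
drop_short_tips <- function(tree, rel_tol = 1e-6) {
  repeat {
    n_tip <- length(tree$tip.label)
    if (n_tip <= 2) break  # nothing sensible left to prune

    is_terminal <- tree$edge[, 2] <= n_tip
    tip_ids <- tree$edge[is_terminal, 2]
    height <- max(node.depth.edgelength(tree))
    rel_len <- tree$edge.length[is_terminal] / height

    if (min(rel_len) >= rel_tol) break

    # Drop one tip per pass: removing it merges its sister's branch with the
    # parent branch, so the sister may no longer be short afterwards.
    shortest <- tree$tip.label[tip_ids[which.min(rel_len)]]
    tree <- drop.tip(tree, shortest)
  }
  tree
}

# Toy example: simulate a tree and force two terminal branches to length 0.
set.seed(1)
tree <- rcoal(20)
terminal_rows <- which(tree$edge[, 2] <= Ntip(tree))
tree$edge.length[terminal_rows[1:2]] <- 0

thinned <- drop_short_tips(tree)
Ntip(thinned)
```

Dropping one tip per pass keeps one representative of a pair of near-identical sisters rather than removing both. The trait data would also need to be subset to `thinned$tip.label` before refitting. Lengthening the short branches instead of dropping tips is another option, but it changes the tree rather than the sampling.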