HedvigS / SH.misc

GNU General Public License v3.0
0 stars 0 forks source link

Phylo.d wrapper #2

Closed HedvigS closed 9 months ago

HedvigS commented 10 months ago

Function for doing some sanity checks of data before doing caper::phylo.d analysis.

skalyan91 commented 9 months ago

I’ve cleaned up the code a bit; most importantly, I’ve made the binvar argument a raw column name rather than a string (for greater continuity with the original phylo.d() function). To do this, I used a trick from this gist, which you might remember!

I have an additional suggestion: All the columns you’ve created in the example script (with the exception of cluster and cluster_random) are intended to test the behaviour of phylo.d_wrapper() when the distribution of the feature is highly skewed. It would be good to have columns that trigger the other conditions in your function, as well. Unfortunately I’m not familiar enough with phylogenetics to be able to easily create such examples; maybe we can leave it as a future task.

HedvigS commented 9 months ago

I’ve cleaned up the code a bit; most importantly, I’ve made the binvar argument a raw column name rather than a string (for greater continuity with the original phylo.d() function). To do this, I used a trick from this gist, which you might remember!

Oh okay, sure! As long as it works for apply/for loops all good. The solution I had to binvarwas one that the caper package maintainer suggested, but anything that works it okay with me.

I have an additional suggestion: All the columns you’ve created in the example script (with the exception of cluster and cluster_random) are intended to test the behaviour of phylo.d_wrapper() when the distribution of the feature is highly skewed. It would be good to have columns that trigger the other conditions in your function, as well. Unfortunately I’m not familiar enough with phylogenetics to be able to easily create such examples; maybe we can leave it as a future task.

Hm, yeah maybe. For now I think this'll be good enough :)

skalyan91 commented 9 months ago

Oh! I didn’t realise you need to loop over the binvar argument. I’ve changed it back to a string, rather than a symbol, and renamed the argument to var.name, to avoid confusion with the binvar argument of phylo.d().

I’ve also fixed an issue where the data argument of phylo.d_wrapper() wasn’t being passed on to phylo.d(), and added a ... argument, so that the remaining arguments of phylo.d() can also be used. (See https://github.com/HedvigS/SH.misc/commit/add36bb00d36b7b05474f07bcddeaebfd4856a59.)

HedvigS commented 9 months ago

Oh! I didn’t realise you need to loop over the binvar argument. I’ve changed it back to a string, rather than a symbol, and renamed the argument to var.name, to avoid confusion with the binvar argument of phylo.d().

I’ve also fixed an issue where the data argument of phylo.d_wrapper() wasn’t being passed on to phylo.d(), and added a ... argument, so that the remaining arguments of phylo.d() can also be used. (See add36bb.)

Well, it's just in case it's necessary I'd rather build in that functionality now in the wrapper.

Can we do all significant changes to content here via PRs? Then we can review each other. I've been in a few git projects now, and committing straight to main can get out of hand I'm afraid.