I've arguably gone down a bit of a rabbit hole here, but exploring tree balance indices turned out to be quite interesting. I've suggested a few other edits in the code that I made as I went through it, which you are free to ignore, but the easiest way I could find to provide these suggestions was as a single pull request.

On tree balance indices

I lost the notes I'd previously assembled, but briefly:

Mir et al. emphasize that the TCI has a larger range of values than Colless’ and Sackin’s, and thus a greater resolution power.
I also came across Lemant et al. 2022 10.1093/sysbio/syac027, who emphasize the dependence of the values on n. They highlight that the TCI considers a caterpillar tree on many taxa equally imbalanced to one on say four taxa, which they assert (convincingly, I think) is undesirable.
- Lambert et al. propose a tree balance index that is better still. I couldn't resist integrating their implementation into TreeTools. This has the more tangible benefit of being intrinsically normalized.

I wrangled a bit with the most appropriate way to normalize the TCI. Normalizing with respect to the maximum possible value: looks very different to normalizing against the expected-from-uniform-tree value: The J1 index provides more pleasing results; the bar looks much more symmetrical, and less dispersed: (This axis should read symmetry, not asymmetry)

Potentially normalization is having a more substantive impact here than the actual choice of metric, but this looks like a good argument for using the J¹ index from Lemant et al.

There's some interesting potential for followup work as this index can also take into account population sizes when computing tree balance – data that you'll have available in TrevoSim!

Notches on bar plots

Notches help a viewer to determine whether a difference between medians is likely to be significant (even if their strict statistical interpretation takes a bit of explaining).

Extra steps

Your extra steps calculation always assumed that the minimum number of steps possible for a character was one, which is not the case for uninformative (0) or multistate (>1) characters.

If any matrices have ambiguous tokens, it's not clear that the previous code would handle these correctly (and I'm not sure the new code will either).

Using a linear scale to show characters with no extra steps seems worthwhile. It's surprising that so few characters have no extra steps. Might this number be inflated for simulated data if TREvoSim includes more uninformative characters than empirical datasets would typically include?

User friendliness

I suggested a couple of changes that might help users. The cli library is a nice way to provide progress indicators without flooding a user's console. And it might be possible to work without the non-standard package vapoRwave.

Thanks for all of these suggestions, much appreciated, and the R script is much stronger as a result. I suspect this will be the first of a few comments:

-- Your extra steps calculation always assumed that the minimum number of steps possible for a character was one, which is not the case for uninformative (0) or multistate (>1) characters.

This function was courtesy of @evo-palaeo, and so I can't claim to remember the thinking behind the choices.. But, as written, I think this function assigned NAs to uninformative characters (hence the -1 in there you comment on), and TREvoSim data doesn't have multistate characters. Thus if I'm following correctly, the difference between this and your suggestion, is whether the graphs show the number of extra steps in only parsimony informative characters, or whether this should include uninformative characters. I guess that then boils down to whether it is meaningful to talk about extra steps for uninformative characters, which can't have these - or whether that might be better graphed separately. I'd welcome your thoughts, and those of @evo-palaeo if he has an opportunity to provide them.

Thanks again, this has been a really useful exercise - I ended up going through your additions line by line and then transferring them over manually, as this helped me get my head around them (and no dbout improved my R coding as well..).

-- I've arguably gone down a bit of a rabbit hole here, but exploring tree balance indices turned out to be quite interesting.

This is an ineresting rabbit hole, and I have read your comments and the papers you link to, with interest! Thanks for all the effort you have put into this: I have added acknowledgements to the script to reflect your input (I still need to deal with the broader contributions framework for the repository, but will remember this when I get to that).

-- Potentially normalization is having a more substantive impact here than the actual choice of metric, but this looks like a good argument for using the J1 index from Lemant et al.

I agree, I'm surprised at the impact that normalisastion has compared, and given the intrinsic normalisation of J1, I will add this to the documentation with a not to look for discussion here.

-- There's some interesting potential for followup work as this index can also take into account population sizes when computing tree balance – data that you'll have available in TrevoSim!

This sounds fun. It is also really interesting in the context of empirical trees, which are often not just asymetrical in terms of diversity, but also abundance.

-- Notches help a viewer to determine whether a difference between medians is likely to be significant (even if their strict statistical interpretation takes a bit of explaining).

Now included in all.

-- Your extra steps calculation always assumed that the minimum number of steps possible for a character was one, which is not the case for uninformative (0) or multistate (>1) characters. If any matrices have ambiguous tokens, it's not clear that the previous code would handle these correctly (and I'm not sure the new code will either).

See discussion above. The latest version of the script includes your suggested modifications to this function. This is only used for TREvoSim data so shouldn't include ambiguous tokes.

-- Using a linear scale to show characters with no extra steps seems worthwhile. It's surprising that so few characters have no extra steps. Might this number be inflated for simulated data if TREvoSim includes more uninformative characters than empirical datasets would typically include?

Yes, further to the above comments graphig with zero does show up this surprising fact. It's intersting. With regards to the TREvoSim data, this will - of course - have a difference, but I see that the graph with uninformative characters:

Is pretty similar to that when I have set them to be stripped out:

-- I suggested a couple of changes that might help users. The cli library is a nice way to provide progress indicators without flooding a user's console. And it might be possible to work without the non-standard package vapoRwave.

Thanks, all have been incorporated in the latest version

palaeoware / trevosim

Review empirical data comparison #53

On tree balance indices

Notches on bar plots

Extra steps

User friendliness