Per POS / rel
Counts
- [x] relative frequency of a POS = count of that POS / total count of all POS tags
- [ ] coverage = the proportion of all sentences containing the rel/pos
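
A minimal sketch of the two count-based measures above, assuming each sentence is available as a plain list of POS tags (the same code would work for relation labels). The toy `sentences` data and the function names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy corpus: each sentence is a list of POS tags.
sentences = [
    ["DET", "NOUN", "VERB"],
    ["NOUN", "VERB", "ADJ"],
    ["DET", "NOUN"],
    ["VERB"],
]

def relative_frequency(sentences, tag):
    """Count of `tag` divided by the count of all tags in the corpus."""
    counts = Counter(t for sent in sentences for t in sent)
    total = sum(counts.values())
    return counts[tag] / total if total else 0.0

def coverage(sentences, tag):
    """Proportion of sentences that contain `tag` at least once."""
    if not sentences:
        return 0.0
    return sum(1 for sent in sentences if tag in sent) / len(sentences)

print(relative_frequency(sentences, "NOUN"))
print(coverage(sentences, "NOUN"))
```

Note that coverage counts a sentence once no matter how many times the tag occurs in it, which is what distinguishes it from relative frequency.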
Branch counts
- [x] relative number of branch patterns = branch patterns / all branch patterns
- [x] proportion of branches to the left of the node = left / all branches
- [x] proportion of branches to the right of the node = right / all branches
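
The left/right proportions above can be sketched as follows. This assumes a node is identified by its token position in the sentence and that its branches are given as the positions of its direct dependents; `branch_proportions` is a hypothetical helper, not a function from any particular parser library.

```python
def branch_proportions(node_index, dependent_indices):
    """Return (left, right) proportions of a node's branches.

    A dependent whose position is smaller than the node's position
    counts as a left branch, a larger position as a right branch.
    """
    left = sum(1 for d in dependent_indices if d < node_index)
    right = sum(1 for d in dependent_indices if d > node_index)
    total = left + right
    if total == 0:
        # Leaf node: no branches, so both proportions are defined as 0 here.
        return 0.0, 0.0
    return left / total, right / total

# Node at position 3 with dependents at positions 1, 2 and 5.
left, right = branch_proportions(3, [1, 2, 5])
print(left, right)
```

For any node with at least one branch the two proportions sum to 1, so storing one of them is enough when space matters.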
Branch distribution stats
- [x] mean = the sum of all the numbers in the set / the amount of numbers in the set
- [x] median = the middle value of the sorted number set
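- [x] variance = the average squared deviation from the mean (measures dispersion within the data set)
- [x] standard deviation = the square root of the variance (measures spread around the mean, in the same units as the data)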
- [x] interquartile range (IQR) = the difference between the 75th and 25th percentiles of the data (a measure of spread like std, but more robust against outliers)
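- [x] skew = asymmetry of the distribution (positive: longer tail to the right, negative: longer tail to the left)
- [x] kurtosis = heaviness of the tails relative to a normal distribution (how much of the spread comes from extreme values)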
- [x] entropy = degree of randomness (diversity) of the observed values; higher when values are spread evenly over many outcomes
- [x] anova = tests whether the means of two or more groups differ, by comparing between-group variance to within-group variance
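
The distribution statistics above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not necessarily how the checked items were implemented: it uses population (not sample) variance, the "inclusive" quantile method, excess kurtosis (normal distribution = 0), and Shannon entropy in bits over the observed values. The names `describe` and `one_way_anova_f` are hypothetical.

```python
import math
import statistics
from collections import Counter

def describe(values):
    """Summary statistics for a list of numbers (e.g. branch counts)."""
    n = len(values)
    mean = statistics.fmean(values)
    var = statistics.pvariance(values)  # population variance (assumption)
    std = math.sqrt(var)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Third and fourth central moments, used for skew and kurtosis.
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    # Shannon entropy (bits) of the empirical distribution of values.
    probs = [c / n for c in Counter(values).values()]
    return {
        "mean": mean,
        "median": statistics.median(values),
        "variance": var,
        "std": std,
        "iqr": q3 - q1,
        "skew": m3 / std ** 3 if std else 0.0,
        "kurtosis": m4 / std ** 4 - 3 if std else 0.0,  # excess kurtosis
        "entropy": -sum(p * math.log2(p) for p in probs),
    }

def one_way_anova_f(*groups):
    """F statistic of a one-way ANOVA over two or more groups.

    Raises ZeroDivisionError if there is no within-group variation.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum(
        (x - statistics.fmean(g)) ** 2 for g in groups for x in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

stats = describe([1, 2, 2, 3, 7])
print(stats)
print(one_way_anova_f([1, 2, 3], [2, 3, 4]))
```

The F statistic alone is not a significance test; turning it into a p-value requires the F distribution with `k - 1` and `n_total - k` degrees of freedom, which a statistics package would normally supply.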