Closed hansjovis closed 6 years ago
Note: @nataliashitova raised a valid point: this has to play nice with key phrases that consist of multiple keywords. In that case the "distance between keywords" becomes ill-defined. If we implement this, we have to find a solution for that.
Closed as we actually adopted the Gini coefficient as a metric for keyword distribution (see https://github.com/Yoast/YoastSEO.js/pull/1789).
Summary
We advise users to distribute their keywords evenly throughout the text. The Gini coefficient can be used to measure how uniform a distribution is. In our case, it can measure how uniform the distances between keyword instances are. This would be a more accurate measure of keyword distribution than the one we use now.
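As a rough illustration of the idea, the sketch below computes the Gini coefficient over the gaps between keyword occurrences, using the standard mean-absolute-difference formulation. The function names `gapsBetween` and `giniCoefficient` are hypothetical and not taken from YoastSEO.js, and it assumes that "distance" means the number of characters between the start offsets of consecutive occurrences.

```typescript
// Hypothetical helper: gaps (in characters) between consecutive keyword
// occurrences, given the start offset of each occurrence.
function gapsBetween(positions: number[]): number[] {
  const sorted = positions.slice().sort((a, b) => a - b);
  const gaps: number[] = [];
  for (let i = 1; i < sorted.length; i++) {
    gaps.push(sorted[i] - sorted[i - 1]);
  }
  return gaps;
}

// Gini coefficient as the sum of absolute differences over all ordered
// pairs, divided by 2 * n * (sum of values).
// 0 means all values are equal; values closer to 1 mean more inequality.
function giniCoefficient(values: number[]): number {
  const n = values.length;
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += values[i];
  }
  // Assumption: an empty or all-zero input counts as perfectly uniform.
  if (n === 0 || total === 0) {
    return 0;
  }
  let absDiffSum = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      absDiffSum += Math.abs(values[i] - values[j]);
    }
  }
  return absDiffSum / (2 * n * total);
}
```

For evenly spaced occurrences, e.g. `giniCoefficient(gapsBetween([0, 100, 200, 300]))`, all gaps are equal and the result is 0. The double loop is quadratic in the number of gaps, which should be acceptable for the handful of keyword occurrences in a typical text.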
Explanation
We currently measure uniformity by checking that the distance between any two keywords does not exceed a percentage of the total number of characters. This percentage is currently set to 40%. The disadvantage is that this check does not catch every case in which the distances between keywords are unevenly distributed. The Gini coefficient, by contrast, was developed to measure inequality in a distribution (specifically income inequality), so it would better reflect the notion of a uniform keyword distribution.
E.g.:
1. Uniform distribution. No inequality. Gini coef. of 0.
2. One outlier. Triggers "okay" score on assessment. Gini coef. of 0.228.
3. Keywords non-uniformly distributed. Triggers "good" score. Gini coef. of 0.28.
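For comparison, the existing threshold check described in the Explanation could look roughly like the sketch below. This is not the actual YoastSEO.js code: the name `exceedsMaxGap` and its parameters are hypothetical, and it assumes that "distance between any two keywords" refers to consecutive occurrences, measured in characters between their start offsets.

```typescript
// Hypothetical sketch of the threshold check: returns true when the gap
// between two consecutive keyword occurrences is larger than `maxShare`
// of the total text length (40% by default, as stated in the issue).
function exceedsMaxGap(
  positions: number[],
  textLength: number,
  maxShare: number = 0.4
): boolean {
  const sorted = positions.slice().sort((a, b) => a - b);
  const maxAllowed = maxShare * textLength;
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i] - sorted[i - 1] > maxAllowed) {
      return true;
    }
  }
  // Assumption: with fewer than two occurrences there is no gap to check.
  return false;
}
```

Because this check only looks at whether a single gap crosses the threshold, a text whose gaps vary widely but all stay under 40% passes it, which is the kind of case the examples above point at.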