Regarding the structural model estimates, your explanation that polarization decreases semantic diversity by focusing the debate on a narrow range of the most hotly contested topics seems plausible to me and agrees with my intuition, so I am not surprised to see a negative coefficient estimate there.
However, I was surprised to see a negative coefficient for the effect of semantic diversity on overall quality. My (perhaps naive) expectation would be that an article with a larger number of issues discussed on its talk page would likely be of higher quality, since it examines more aspects of the topic. (Although perhaps there is a less charitable explanation: the semantic diversity measure may also pick up multiple copyediting mistakes or similar non-ideological issues?)
As one possible resolution, perhaps Wikipedia’s learning model is “punishing” the quality score of articles with high semantic diversity because it thinks those articles should be broken up into smaller sub-articles? (If that is the reason, it seems conceivable, though unlikely, that the same estimation using a different measure of quality could yield a positive estimated sign. That would create a channel through which polarization decreases quality by reducing semantic diversity, though the overall effect of polarization on quality would still be positive after considering all channels.)
To state my question more concisely: is it surprising to you that higher semantic diversity is associated with lower article quality? Why or why not?
Thank you for presenting!