I think that instead of just counting the elements > 0.5 in the matrix, we should take a mean. The mean would be affected by outliers, which is actually what I want, since a day with high correlations is indicative of a big news day. The biggest news tends not to have a whole lot of creativity in the headlines.
Also, since all numbers in the matrix are normalized between 0.5 and 1, there's no extreme-outlier problem.
I wonder if the metric is still poorly defined.
So mean of # > 0.5 / size of matrix?
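To make the two candidates concrete, here is a rough sketch of both: the plain mean of the matrix, and the count of elements > 0.5 divided by the matrix size. It assumes the matrix is a square NumPy array of headline correlations and that every entry counts, diagonal included. The function names and the `threshold` parameter are made up for illustration, not taken from any existing code.

```python
import numpy as np


def mean_score(sim):
    """Plain mean over every entry of the matrix (diagonal included, by assumption)."""
    sim = np.asarray(sim, dtype=float)
    return float(sim.mean())


def fraction_above(sim, threshold=0.5):
    """Count of entries strictly above `threshold`, divided by the matrix size."""
    sim = np.asarray(sim, dtype=float)
    return float((sim > threshold).sum() / sim.size)


# Toy 2x2 matrix for one day: two headlines, off-diagonal correlation 0.5.
day = np.array([[1.0, 0.5],
                [0.5, 1.0]])

print(mean_score(day))
print(fraction_above(day))
```

One thing this makes visible: if every entry really is normalized into [0.5, 1], the fraction only drops below 1 when entries sit exactly at 0.5, so the mean probably separates days better than the count does.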