Display-Lab / precision-feedback-pipeline

Apache License 2.0

Add some message history processing capabilities #195

Closed zachll closed 8 months ago

zachll commented 9 months ago

Esteemer currently doesn't process message history.

Proposal:

If the current candidate message template matches the message sent last month, set the message recency count to 1 (month). Apply this factor to the score.

If a message template was selected and delivered via email the previous month, Esteemer should score it significantly lower than other templates, so that it is only selected if no other message templates are available.

The downvoting should happen without regard to differences in measures, but if it is for the same measure, it should be downvoted more.

Don't do this now, but down the road we need to consider exceptions to this downvoting, for cases where continued improvement is motivating and receiving the same message about the same measure is a good thing.

mackgalante commented 9 months ago

Point 1

Point 2

If a message template was selected and delivered via email the previous month, Esteemer should score it significantly lower than other templates, so that it is only selected if no other message templates are available.

This likely requires adjusting both the weight in the MPM and the raw integers that we will be generating from message and measure recency. A sufficiently large negative coefficient (-5, -10, -30) might be needed, but we can tell for sure in testing. We can transform Message Recency with the following math so that the term starts large when the message is most recent and decays toward zero, so the 'downvote' effect shrinks as more time passes. Bear with me here.


Recall that the history portion of the algorithm looks something like: data component + (X_t)(Message Recency) + (X_m)(Measure Recency) + (X_n)(Number Received<?>)

We can transform Message Recency (denoted X below) like this: Message Recency term = X_t * e^(-X) * (X + 1)^(-1)


Examples

Now we can evaluate this for X = Message Recency over the interval [0, ..., 4] to see how this impacts the overall rank:
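A minimal sketch of that evaluation, assuming the term is X_t * e^(-X) / (X + 1) as written above. The function names and the coefficient X_t = -10 are placeholders chosen for illustration, not tested values or Esteemer code.

```python
import math


def recency_decay(x: float) -> float:
    """Decay factor e^(-x) / (x + 1): equals 1 at x = 0 and tends to 0 as x grows."""
    return math.exp(-x) / (x + 1)


def message_recency_term(x: float, x_t: float) -> float:
    """Message Recency term X_t * e^(-X) * (X + 1)^(-1)."""
    return x_t * recency_decay(x)


# Assumption: X_t = -10 is a placeholder coefficient, to be tuned in testing.
X_T = -10.0

# Evaluate the term for X = 0, 1, 2, 3, 4.
terms = {x: message_recency_term(x, X_T) for x in range(5)}
for x, term in terms.items():
    print(f"X = {x}: term = {term:.4f}")
```

With this placeholder coefficient the term comes out at roughly -10, -1.84, -0.45, -0.12 and -0.04 for X = 0 through 4, so the penalty is strongest for the most recent message and fades quickly.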

Mull it over, I know it's tough to look at, but the implications are good! As message recency decreases (the message was sent more recently), so does the overall rank, and with no recency the term is entirely ignored. We would only be evaluating this term once per candidate as well, so it's computationally cheap. We would likely want to use testing to determine a reasonable X_t value, depending on the median expected rank that a selected candidate will evaluate to. Maybe a change of -0.1 will move the needle, or maybe we need X_t to be -100 to make sure the needle moves appropriately in the overall rank.

Point 3

The downvoting should happen without regard to differences in measures, but if it is for the same measure, it should be downvoted more.

We can use the same transform above on measure recency to get a similar effect, where more recent feedback on the same measure causes stronger decreases in the overall ranking. Because the measure and message recency terms are separate, they act independently: the score downvotes similar message templates and independently downvotes repeated measures. Therefore, repeating both the message and the measure will cause 'double' downvoting.
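A rough sketch of how the two independent terms could combine, assuming the same e^(-X) / (X + 1) transform for both. The names are hypothetical, both coefficients are placeholders, the Number Received term is left out, and treating a missing history as `None` (term ignored) is an assumption of this sketch rather than settled design.

```python
import math
from typing import Optional


def recency_decay(x: Optional[int]) -> float:
    """e^(-x) / (x + 1), or 0.0 when there is no history (x is None)."""
    if x is None:
        return 0.0
    return math.exp(-x) / (x + 1)


def history_adjusted_score(
    data_component: float,
    message_recency: Optional[int],
    measure_recency: Optional[int],
    x_t: float = -10.0,  # placeholder message coefficient
    x_m: float = -10.0,  # placeholder measure coefficient
) -> float:
    """data component + X_t * f(Message Recency) + X_m * f(Measure Recency)."""
    return (
        data_component
        + x_t * recency_decay(message_recency)
        + x_m * recency_decay(measure_recency)
    )


# Same hypothetical data component in every case; recency is months since last sent.
fresh = history_adjusted_score(5.0, None, None)      # no history at all
same_measure = history_adjusted_score(5.0, None, 1)  # measure repeated last month
both = history_adjusted_score(5.0, 1, 1)             # template and measure repeated

print(fresh, same_measure, both)
```

Here a candidate that repeats only the measure scores below a fresh one, and a candidate that repeats both the template and the measure scores lowest, which is the 'double' downvote described above.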

mackgalante commented 8 months ago

Objectives: 1) Pull in requisite data for calculations:

3) Compare row 'month' to t0, convert to integers representing distance from t0
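One possible way to do the month comparison, sketched under the assumption that each history row's 'month' and t0 are available as `datetime.date` values. The function name and the sample dates are made up for illustration.

```python
from datetime import date


def months_from_t0(month: date, t0: date) -> int:
    """Whole calendar months from `month` to `t0` (positive when `month` is earlier)."""
    return (t0.year - month.year) * 12 + (t0.month - month.month)


# Hypothetical current performance month and history rows.
t0 = date(2023, 6, 1)
history_months = [date(2023, 5, 1), date(2023, 3, 1), date(2022, 12, 1)]

distances = [months_from_t0(m, t0) for m in history_months]
print(distances)  # [1, 3, 6]
```

A message sent the previous month then gets distance 1, which matches the "recency count of 1 (month)" in the original proposal.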

mackgalante commented 8 months ago

Per team meetings, I will be resolving this issue and opening a new issue to describe the remaining work for getting history processing online.