Open markwhiting opened 1 year ago
I think we should have a simple way to switch which approach we use, and we should probably implement at least a purely random, a weighted random, and a simple model-based version.
@amirrr let me know if you have thoughts on any of this.
Also, @JamesPHoughton, please chime in if you have thoughts that would help us or any resources you would recommend we consider.
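One way the "simple way to switch" could look is a small registry of selection strategies keyed by name. This is only a sketch under assumptions: the in-memory `counts` mapping (statement id to number of answers), the function names, and the strategy keys are all hypothetical, and `least_answered` is just a deterministic stand-in for whatever model-based version we end up with.

```python
import random
from typing import Callable, Dict, Optional

# Hypothetical in-memory shape: statement id -> number of answers so far.
AnswerCounts = Dict[int, int]
Selector = Callable[[AnswerCounts, random.Random], int]


def pure_random(counts: AnswerCounts, rng: random.Random) -> int:
    """Pick any statement with equal probability."""
    return rng.choice(sorted(counts))


def weighted_random(counts: AnswerCounts, rng: random.Random) -> int:
    """Favor statements with fewer answers: weight = 1 / (answers + 1)."""
    ids = sorted(counts)
    weights = [1.0 / (counts[i] + 1) for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


def least_answered(counts: AnswerCounts, rng: random.Random) -> int:
    """Placeholder for a model-based strategy: always take a least-answered
    statement, breaking ties at random."""
    fewest = min(counts.values())
    return rng.choice(sorted(i for i, c in counts.items() if c == fewest))


# Switching approach is just a matter of changing the key passed in.
STRATEGIES: Dict[str, Selector] = {
    "random": pure_random,
    "weighted": weighted_random,
    "model": least_answered,
}


def next_statement(
    counts: AnswerCounts,
    strategy: str = "weighted",
    rng: Optional[random.Random] = None,
) -> int:
    """Return the id of the next statement to show, using the named strategy."""
    if not counts:
        raise ValueError("no statements to choose from")
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return STRATEGIES[strategy](counts, rng or random.Random())
```

With something like this, the strategy name could come from a config value or an environment variable, so we can compare approaches without touching the calling code.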
Currently, reverse weighted reservoir sampling in MySQL (weighting each statement inversely to the number of answers it already has) seems like a quick way to get a statement:
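-- Weight each statement inversely to its answer count (WITH requires MySQL 8.0+).
WITH weighted_questions AS (
    SELECT
        statements.id,
        statements.`statement`,
        1.0 / (COUNT(answers.statementId) + 1) AS weight
    FROM
        statements
    LEFT JOIN
        answers ON statements.id = answers.statementId
    GROUP BY
        statements.id
)
-- The row with the smallest key is chosen with probability proportional to its weight.
SELECT
    id,
    `statement`,
    -- RAND() can return exactly 0; 1 - RAND() lies in (0, 1], so LOG never receives 0.
    -LOG(1 - RAND()) / weight AS priority
FROM
    weighted_questions
ORDER BY priority ASC
LIMIT 1;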
Reference: Randomly selecting rows based on weights
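For anyone who wants to sanity-check the query's logic outside the database, here is a rough Python sketch of the same idea. Each statement gets a key `-log(u) / weight` with `u` uniform on (0, 1], which is an exponential draw with rate `weight`; the smallest of independent exponential draws belongs to statement i with probability `weight_i / sum(weights)`. The function name and the `answer_counts` mapping are assumptions for illustration, not part of our codebase.

```python
import math
import random
from typing import Dict, Optional


def pick_statement(
    answer_counts: Dict[int, int],
    rng: Optional[random.Random] = None,
) -> int:
    """Return one statement id, favoring statements with fewer answers.

    answer_counts is a hypothetical mapping: statement id -> answers so far.
    """
    rng = rng or random.Random()
    best_id, best_priority = None, math.inf
    for statement_id, n_answers in answer_counts.items():
        weight = 1.0 / (n_answers + 1)
        # random() is in [0, 1), so 1 - random() is in (0, 1] and log is defined.
        priority = -math.log(1.0 - rng.random()) / weight
        if priority < best_priority:
            best_id, best_priority = statement_id, priority
    if best_id is None:
        raise ValueError("no statements to choose from")
    return best_id
```

As a quick check, with one unanswered statement (weight 1) and one with three answers (weight 0.25), the unanswered one should come up about 80% of the time, since 1 / (1 + 0.25) = 0.8.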
We want to build an algorithm that can quickly choose the next statement for someone to rate. There are a few things this algorithm could prioritize, so we should decide which ones we care about most. There are also several levels of algorithm that imply dramatically different computational loads, so that should be a consideration too.
Possible optimization considerations:
And some of the levels of algorithm might be: