BrandonLMorris opened 8 years ago
I actually think that we should remove the "Solved by" section from the problem page. Instead, we should replace it with two parts.
I say this because it seems to me that the whole purpose of the "Solved by" column was to serve one of those two objectives, but it wasn't very good at either of them.
To extend the idea underlying (2), I would love to create a Profile page for users, that lists problems they've solved, results in competitions, etc. If that's not an open issue, it should be.
`problem_data`. Alternatively, we could spawn a thread to do this every time someone made a submission and only update the problem that the submission was for, though I worry about performance impacts during competitions if we do it that way.

I really like the idea of a profile page for users. I don't think it's currently an issue, but it should be. That would actually allow for what we are looking for here.
`run.py`. It sounds like it would just be data crunching, so it would only have to access the database.

I'm not sure what you mean by stateless, and I don't think adding a statistics daemon would break that. It would not be event-driven, if that's what you mean.
Before we let this go: do you have any idea how the difficulty of a problem should be assessed? We could use an Elo-based system like Kattis does, though I'm not crazy about it, and we'd have to keep up with how many times a problem gets submitted to.
I guess you're right that this wouldn't add any state to the site. I'm not sure why I was thinking that.
I'm not sure how we should assess difficulty; I don't think that the Elo rating system would be bad. We actually do have data about how many times a problem gets submitted to: the entire `submits` table. The thing we would have to do is count the `good`, `bad`, etc. submissions.
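For what it's worth, that tally could be a single grouped query. This is just a sketch, and the column names (`username`, `pid`, `result`) are assumptions about the `submits` schema, not the real one:

```python
import sqlite3

# Toy in-memory stand-in for the real submits table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submits (username TEXT, pid INTEGER, result TEXT)")
conn.executemany(
    "INSERT INTO submits VALUES (?, ?, ?)",
    [("alice", 1, "good"), ("bob", 1, "bad"), ("bob", 1, "good"),
     ("alice", 2, "bad")],
)

# One pass over the table gives good/bad counts for every problem;
# SUM over a boolean expression counts the rows where it is true.
counts = conn.execute(
    """SELECT pid,
              SUM(result = 'good') AS good,
              SUM(result = 'bad')  AS bad
       FROM submits GROUP BY pid"""
).fetchall()
print(counts)  # [(1, 2, 1), (2, 0, 1)]
```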
My reservation about Elo is that our sample size seems really small (to be fair, I haven't read into Elo that much). A problem like Beautiful Mountains could be skewed because the only people who would submit to it would likely get it right. That would give it an incorrectly low rating. Am I wrong?
On the other hand, a system that weights the number of users who have solved a problem would skew new problems as being much more difficult. Sigh.
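For reference, the Elo idea would treat each submission as a match between the user and the problem. Purely illustrative; the function name, K-factor, and the decision to move the problem's rating symmetrically are all assumptions, not anything the site does:

```python
def elo_update(user_rating, problem_rating, solved, k=32):
    """Return updated (user, problem) ratings after one submission."""
    # Expected probability the user solves it, per the standard Elo formula.
    expected = 1 / (1 + 10 ** ((problem_rating - user_rating) / 400))
    score = 1.0 if solved else 0.0
    delta = k * (score - expected)
    # The problem "loses" rating when solved, and gains it otherwise.
    return user_rating + delta, problem_rating - delta

# A lower-rated user solving a harder problem causes the biggest swing.
u, p = elo_update(1200, 1600, solved=True)
print(round(u), round(p))  # 1229 1571
```

This also shows the small-sample worry concretely: a problem only ever attempted by strong users will bleed rating on every solve and end up looking easy.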
If we start counting `good` and `bad` submissions, I advocate that we discount `good` submissions from a user who had already solved a problem.
That makes sense to me. Maybe we can take elements from that and devise our own. It's a difficult task, but I think that it's entirely doable.
I agree about discounting multiple `good` submissions. It doesn't make sense to make a problem look easier just because someone submits 100 correct solutions to Beautiful Mountains.
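Discounting the repeats falls out of counting distinct solvers instead of raw `good` rows. Same caveat as before: the `submits` column names here are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submits (username TEXT, pid INTEGER, result TEXT)")
# One user spams 100 correct submissions; another solves it once.
conn.executemany(
    "INSERT INTO submits VALUES (?, ?, ?)",
    [("alice", 1, "good")] * 100 + [("bob", 1, "good"), ("carol", 1, "bad")],
)

# DISTINCT collapses a user's repeat solves to a single credit.
solvers = conn.execute(
    "SELECT COUNT(DISTINCT username) FROM submits "
    "WHERE pid = 1 AND result = 'good'"
).fetchone()[0]
print(solvers)  # 2
```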
So some considerations when calculating the difficulty of a problem (in no particular order):
The difficulty with the old system is that it relies on data that may not always be available (or, TBH, convenient). If it gets used, it should be optional and weighted very little.
Did I miss anything?
I don't think you missed anything. I actually want to just drop the rating from the old system. It was a good metric to start with, but it's not a particularly strong indicator, and it's also kind of annoying to compute.
No complaints here. I guess a follow-up would be: do we want the new ratings to determine rankings? Right now rankings are determined solely by the number of problems solved, and as our problem repertoire grows that may not be such a fair indicator.
I say yes, but maybe not initially. The switch shouldn't be hard after this gets sorted out.
Shouldn't be too difficult now that we have the `problem_solved` table.

Not sure how we should get this to the client, though. We could calculate it as part of an `/api/problems` request, though that could get very slow. We could also store the value as a new column on the `problems` table (and only update it when a user solves that problem for the first time), though that would require a database change (which is gross).
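The compute-per-request option is one join. A sketch, assuming `problem_solved` holds (username, pid) rows and `problems` has (pid, name); the real schemas may differ:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE problems (pid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE problem_solved (username TEXT, pid INTEGER)")
conn.executemany("INSERT INTO problems VALUES (?, ?)",
                 [(1, "Beautiful Mountains"), (2, "Other Problem")])
conn.executemany("INSERT INTO problem_solved VALUES (?, ?)",
                 [("alice", 1), ("bob", 1)])

# LEFT JOIN keeps problems nobody has solved yet (count 0), which an
# INNER JOIN would silently drop from the /api/problems response.
rows = conn.execute(
    """SELECT p.pid, p.name, COUNT(s.username) AS solved
       FROM problems p
       LEFT JOIN problem_solved s ON s.pid = p.pid
       GROUP BY p.pid, p.name"""
).fetchall()
print(rows)  # [(1, 'Beautiful Mountains', 2), (2, 'Other Problem', 0)]
```

Whether this is "very slow" in practice probably depends on an index on `problem_solved.pid`; the cached-column option trades that query cost for the schema change.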