kyprizel opened this issue 9 years ago
Yay! Love the idea.
Any ideas on how it should look? A poll?
I think so. I would have some set time period after the event during which the teams could submit their suggestions for scores. I'd also make it transparent so that everyone could see each team's vote. That might help minimize "sour-grapes" voting where a team does poorly and thus votes extremely low.
On that same note, might be good to throw out the top and bottom outliers to prevent the overall average from being skewed by extreme votes on either side. Less sure about this though, just an idea.
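That outlier-trimming idea is essentially a trimmed mean; a minimal sketch, assuming we drop one vote from each end (the trim count is an arbitrary choice):

```python
def trimmed_mean(votes, trim=1):
    """Average the votes after dropping the `trim` highest and `trim`
    lowest values, so a single extreme vote can't skew the result.
    Falls back to a plain mean if there are too few votes to trim."""
    if len(votes) <= 2 * trim:
        return sum(votes) / len(votes)
    kept = sorted(votes)[trim:len(votes) - trim]
    return sum(kept) / len(kept)

# A vengeful 1 and an inflated 10 are both discarded:
print(trimmed_mean([1, 7, 8, 8, 9, 10]))  # -> 8.0
```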
Obviously if a team runs the event, they wouldn't be able to submit to the poll.
So only teams that participated in the event can vote? Then we could get a situation where some crappy CTF is played by 10 teams and they rate it 100 :)
No, I'd let any of the top 10 or 20 (or whatever the consensus is) have voting privileges on it. One reason to let them vote for the next week or two: even teams that didn't play can read write-ups and check out the problems after the fact.
Another related topic would be a good write-up from you on the guidelines for how things are scored now. A rubric of sorts. While different teams will always have slightly different perspectives (and indeed, that's part of the point of this discussion), having a better understanding of why you've scored things as you have so far would be good. Is it just difficulty, or is it "quality"? If we are really going nuts, these could be separate axes that are then combined somehow into an overall score (not necessarily evenly weighted):
- Difficulty: 1-10
- Fun: 1-10
- Quality: 1-10
- Breadth: 1-10
Don't know that those are the right metrics, but whatever the community wants to incentivize is what we should measure. Organizers will make design decisions based on those criteria, which is a powerful thing.
Also, maybe "organizer communication" as another category. Again, just brainstorming. Perfectly happy just having the scores we have now too.
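As an illustration, combining those hypothetical axes into one score with uneven weights could look like this (the axis names and weights here are placeholders for whatever the community picks, not anything CTFtime actually uses):

```python
# Hypothetical weights -- the community would pick these.
WEIGHTS = {"difficulty": 0.35, "quality": 0.35, "fun": 0.2, "breadth": 0.1}

def combined_score(axis_scores):
    """Collapse per-axis 1-10 votes into one overall score,
    weighting the axes unevenly as suggested above."""
    return sum(WEIGHTS[axis] * score for axis, score in axis_scores.items())

print(combined_score({"difficulty": 8, "quality": 7, "fun": 9, "breadth": 6}))
```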
To play the devil's advocate here:
I would propose a similar but different solution: YES, let the top 10-20 teams vote on the rating. The voting deadline is 12h after the CTF starts (for 48h CTFs; less if it's a shorter CTF, of course). 12h gives you enough time to see if the tasks are well made and well tested and everything is stable, but it doesn't allow you to predict how well your team will do or who will win -- one bias less.
I would still be happy if a main admin would set the initial rank based on previous years of the given CTF, and the top10-20 teams would be allowed to change it e.g. N% down to N% up (e.g. N=50 or 75).
I think we should somehow formalize the rule. Something like what I do now -- in its first year a CTF can't get a rating of more than X, or even has X by default, and it can be increased depending on factors like:
Should we require the team voting to also be registered for the event? That might also help promote the automatic integration between ctftime and the events.
Midway through the event isn't a bad idea, but I'm not totally sold. I feel like plenty of problems might not have revealed themselves by then, and many events save their best challenges for particular times. I guess this would certainly incentivize people to release their best challenges first which might not be too terrible.
I think people will be a bit more honest in their votes if they are made public. Hopefully people would want to avoid the public shaming of voting an event down just because they placed poorly, since it would be so obvious -- but maybe I'm wrong.
I might actually be ok letting /every/ registered ctftime team (that is also registered for the event) vote, and scaling the power of the vote based on the team's ctftime placement in some way. Hopefully the effort required and the relatively minimal impact overall disincentivize anyone from spamming teams to influence event scores; and again, it would be obvious if someone did it in a meaningful way, so hopefully easy to correct.
Also, I agree some sort of public formula is the way to go, but I wouldn't include the number of teams that played directly. I might let the number of teams that played increase the size of the possible points to allocate based on the votes. Does that make sense? If twice as many good teams play but all rate it poorly, it probably shouldn't inherently be a higher-ranked event. Between two events that players rate the same, the one with more players /should/ then have a higher score.
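One way to combine the two ideas above -- placement-weighted votes that set the fraction of a pot, with participation only scaling the pot's size -- might look like this (the decay function, base constant, and log scaling are all made-up assumptions for illustration):

```python
import math

def vote_weight(rank):
    """Hypothetical decay: a rank-1 team's vote counts several
    times more than a rank-50 team's."""
    return 1.0 / math.log2(rank + 1)

def event_score(votes, num_teams, base=25.0):
    """votes: list of (ctftime_rank, rating_0_to_10) pairs.
    The weighted average rating sets the *fraction* of the pot;
    the number of participating teams only scales the pot's size,
    so a big but badly rated event still scores low."""
    total_w = sum(vote_weight(r) for r, _ in votes)
    avg = sum(vote_weight(r) * v for r, v in votes) / total_w
    pot = base * math.log2(num_teams + 1)  # pot grows with participation
    return avg / 10.0 * pot
```

At equal ratings, more players means a bigger pot and a higher score, but no amount of participation rescues an event that voters rate poorly.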
I think we could do automatic weighting of CTFs based on relative performance. Voting is going to be gamed.
For example, events in which some number of DEFCON finalists did not even compete in would have a lower point value. This would make DEFCON the highest point-value event, and quals a close runner-up.
Additionally, being a qualifying event could carry weight in and of itself.
If we permit voting, I think it could be per-team or restricted to the top N teams per event, and be strictly relative. For example, instead of "Would you rate this on a scale of 1-10?", ask "Was this better than event X?". You end up with a strict hierarchy of "good teams say this is the ordering of events". You can then infer scores at the end of the season, or use the generated data to set the scores for the next season.
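That pairwise data can be turned into an ordering with something as simple as a net-win tally (a Copeland-style count; a real system might prefer Elo or Bradley-Terry). A rough sketch, with made-up event names:

```python
from collections import defaultdict

def rank_events(comparisons):
    """comparisons: list of (winner, loser) pairs, one per
    'was event A better than event B?' answer. Orders events
    by net wins: +1 for each comparison won, -1 for each lost."""
    score = defaultdict(int)
    for winner, loser in comparisons:
        score[winner] += 1
        score[loser] -= 1
    return sorted(score, key=score.get, reverse=True)

print(rank_events([("defcon", "plaidctf"), ("plaidctf", "csaw"),
                   ("defcon", "csaw")]))
# -> ['defcon', 'plaidctf', 'csaw']
```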
Also, I think there should be a clear policy about rating events with special rules. For example, the finals of 0CTF (https://ctftime.org/event/215) are only accessible by chinese teams (by the organizer's decision, not because teams from abroad don't want to travel). I think competitions that are not accessible to everyone and have any kind of restrictions should be rated 0.
> are only accessible by chinese teams
there are such competitions in every country, and most of them have the same weight
Just to clarify, CSAW is a university-only event, but not limited to US teams. They don't cover full costs for teams from outside the U.S., but they are still invited to attend. There have been Canadian teams who have played, for example.
That said, even the university-only restriction is similar in my mind.
That /might/ be a special case where it's worth having a flag for "college/university only" so they could have a separate scoreboard for those teams.
As for ones restricted by nationality, I agree, they have no business being ranked on the global ctftime.
Finals teams are limited to the US and Canada, but this is open to discussion on a case-by-case basis
:)
I'll double check, but I thought that was only referring to covering the full costs of the trip. Could be wrong though. Summoning @coldheat !
We cover travel for teams from the US and Canada, and the scoreboard listing those who qualify only shows US + Canada. Nothing international just yet, but we have had discussions on bringing in specific international teams before, which is why it's case by case. But the ruling this year is nothing international.
Also just my two cents, I think rating relative to performance of the teams would be ideal. Voting would introduce too much gaming of the system.
Say the top ten teams earned only 33% of the total points; you could probably say that the CTF was hard. Also, for restrictions on who can play, why not include a popularity metric that denotes how many teams on CTFtime played the CTF?
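A sketch of both metrics (the formulas and names here are illustrative assumptions, not anything CTFtime actually computes):

```python
def difficulty(avg_top10_score, max_score):
    """If even the average top-10 team earned only 33% of the maximum
    score, the event was probably hard. Returns 0..1, higher = harder."""
    return 1.0 - avg_top10_score / max_score

def popularity(teams_played, active_ctftime_teams):
    """Fraction of active CTFtime teams that played this event."""
    return teams_played / active_ctftime_teams

print(difficulty(330, 1000))   # top teams cleared a third of the board
print(popularity(50, 200))
```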
We added voting for event weight. At first we'll just use it as a reference, but soon the weight for most events will be calculated automatically. Algo suggestions appreciated.
Current ideas:
A lot of people say that rating weight is non-transparent at the moment. What if we allow last year's top 10 (20?) teams to decide event ratings?