ctftime / ctftime.org

meta repository for ctftime.org issues, feature-requests etc

2017 Rating #40

Closed: kyprizel closed this issue 7 years ago

kyprizel commented 7 years ago

Hi. Now that people are saying the voting-based rating schema doesn't work, let's discuss how we can fix it.

Should we use an ELO-like system? Something else?

kyprizel commented 7 years ago

@v0s, you had some ideas.

v0s commented 7 years ago

Well, I loved the Glicko2 experiment https://kitctf.de/glicko2/. I'm quite puzzled that it depends so heavily on the parameters, though... but perhaps that could be worked out by trial.

So:

  1. Foundation is Glicko2.

  2. Degrade rating with time. Say, every day every team's rating drops by 0.2%. That prevents a team from outpwning everyone numerous times, gaining 1st place, and then not playing and staying 1st forever. −0.2%/day gives a ~50% decrease if a team doesn't play any CTF for a year straight.

  3. Button to play an event unrated. Sometimes there are shitty events, or just events where everyone on the team is busy and there are a couple of guys playing for fun. While it's useful to have the results listed on CTFtime, not giving full effort to these events can ruin a team's Elo rating. Teams should be able to opt out of counting a specific event in the rating. The [ 💩 Shitty CTF ] button should be accessible long enough that a team can evaluate the challenges, but not so long that a team can drop out just because today's opponents are pwning them. The first quarter of the event runtime (with a minimum of the first 3 hours) looks reasonable. (A rough sketch of both mechanics follows below.)
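
To make these two mechanics concrete, here is a minimal Python sketch under the assumptions above (0.2% compound daily decay; an opt-out window of a quarter of the runtime, with a 3-hour minimum). All names are illustrative, not an actual CTFtime API:

```python
from datetime import datetime, timedelta

DAILY_DECAY = 0.002  # -0.2% per idle day, as proposed above

def decayed_rating(rating: float, idle_days: int) -> float:
    """Compound daily decay applied while a team does not play."""
    return rating * (1.0 - DAILY_DECAY) ** idle_days

def optout_deadline(start: datetime, finish: datetime) -> datetime:
    """Latest moment a team may still mark the event as unrated:
    the first quarter of the runtime, but at least the first 3 hours."""
    return start + max((finish - start) / 4, timedelta(hours=3))

# Sanity check: 0.998**365 ~= 0.48, i.e. roughly the 50% yearly drop mentioned above.
print(decayed_rating(1000.0, 365))  # ~481.7
```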

MathisHammel commented 7 years ago

I definitely don't like the idea of an Elo rating where we can lose ranks if we perform badly. I'm on an academic team and it's hard to determine in advance how many players will show up and how much effort they will put in. Sometimes we start very slow, but teammates who finish homework early join us and manage to put us in the top 50. Of course, it would not be impossible to anticipate how much time every team member can commit to a CTF so we can decide whether to play or not (or 💩 as @v0s suggested), but it would take a huge planning effort for every single CTF. It would also keep us from entering some CTFs, and make us miss some great ones.

What I'd imagine is a rating system where, however bad a team's score is for a CTF, playing still gains them more points than not playing. The downside (which I feel is not that much of a downside) is that it would promote weak teams that play often over strong teams that rarely play.

Pharisaeus commented 7 years ago

As the Glicko2 experiment showed, it's not a good idea in my opinion, mostly because it basically discourages playing, for the risk of falling in the ranking. It might also promote playing some shady CTFs no one has heard of, which appear on ctftime only after the event has already finished.

Even the option to mark an event as not counted is not a good idea, because either it's available after the event and can be abused, or it's available only before / shortly after the start, when it's often difficult to decide whether we're going to play seriously or not. We have a small team and for most events there are ~4 people playing. Apart from a few large events where we try to do our best, we usually can't tell before the event how seriously we will play.

If anything, I would consider scoring CTFs based on the number of top teams participating. In general this works: if a CTF is good, strong teams play it, and if it's some noname indian-style CTF then a lot of teams don't bother with it. It also removes the option for top teams to abuse the scoring, because the only way to lower an event's score is not to play, and then you lose points by not playing at all. It would be nice if someone compiled statistics on last year's events to compare each event's "score" with the percentage of, say, the top 20 or top 30 teams that played it.
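
For illustration only, the simplest version of this idea would scale an event's weight linearly with the share of the current top-N teams that played it; the 25-point scale and the toy team names below are placeholders:

```python
def event_weight(participants: set, top_teams: list, max_weight: float = 25.0) -> float:
    """Weight an event by the share of top-ranked teams that played it."""
    played = sum(1 for team in top_teams if team in participants)
    return max_weight * played / len(top_teams)

# e.g. 2 of a (toy) top-3 list played -> weight ~16.7
print(event_weight({"teamA", "teamC", "newcomers"}, ["teamA", "teamB", "teamC"]))
```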

msm-code commented 7 years ago

Let me start by stating that I think the most important "mission" of CTFtime.org should be encouraging people to play CTFs, especially people new to security.

Things like points, ratings, rankings, comments, votes, etc. are secondary. Or rather, they are just a means to an end (encouraging people to play CTFs), not a goal in themselves.

And in my opinion, ELO and similar systems may discourage some people (negative feedback when losing, fear of performing badly, pressure, tension within the team, etc.).

On the other hand, giving some points to every team creates positive feedback, and does just what it should: encourage everyone to play and learn as much as they can! Yes, it's not perfect, it's not 100% fair when it comes to choosing the "best" teams, but it works and it's the best we've had.

waywardsun commented 7 years ago

It seems to me that there is no system that cannot be gamed in some way. So, instead of a completely different system, we could improve the current one.

I think it would be good to do the following:

1) Cap the percentage by which a CTF's point value can increase from one year to the next. This would also cap the percentage it could decline. (See the sketch after this list.)

2) Base the points earned on the CTF's current point value. All voting would only affect the CTF's score for the following year. This would prevent the winners from upvoting it and the losers from downvoting it; at least that's the hope. If there is no immediate gain for the top N teams, then I think the voting will be based on the actual quality of the event.

3) Brand new CTFs get 0 or 5 pts max. If they stick around, they go up. If they don't, then it doesn't matter.

4) Any CTF with fewer than 20 teams gets a max of 0-10 pts. This would help with the CTFs that come out of nowhere. I think the first year of an already-running CTF that wasn't previously listed should be 0, but the system shouldn't be overly complex.

5) Maybe add the idea of negative points (to a point... maybe -5) for CTFs that are just total rubbish. As it is now, there is really no incentive for a CTF to improve from one year to the next: it will increase in points anyway, because the top teams want more points.
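
A quick sketch of point 1, assuming a ±20% year-over-year cap; waywardsun doesn't name a figure, so 20% is arbitrary:

```python
def capped_weight(last_year: float, voted: float, cap: float = 0.20) -> float:
    """Clamp the voted weight to within +/- cap of last year's weight."""
    return min(max(voted, last_year * (1 - cap)), last_year * (1 + cap))

print(capped_weight(50, 90))  # votes say 90, but the CTF can only rise to 60.0
print(capped_weight(50, 10))  # ...and can only fall to 40.0
```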

The ideal system, to me, is one that makes it clear which CTFs are great and which are not, and makes it clear that if you run a bad CTF, its standing will drop.

That way, the "strong teams" that don't want to play every CTF can skip the low-point CTFs.

pedroysb commented 7 years ago

Pharisaeus's idea makes total sense to me.

msm-code commented 7 years ago

Yeah, after reading Pharisaeus's post, I agree that this idea makes sense (maybe with some tweaks).

nazywam commented 7 years ago

I'd say it looks pretty good. Here's a list of last year's CTFs sorted by the number of top-10 teams that played each event.

http://pastebin.com/raw/qW9CiWnq

Almost all events with 7-10 top-10 teams are known and well respected, although a slight modification might be needed for offline events.

kyprizel commented 7 years ago

What if there is some good CTF (well organized, with good tasks and top teams participating) running for the first time? Will it still have 10 pts? Or should it have a manual "weight"? Right now it can get 15-25 pts at most via voting.

immerse commented 7 years ago

@nazywam I'd be curious to see that list with the top 30 teams instead of the top 10. Something to notice from that list: hack.lu 2016 had all top-10 teams playing, but it was much lower quality than in previous years; the list does not capture this fact. What condition are you using for a team to have "played"? Is it just that they signed up? Maybe the list would be more telling if we only counted a team as having "played" if it reached a certain rank or number of points.

I agree that Pharisaeus's suggestion makes a lot of sense.

Brand new CTFs get 0 or 5 pts max. If they stick around, they go up. If they don't, then it doesn't matter.

I don't think this makes sense. Some CTFs are recent but surprisingly good (e.g. 0ctf) while others are old but have dropped in quality (hack.lu, codegate). It would also be more encouraging for organizers if the initial score were somehow based on quality. (The tricky part is how to assign the initial weight.)

v0s commented 7 years ago

@Pharisaeus interesting idea to calculate event weight based on the number of top teams. How do we prevent it from turning into an "oligarchy"? It looks like it favors top teams (they play a CTF → it becomes high-weight → they gain more points for it and become even more "top").

ngg commented 7 years ago

We could steal some ideas from ELO-like systems such as the one TopCoder uses. It would really be demotivating for your score to decrease just because you tried to play a CTF without your whole team, or when you didn't have a lot of time. But scores should increase based on which teams played the CTF: if you win a CTF against the best teams, your score should increase by a lot.

v0s commented 7 years ago

I won't say it publicly, but I personally don't like that the CTFtime top currently means "who plays more individual CTFs" rather than "which team will typically win a contest". Just compare the year 2016 for dcua (1st place with 1625) and Dragon Sector (the runner-up with 1435, 10% less).

Oops, I said it publicly.

Still, I agree about stimulating new teams to learn (playing as many CTFs as they can). Maybe we can work out a way that satisfies both needs, objectivity and motivation?

Pharisaeus commented 7 years ago

@ngg this is exactly why I don't like ELO -> teams will consider not playing at all if they are not at full strength. ELO works on the assumption that every time you play, you do your best, and this is simply not true for CTFs, especially now that there is more than one CTF per week. If you look at the top teams you will see that each of them has multiple CTFs finished far behind what you would expect from them, and I'm pretty sure that in such cases there were just very few people available to play.

The idea I proposed earlier might not be a silver bullet (it would maybe have to be modified somehow for on-site finals, considering there are generally very few teams there at all), but it does have a few strong points:

There is a risk here that a top team will not play a new CTF, so we might end up with a good event with a small score. But it's already like this anyway, since currently new CTFs can't get high scores either! Also, maybe event creators should provide something like a "target score" they aspire to? This way teams would know at least what they can hope for.

@v0s I doubt there is any risk of oligarchy for a few reasons:

niklasb commented 7 years ago

I wrote https://kitctf.de/glicko2/, and I tried standard ELO and TopCoder ELO before Glicko-2. To be honest, I think none of these really work for CTFs, for the reasons stated by @Pharisaeus and also simply because there are not enough data points to get a stable rating. Just try tweaking the initial values slightly and watch how drastically the scoreboard changes. Also note that the implementation for 2016 only considers the top 20 for each event, so as not to brutally punish teams that didn't play any given CTF with full effort. This couldn't be done for 2017 because it would lead to some very weird dynamics.

I also think that an "increase-only" ELO like @ngg suggests is not really possible, but I would be happy to be proven wrong by somebody with more experience in statistics. If there is a good way to implement it that I don't know about, I think that would be a nice solution.

bl4de commented 7 years ago

Why can't all CTFs have exactly the same number of points available?

My idea is as follows:

Let's assume there is a 1000-point limit/pool/whatever per event. Teams gain points toward the global CTFtime ranking in proportion to their score in that particular CTF:

The best classic CTFs attract a lot of teams, so even if a team finishes 100th or 200th, it still gets good points. And for the winner, 1000 pts is still well deserved, exactly as it works right now. It would encourage the best teams to play even the less advanced CTFs (because they would have to play to earn points and eventually place among the top teams of the season), and it would work exactly the same way for less experienced teams. Every team and every CTF would be equal.

On the other hand, a weak CTF with only a couple of teams playing won't give you good points, because fewer teams means fewer points, even for a place in the top 10. The winner still gets 1000 pts, but hey, if a team wins a CTF, even a weak one, they are still the winner, right? :) (See the sketch below.)
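
A minimal reading of this proposal as code; the linear share of the winner's score is my assumption, since "proportional" is not spelled out further:

```python
def ranking_points(team_score: float, winner_score: float, pool: float = 1000.0) -> float:
    """Every event pays out of the same fixed pool, pro rata to the winner's score."""
    return pool * team_score / winner_score

print(ranking_points(4200, 4200))  # the winner always gets 1000.0
print(ranking_points(1050, 4200))  # a mid-field team gets 250.0
```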

And the number of teams playing would be a natural way for ANY CTF to be recognized as a good one. For CTF newcomers the first year will be the hardest, but from my own experience, it only takes the first couple of challenges to know whether a CTF is hard, weak, interesting, boring, totally screwed up, or something really exciting that forces you to learn a lot.

Finally, there would be only one way to be the All-Season Number One: playing and winning, winning as many CTFs as possible. Is there a better way to encourage people to play CTFs? :)

It also prevents the situation where a really good team wins the 4-5 CTFs with the highest weight and ends up in the top 10 without playing any other CTFs. Those teams are still the best, no doubt, but it's a rather weird situation IMHO.

Last thing: there are still very good CTF finals played only on-site. This naturally prevents the situation where average or weak teams get a high ranking position only by winning mostly easy CTFs, because there's no way for such teams to get into, e.g., the DefCon or insomni'hack finals. Maybe it's worth considering higher point limits in such cases, because if a CTF is played as an on-site final, it's likely one of the best. So we'd still keep some CTFs as "elite".

Pharisaeus commented 7 years ago

@bl4de sorry, but this makes absolutely no sense :) It would basically make things even worse, promoting not only teams who play a lot but also teams who play a lot of bad CTFs. The problems we're trying to solve are:

What you propose solves none of those, and in fact it makes the first point much worse than it is now. The whole point is not to score "winning as many CTFs as possible"!

You also want to prevent teams that play rarely but win hard competitions from being at the top, which is a bit weird, because I believe that is what the ranking is actually for: not to show who plays a lot, but to show who is really strong.

Your last point is not really valid either, since there are only a handful of on-site CTFs, significantly fewer than some shit-tier competitions, and in your scheme winning some 3-hour noname CTF would be equal to winning DefCON, and those teams would be tied in the ranking. How does that make any sense at all?

Coming back to my initial idea, there is one problem I can see. Often we can't be sure before the event whether it's going to be good or not, so it might happen that all the top teams register and send some initial flags, and only then does it turn out that the CTF is no good at all. This would still count as "many top teams played", so I guess there would have to be some threshold, as proposed by @immerse, for considering a team to have actually played (such as finishing with more than 10% of the top-1 score? or finishing in the top 10%?).
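
Those thresholds could look like this; a registered team would count as playing only if either condition holds (the 10% figures are the ones floated above, not settled values):

```python
def actually_played(score: float, rank: int, winner_score: float, num_teams: int) -> bool:
    """Count a registered team as 'playing' only above a minimal result."""
    scored_enough = score > 0.10 * winner_score      # more than 10% of the top-1 score
    ranked_enough = rank <= max(1, num_teams // 10)  # or finished in the top 10%
    return scored_enough or ranked_enough
```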

bl4de commented 7 years ago

@Pharisaeus That's a lot of words for just 'WTF are you talking about?' :) You're very kind, thank you :)

pedromigueladao commented 7 years ago

Hi all,

First of all, thanks for putting this up for discussion. I run an academic team and we have been discussing this among ourselves for the past year. We were in fact giving up on the CTF ranking, as it was going crazy with 1+ CTFs/week, all rated very high. It was impossible to keep up with this.

Now about rating:

Proposal:

  1. name our Grand Slam CTFs - 70 points (all DefCon Qualifiers?!?)

  2. name our CTF-20 and CTF-5 competitions

  3. your rating would be your Grand Slam points + your N best CTF-20 competitions + your M best CTF-5 competitions

  4. Previous year's top-50 teams would grade NEW CTFs as CTF-20 or CTF-5.

  5. After each CTF, all participants would vote on a promotion to CTF-20 or a demotion to CTF-5, applicable the following year.

  6. From my experience, I would consider 8 Grand Slams, N=5 and M=5 appropriate. That is about 2 per month over a school year.

Advantages:

MathisHammel commented 7 years ago

@pedromigueladao pretty good system, but it would discourage us from playing some CTFs when we know we're not going to beat our n-th best position.

A similar system (without a hard cut at a fixed number of CTFs, but with a smoothly decreasing curve instead) is the one used by Kattis. There is more info about the ranking (which would need to be adapted) at https://open.kattis.com/help/ranklist. The "combined scores" formula has two nice properties:

- "Adding a new user [CTF ranking for us] with a non-zero score to a group [CTF team] always increases the combined score"
- "About 90% of the score is contributed by the 10 highest scores" (the parameter is adjustable)
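
For illustration, a Kattis-style combined score weights the sorted scores geometrically; choosing the ratio r = 0.1^(1/10) ≈ 0.794 reproduces the "top 10 scores contribute about 90%" property quoted above. Kattis's exact formula may differ; this sketch only shows the shape:

```python
R = 0.1 ** (1 / 10)  # chosen so the 10 best scores carry ~90% of the total

def combined_score(scores) -> float:
    """Geometric-decay sum over the scores sorted in decreasing order."""
    return sum(R ** i * s for i, s in enumerate(sorted(scores, reverse=True)))

# Adding any new non-zero score can only increase the total:
print(combined_score([100, 80, 60]))     # ~201.4
print(combined_score([100, 80, 60, 1]))  # strictly larger (~201.9)
```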

Pharisaeus commented 7 years ago

@pedromigueladao I thought about a similar idea of splitting CTFs into "tiers" and scoring them based on the tier they are in, but there are issues here as well:

  1. There might be no consensus on splitting CTFs into tiers, because you can't decide before the event (you never know if it's going to be good) and afterwards your choice might be influenced by the results. This was already present last year, when some teams (I won't point fingers...) clearly stated that they were voting for their own immediate benefit. So if team X won or finished high in a certain event, they will try to promote it to a higher tier. You can see this even now in the voting for some of last week's events: there were two very poor CTFs, but they are still getting upvoted by teams that finished high. Conversely, if a team has already "capped" a certain tier, they might push to move an event to a different (maybe lower) tier just so they can accumulate points from that one too. So again, there is a lot of potential for gaming the system here.
  2. The idea of counting only the N or M best events of a certain tier might discourage people from playing, or at least not encourage them to play. Right now every event you participate in gives a reward; if we count only a certain number of events of each kind, people might decide it's not worth playing when there is no benefit.
  3. The constants of 8, 5, 5 seem a bit low considering there were >70 CTFs last year. This might result in numerous ties.

you cannot increase your ranking a lot by doing well in a lot of average competitions

This seems a bit like a disadvantage, because it does not encourage participation.

pedromigueladao commented 7 years ago

@Pharisaeus of course you can game any system we come up with.

How to score teams so that the ranking shows "who is good" and not "who plays a lot"

if you do not put a threshold, you will end up confusing the two

How to score a CTF so it's easy to see which CTF is good or hard and which is bad or easy.

I do not see how to do it before-the-fact.

pedromigueladao commented 7 years ago

@kara71 you are not trying to beat your n-th best position. You are trying to beat your n-th best rating score (which would be computed as it is now: rating * your_score / winner_score).

MathisHammel commented 7 years ago

@pedromigueladao my bad, I read that wrong. It seems inevitable to have a weight vote, since a CTF's quality can only be judged by humans (until AIs take over the world). What I think would be a good system is weight voting similar to what's currently in place, but without people voting extreme values (I admit I often tend to give the max rating). We could give every team an accuracy score according to how close their past votes were to the final ratings; an event's rating would then be the average of everyone's vote, weighted by their accuracy score. We should be careful, however, to balance the influence of a very accurate team vs. a low-accuracy team if we don't want the rating to be decided by a handful of teams (as currently happens with CS:GO's Overwatch system). I can propose some formulas for that system if you're interested. (A rough sketch follows below.)
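
One possible shape for those formulas, purely as a sketch: each team carries an accuracy score earned on past events, an event's rating is the accuracy-weighted average of the votes, and every voter's accuracy is then nudged toward how close its vote was. The closeness and update rules here are my own placeholders:

```python
def rate_event(votes: dict, accuracy: dict) -> float:
    """votes: team -> proposed weight; accuracy: team -> influence from past votes."""
    total_w = sum(accuracy.get(t, 1.0) for t in votes)
    rating = sum(accuracy.get(t, 1.0) * v for t, v in votes.items()) / total_w
    for t, v in votes.items():
        closeness = 1.0 / (1.0 + abs(v - rating))  # closer vote -> higher future influence
        accuracy[t] = 0.8 * accuracy.get(t, 1.0) + 0.2 * closeness
    return rating

votes = {"teamA": 25, "teamB": 20, "teamC": 70}  # teamC votes an extreme value
acc = {}
print(rate_event(votes, acc))        # ~38.3 while all teams still have equal weight
print(acc["teamC"] < acc["teamB"])   # True: the extreme voter loses future influence
```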

bl4de commented 7 years ago

I think @pedromigueladao's idea of "tiers" is something that should be considered as the foundation for a new scoring system. Maybe I'm wrong, but I think we'd all agree that some CTFs are natural "Grand Slams" and are "the core" of every season.

Maybe this is something you should start from:

https://ctftime.org/ctfs

Consider the top 20 CTFs from this list "Grand Slams" and leave their scores as they are right now, rounded to the nearest integer. Other CTFs with at least 2 years of history go into "Regulars", with their actual scores rounded as well. Newcomers start with 5 points.

After each CTF, every participating team can vote (note: a TEAM, not every player as it is now; the team is responsible for choosing a voting member, I suppose it will be the captain) up or down. Every vote is worth 5 points (up equals +5, down -5), based on the whole team's experience during the CTF (a variant of @kara71's idea of excluding extreme values).

Pros:

Cons:

pedromigueladao commented 7 years ago

@bl4de but would you consider capping the number of CTFs per tier that can count toward your ranking?

Pharisaeus commented 7 years ago

@pedromigueladao

I agree that you can upvote/downvote according to your performance and strategy, but notice that item 5 applies only to next year's competition. We only have this issue for new CTFs

Then it's broken from the start ;) The fact that a CTF was good this year does not mean it's going to be good next year. The same goes for bad ones; we've seen this already. Some CTFs improve, others deteriorate, for various reasons. Often the people who organize change (university teams change a lot over time, etc.). All ideas based on "voting for next year" will suffer from the same issue.

We could even define a threshold. Say, if 60% consider it was a CTF-20, then most probably it was. It is fair unless 30+ teams are playing the system....

This could work if many teams were interested in voting, which is often not the case. And I think you would be surprised how many teams try to play the system ;)

but wouldn't it be easier to just play a CTF-5 competition and win it (if you did well in a CTF-20 you might as well do it)

Not really, because the people who play are usually the same. Some teams skip easy CTFs, but apart from that, a lot of teams play most of the competitions. And between a CTF-5 and a CTF-20 there is little difference in the participant list, so it's just as hard to win one as the other.

@bl4de the downside of what you propose is that it would take many years for a CTF to climb any higher, even if its quality is high from the start.

immerse commented 7 years ago

@bl4de the downside of what you propose is that it would take many years for a CTF to climb any higher, even if its quality is high from the start.

Exactly. People have brought up that the ranking should motivate teams to play (and thus learn). But to me it seems just as important to motivate skilled organizers to host quality competitions -- we have far too few of these as it is!

pedromigueladao commented 7 years ago

@Pharisaeus I guess we have to assume that in a group of hackers, everyone will try to play the system. Let's just try to make it more difficult, or simply not worth it.

And yes, a next-year policy is like an investment: no one knows what the future will be. But notice that you are deciding between CTF-5 and CTF-20. If you do not delay the decision to a later stage, I guess you will always have an incentive to vote according to your current result.

@bl4de and @Pharisaeus it is also always unfair to newcomers. E.g., we played Google CTF and HackTheVote last year, which in my opinion were very good competitions. These past two weeks we played 2 so-so CTFs, and most probably they'll be worth the same as the previous two...

pedromigueladao commented 7 years ago

@immerse fully agree. That is why I proposed two possible entry levels, CTF-5 and CTF-20. Combined with the capped values for each tier, I would like to believe that the top-50 teams would be able to score a newcomer fairly.

MathisHammel commented 7 years ago

I agree with @pedromigueladao, Google CTF + HTV were exceptional events. They clearly deserve more points than some regulars that have recently dropped in quality. To me, having players vote on each CTF's weight is crucial. The current system does not seem that bad, but it may need some tweaks.

bl4de commented 7 years ago

@pedromigueladao @Pharisaeus @kara71 @immerse

@bl4de the downside of what you propose is that it would take many years for a CTF to climb any higher, even if its quality is high from the start.

@bl4de and @Pharisaeus it is also always unfair to newcomers. E.g., we played Google CTF and HackTheVote last year, which in my opinion were very good competitions. These past two weeks we played 2 so-so CTFs, and most probably they'll be worth the same as the previous two...

Exactly. People have brought up that the ranking should motivate teams to play (and thus learn). But to me it seems just as important to motivate skilled organizers to host quality competitions -- we have far too few of these as it is!

That's why I proposed an additional discussion, in special cases, to give a newcomer more than 5 points. I think you missed this in my proposal :)

If there can be a discussion about the scoring system in general, why can't there be one for certain cases as well, right?

immerse commented 7 years ago

@pedromigueladao You are assuming that the top teams will actually consistently vote. I don't think that has been the case so far.

bl4de commented 7 years ago

@pedromigueladao

@bl4de but would you consider capping the number of CTFs/tier you could use for your ranking?

No, not at all. 20 Grand Slams (as proposed), the 1st year goes to newcomers, and anything in between is considered a Regular.

pedromigueladao commented 7 years ago

@bl4de I agree that newcomers might be scored more than 5. In fact, I proposed 5 or 20. The issue is who decides this. Observing the current status, the score is 20-25 for every single CTF, no matter how good or bad it was.

@immerse I am not assuming anything. I just believe that for "an agenda" to exist, it has to be the majority's agenda. Would it be fair to say that if 60% of the top teams voted good, then most probably it was good? I would believe so.

@bl4de (again) well, then you will still have this crazy system where you can put up interesting numbers by brute-forcing it.

MathisHammel commented 7 years ago

The discrete ranking system feels like it has too many gaps. If there are two almost equivalent CTFs at the same time but one is tier 5 and the other is tier 20, most teams will try hard to get a good placement in the tier-20 one. Otherwise, adding more tiers (to keep the scoring discrete) would bring more smoothness. Also, reviewing every newcomer to decide whether it's a tier 5 or a tier 20 is similar to a vote, so why drop the idea of a vote?

bl4de commented 7 years ago

@pedromigueladao

@bl4de (again) well, then you will still have this crazy system where you can put up interesting numbers by brute-forcing it.

Not exactly, because gaining more than 5 points for an outstanding newcomer would require a discussion like this one and some kind of agreement based on facts.

bl4de commented 7 years ago

@pedromigueladao

The issue is who decides this

I think this is an issue in general :)

We're discussing a scoring system here for something that involves thousands of people all over the world.

And how many of us are here so far? Ten? Including people like me :D, ranked somewhere outside the top 500 for the last two years?

I don't see anyone here from the top teams, the organizers, and so on. So if they are not interested, I don't know who should be, or who should decide.

//EDIT

This repository is followed by 10 (!!!) people and starred by 16.

Last year's ranking contains 12658 registered teams. The top 100 teams alone have about 1000 regular players.

Something is wrong here :)

immerse commented 7 years ago

@pedromigueladao

Would it be fair to say that if 60% of top teams voted good, then most probably it was good?

Often there might be a total of just, say, 3 or 4 votes from the top teams, not the 10-30 votes you imagine. Statistically, that's just not enough to go by. Besides, people don't vote fairly. That's why I don't think voting is the solution.

Isn't the premise of having this discussion at all that voting isn't working? But you are proposing a voting-based system, or did I misunderstand?

pedromigueladao commented 7 years ago

@kara71 placing newcomers in their correct tier would be done by voting. Currently, 90% are placed in the CTF-20 tier.

@bl4de sure, this year it will be fine: you get at most 5 points for so-so newcomers. Three years from now those competitions might be worth 15 points, and we'll have this discussion again.

@immerse yes and no. I propose having CTFs ranked in tiers before the event (except new ones). You know how many points you are playing for (with each tier then capped at X events).

I only proposed voting for

  1. newcomers (restricted to the top-50 teams from the previous year's ranking)
  2. promotion/demotion for next year's event. I believe this strips away some of the passionate votes we see in the current system.

MathisHammel commented 7 years ago

Top 50 teams is too few to get enough votes imo. Otherwise, we could keep the 5-20-Slam tiers and add a voting system that can only move a CTF upwards. For example, an "outstanding CTF" button would allow teams to give kudos to great CTFs and increase their score. I'm not sure that system would solve the problem of all CTFs landing in the 20± ballpark if some teams spam kudos at every CTF, but we could also limit the number of kudos (or weight a team's kudos by how many it has recently given) to limit the effect of spam and really promote great CTFs. This way we would have a [5-15] range for new CTFs. Maybe the way to get into the 20 tier would be to reach max kudos (score = 15), which would grant access to tier 20 for a few years. Tier 20s could also receive kudos (maybe in another form, to separate them from beginner CTFs). Grand Slams would be decided in advance, with no bonus for them.

immerse commented 7 years ago

@kyprizel Did you consider putting a link to this discussion on the front page of ctftime?

MathisHammel commented 7 years ago

I just realized how ugly my big text blocks look when seen on a computer. Sorry!

bl4de commented 7 years ago

@immerse

I just wanted to add here one thing.

I really appreciate @kyprizel's work on the ctftime.org website, and I see a total lack of interest in helping him with it. Forgive me @kyprizel, but right now, from a UI/UX point of view, ctftime.org is crap.

I've opened an issue in May 2016 about some improvement: https://github.com/ctftime/ctftime.org/issues/39

There's no comment there. Not even one. And no doubt the event filtering on ctftime.org needs better usability.

There are so many people playing CTFs every single weekend; no doubt some of them are great web developers. The CTFtime ranking is used as a reference by several technology-related media outlets when they post about CTFs, "hacking", IT security contests and so on. Winning teams are famous, great events are famous, so why couldn't ctftime.org be as well?

I don't see any reason why the community can't be more involved in all of this. CTFtime.org and @kyprizel (and anyone else involved now whom I don't know about) deserve it.

And maybe this is what we should start from?

@kyprizel I think it would be a good idea to add some kind of "News" section to ctftime.org for topics like this one.

@kara71 You opened #41 with your improvement proposal while I was typing this! Are you reading my mind? :D

pedromigueladao commented 7 years ago

@kara71 indeed, it can be more than 50 teams voting. That number just came to mind, but it can be a different one.

kudos-suggestion: a way to limit spamming is to give each team X kudos to spend over a 12-month period. That way you might be conservative about giving your kudos away.

capped-suggestion: one could also adjust my capped suggestion so that the top N results in each tier count 100% toward the ranking, and the remaining ones decrease according to some formula we come up with. Say the top 5 CTF-20 results count 100% toward the ranking, the 6th 90%, the 7th 80%, and so on... This way, every CTF is worth something. This could be applied to the CTF-20 and CTF-5 tiers.
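
A sketch of that adjusted capped suggestion: within a tier, the top N results count in full and each later result loses another 10%, floored at zero (N=5 and the 10% step are taken from the example above):

```python
def tier_points(scores, full_count: int = 5, step: float = 0.10) -> float:
    """Best results count 100%; the (N+1)-th counts 90%, the next 80%, etc."""
    total = 0.0
    for i, s in enumerate(sorted(scores, reverse=True)):
        factor = 1.0 if i < full_count else max(0.0, 1.0 - step * (i - full_count + 1))
        total += factor * s
    return total

print(tier_points([20, 18, 15, 12, 10, 9, 7]))  # 6th result at 90%, 7th at 80% -> 88.7
```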

MathisHammel commented 7 years ago

@pedromigueladao I thought about limiting the number of kudos per year, but it means that if we have 12 good CTFs that receive kudos from many teams and then an outstanding 13th CTF comes along, nobody has kudos left to spend.

Also, there seem to be a lot of proposed solutions where a CTF's rating can change even after the main voting has ended (such as using the team accuracy score or the kudos average). I'm not sure whether this is a good thing; we should settle on that. One downside is that some teams could tweak what's under their control (such as kudos) to tune CTF scores and gain a few more points at the end of the year. This is currently mostly suppressed by the weight being frozen after a week.

Now that the problem seems to have been pretty much figured out, we should probably start summarizing this thread, which is getting pretty big: maybe list the issues and proposed solutions (with their + and -)?

MathisHammel commented 7 years ago

Solution summary proposal:

CTF weights

101. There are 3 tiers of events: CTF5, CTF20, and Majors.
102. All new CTFs are added to the CTF5 tier.
103. Older CTFs are CTF5 or CTF20, depending on their current rating.
104. Big CTFs (DEFCON, ...) are Majors, added according to a predetermined list.
105. There are about 10 Majors, each worth 70 points.
106. Users have a way to give a bonus to outstanding CTFs. Each vote counts as 1 kudos until 12 votes, then the value starts decreasing as 12/nKudos (sketched below).
107. Every vote is worth ??? points (needs adjustment), which means that CTF5s are worth 5 or more, CTF20s 20+, Majors 70+. No negative votes.
108. When a CTF5 reaches 15 points, it is flagged as a 'Great CTF'. Its value goes to 20 pts and freezes. The following year, the CTF will be a CTF20.
109. The value of a CTF is frozen after 2 weeks to prevent tweak cheating (see my last comment).
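
Item 106 as code, under my reading that a team's first 12 kudos count in full and the n-th kudos after that is worth 12/n (the counting window isn't specified above):

```python
def kudos_value(kudos_already_given: int) -> float:
    """Value of a team's next kudos, given how many it has already cast."""
    n = kudos_already_given + 1
    return 1.0 if n <= 12 else 12.0 / n

print(kudos_value(5))   # 6th kudos  -> 1.0
print(kudos_value(23))  # 24th kudos -> 0.5
```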

Team points per CTF

  1. Same as before (?)

Global ranking points

  1. Instead of summing all CTF points, we introduce a new formula that strikes a balance between playing a lot and playing well. A team cannot lose points by playing a CTF.

301bis. The formula is: totalScore = sum(pow(1 - 1/p, i) * score[i]), where score[] is the list of the team's CTF scores sorted in decreasing order and p is an adjustable parameter. (See the sketch below.)
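
And the 301bis formula written out; p = 10 is used here only for the demo (see the follow-up comment for the suggested 4-20 range):

```python
def total_score(ctf_scores, p: float = 10) -> float:
    """totalScore = sum((1 - 1/p)**i * score[i]) over scores sorted decreasingly."""
    q = 1.0 - 1.0 / p
    return sum(q ** i * s for i, s in enumerate(sorted(ctf_scores, reverse=True)))

print(total_score([70, 20, 20, 5]))  # ~107.8: strong results dominate
print(total_score([5] * 40))         # ~49.3: many small CTFs cap out near p * 5
```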

If you want to discuss this, feel free to reference the corresponding number and we can make adjustments. Or, if you feel my proposal is mostly stupid, make a similar one with your own ideas!

MathisHammel commented 7 years ago

I had to make a few choices in the previous list (and I certainly missed some of your opinions too). Alternative options are:

  1. Make a vote, or make a small committee decide

  2. Not sure about the figures there

  3. I feel like 1 kudos per month is balanced, not sure in practice

  4. Let the score keep growing past 15 points instead of moving it to 20 and freezing? (The CTF20 promotion would be kept ofc)

  5. This would make early CTFs receive more points, as spammers would still have their votes count as 1 instead of 12/nKudos. But I felt it would not be very significant and would stabilize quickly.

  6. Maybe not the most appropriate way to count individual points. We might need to run simulations on previous years' data to see if it still fits.

301bis. Forgot to mention that p is an adjustable parameter; values from about 4 to 20 seem appropriate, but we'll probably need some simulations too.

pedromigueladao commented 7 years ago

@kara71 thanks for putting this together. I generally agree with it, although I don't clearly understand 106 and 107. I'm not sure we won't end up with this over-inflated rating again. Knowing upfront what you are playing for is a nice property to have; this after-the-fact change should be kept to a minimum imho. Excluding of course new CTFs, which can be given special treatment.

In this sense, I am not sure how to limit kudos, as it might be simple to arbitrarily pump a CTF up to 15 (which is 3 times the original score). I can conceive of a scenario where you register different teams every competition just to give it kudos. Probably limiting who can give kudos would be a solution. Say, whoever got more than some percentage of the winner's score.

101+. If needed, consider another bracket, say Masters, CTF-40, CTF-20, and CTF-5, to include well-established and respected CTFs that are not top-level (limiting the Masters to the outstanding ones, like Wimbledon et al.).