code-423n4 / org

Code4rena Governance and Discussion

Duplicates labeled with partials (partial-25/50/75) decrease the duplicates' weight, but also reduce the primary finding's weight in the overall award calculation #145

Open dontonka opened 10 months ago

dontonka commented 10 months ago

Recently, after having performed well in the Zetachain contest, I was disappointed with my awards. I had made the calculation rather quickly at the end of PostQA and was hoping to get around 10k, but got only 7k. That got me thinking: how is this possible? So I did the math again, accurately this time, in the following spreadsheet --> zetachain calculation sheet, and confirmed that there was a problem, or at least a misunderstanding on my part that needed to be clarified, as my math followed the documentation.

After a long discussion with C4 staff in a private thread, we identified the source of the misunderstanding. C4 updated their documentation very recently, as follows: awards documentation.

Unfortunately, C4 staff and I didn't come to an agreement/understanding, hence why I'm creating this issue so that it can be discussed with the whole C4 community.

Here is what doesn't make sense to me

We can see here that the logic behind the partial- labels only impacts the awards for partial findings; even though the pies vary, the awards stay the same.

Conclusion:
Only the award amounts for "partial" findings have been reduced, in line with expectations. The aim of this adjustment is to recalibrate the rewards allocated for these specific findings. Meanwhile, the awards for full-credit findings remain unchanged.

Short version

One image is worth 1000 words, so here is what the proper value should be. This is good, as it's exactly the case I had in Zetachain: I had a High with 2 dups which were at partial-25. At a high level this means it is "almost" a unique High: the 2 other duplicates are classified partial-25, so the judge agrees that while they identify the same issue, they find very little, which is why they account for only 25% of a full duplicate. So they should be penalized (and they are at the moment, as expected), BUT the problem is that the primary is not getting those rewards back!! Instead the finding's pie is reduced, so it's like saying this High is worth less than another High with the same number of duplicates (without partials); in fact this High almost becomes a sort of boosted unique Medium, with its pie at 4.86 (a unique Med pie is 3.9). Does everyone understand how this doesn't make any sense, or is it just me?

The finding's pie CANNOT decrease: it's a High, and it is not worth less than another High with the same number of duplicates. By having the pie reduced, the rewards that the primary should have received are simply diluted among all the other wardens.
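For anyone who wants to check these numbers, here is a minimal Python sketch of the awarding rule as the docs describe it (slice = severity * 0.9^(split - 1) / split, a 30% bonus on the selected report, and partial labels scaling a dup's slice by 0.25/0.5/0.75; the function name and structure are mine). It reproduces the 4.86 pie above, the 8.91 pie of the same High without partials, and the 3.9 pie of a unique Medium:

    def slices(severity, credits, report_index=0):
        # credits: one entry per duplicate; 1.0 for full credit,
        # 0.25/0.5/0.75 for partial-25/50/75
        split = len(credits)  # a partial still counts as 1 finding in the split
        base = severity * (0.9 ** (split - 1)) / split
        out = [base * credit for credit in credits]
        out[report_index] *= 1.3  # +30% for the submission selected for report
        return out

    print(sum(slices(10, [1.0, 0.25, 0.25])))  # 4.86  <- my High with 2 partial-25 dups
    print(sum(slices(10, [1.0, 1.0, 1.0])))    # 8.91  <- same High without partials
    print(sum(slices(3, [1.0])))               # 3.9   <- a unique Medium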

Long version

Naaa, the short is enough and crystal clear.

aliX40 commented 10 months ago

I completely agree, and think that Code Arena systems should be designed to encourage the discovery of unique findings and to motivate wardens to produce well-constructed reports. When a warden submits the only valid (with "valid" in this context meaning a 100% rated issue) High or Medium vulnerability along with a high quality report, I believe that, from Code Arena's perspective, such findings should be exceptionally rewarded. Moreover, wardens should be given additional incentives to craft exceptionally thorough reports and to evaluate the full scope of the vulnerability comprehensively.

J4X-98 commented 10 months ago

I totally agree with the author. In my opinion these kinds of issues should maybe stay the same for the report but be split for rewards. I have also just encountered a case where there was a very simple bug due to non-EIP-712 compliance, which was recognized by ~90 people who provided very low-quality reports of it (max Med, more like Low severity). Three others and I were able to leverage that bug to steal funds and make it a High severity. Now all those 90 people got 25% duplicated, making our High pretty much worthless in the process.

This functionality incentivizes people, as soon as they find a simple bug, to write a low-quality report and leave it at that, since someone else will anyway spend the significant time to leverage the bug into a High severity, to which they will get duplicated.

To get back to the original idea, my recommendation would be to leave it as one issue for the report but split it into 2 for rewarding. This can easily be done when there are dup submissions where some are only Med/QA and some are High. Only the Highs would get bundled into one and duplicated (resulting in higher rewards for people who put in significant time to leverage a bug), and all the Meds would get bundled into a Med, not eating up the rewards of the Highs.

This functionality would also remove the meta of just writing low-effort submissions for every small bug without ever describing any valid attack path.

Minh-Trng commented 10 months ago

I totally agree with the author. In my opinion these kinds of issues should maybe stay the same for the report but be split for rewards. I have also just encountered a case where there was a very simple bug due to non-EIP-712 compliance, which was recognized by ~90 people who provided very low-quality reports of it (max Med, more like Low severity). Three others and I were able to leverage that bug to steal funds and make it a High severity. Now all those 90 people got 25% duplicated, making our High pretty much worthless in the process.

This functionality incentivizes people, as soon as they find a simple bug, to write a low-quality report and leave it at that, since someone else will anyway spend the significant time to leverage the bug into a High severity, to which they will get duplicated.

To get back to the original idea, my recommendation would be to leave it as one issue for the report but split it into 2 for rewarding. This can easily be done when there are dup submissions where some are only Med/QA and some are High. Only the Highs would get bundled into one and duplicated (resulting in higher rewards for people who put in significant time to leverage a bug), and all the Meds would get bundled into a Med, not eating up the rewards of the Highs.

This functionality would also remove the meta of just writing low-effort submissions for every small bug without ever describing any valid attack path.

This idea only works for this specific case. Now imagine 100 people put in the effort to report this as a High and one guy is lazy and only reports the most obvious impact as a Med. He would get a solo Med for putting in less work.

I think the way it works now does make sense, even if it indeed doesn't reward putting in effort on simple bugs. If all people find the same root cause of an issue, then each of these reports would have made the sponsor aware of its existence. And that's why the 100% finding only gets the credit of a finding with 3 duplicates: because 3 people found the root of the problem, even if 2 didn't manage to see the full impact.

However, I do see dontonka's point. It could also make sense to redistribute the reduction for partial findings to all duplicates that scored 100%.

0xEVom commented 10 months ago

Instead the finding's pie is reduced, so it's like saying this High is worth less than another High with the same number of duplicates (without partials)

Agree that this doesn't make a lot of sense; intuitively, a high with only partial-credit duplicates should be worth something between a solo high and a high with full-credit duplicates.

I was playing around with the reward formula and came up with a small change that I think could work:

slice = severityWeight * (0.9 ^ (totalCredit - 1)) * (sliceCredit / totalCredit)

Instead of counting the number of findings, we take the sum of the total credit awarded to all findings to calculate the pie. This means that a high with one duplicate is worth the same as one with two duplicates at 50% credit. That strikes me as reasonable.

And instead of dividing by the number of findings, we multiply by the finding's credit and divide by the total credit. This gives us the share of the credit for a given submission.

This is what the scoring would look like for the example above:

pie    sliceCredit  totalCredit  slice
11.38  1.3 (1)      1.5          8.22
11.38  0.25         1.5          1.58
11.38  0.25         1.5          1.58

The pie and first slice end up being quite large, but that seems reasonable considering there are only 1.5 "findings". The effect is less extreme if the other 2 findings have 50% credit:

pie    sliceCredit  totalCredit  slice
10.35  1.3 (1)      2            5.85
10.35  0.5          2            2.25
10.35  0.5          2            2.25

In this case, the first finding gets the same number of shares as if there was only one other duplicate, and the other two split the shares that would have gone to the duplicate.
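Here is a minimal Python sketch of the modified formula (my wording of it; it assumes, as in the tables above, that the 1.3 report bonus is applied to the selected submission's credit but does not count toward totalCredit), reproducing both tables:

    def slices_credit_sum(severity, credits, report_index=0):
        # credits: 1.0 for a full-credit dup, 0.25/0.5/0.75 for partials
        total_credit = sum(credits)    # replaces the finding count
        weighted = list(credits)
        weighted[report_index] *= 1.3  # selected-for-report bonus
        return [severity * (0.9 ** (total_credit - 1)) * (w / total_credit)
                for w in weighted]

    print(slices_credit_sum(10, [1.0, 0.25, 0.25]))  # [8.22, 1.58, 1.58] -> pie 11.38
    print(slices_credit_sum(10, [1.0, 0.5, 0.5]))    # [5.85, 2.25, 2.25] -> pie 10.35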

I couldn't think of any bad behavior this would incentivize, but I may be missing something; it would be great to hear what others think.

Because 3 people found the root of the problem, even if 2 didnt manage to see the full impact.

@Minh-Trng I can see what you're saying but at the same time, the sponsor may have decided the finding is a non-issue, or the findings could have been rejected if it wasn't for the high-quality report that correctly identified the highest impact. So putting in that extra effort is important, and should be incentivized.

GalloDaSballo commented 10 months ago

The suggestion effectively would allow for exploitation:

  • Create ideal submission
  • Create a group of partial submissions

Partial submissions would steal total rewards from other findings and would credit them to the ideal submission.

That's ultimately more gameable than what's available now.

The feedback of disappointment in having a partial award reduce the rewards of your finding is legitimate; the suggested fix offers more downside than the current situation.

0xEVom commented 10 months ago

Could you walk me through that? I wouldn't think it does, since combined partial credit is treated the same as full credit.

I just tried out an example:

pie    sliceCredit  totalCredit  slice
10.35  1.3 (1)      2            5.85
10.35  1            2            4.5

And with partial findings:

pie   sliceCredit  totalCredit  slice
9.56  1.3 (1)      2.5          4.44
9.56  1            2.5          3.42
9.56  0.5          2.5          1.7

Shares to first + partial credit = 6.14
Shares to second = 3.42

I was thinking you were right, but in fact this is also currently exploitable with a normal duplicate, and I think it's just an edge case when findingCount == 2:

pie   weight  split  slice
8.91  1.3     3      3.51
8.91  1       3      2.7
8.91  1       3      2.7

Shares to first + fake dupe = 6.21
Shares to second = 2.7

Otherwise, any number of partial-credit findings amount to their total combined weight (4 duplicates with 25% credit are paid the same as one normal duplicate), so it's not any more (un-)profitable than submitting duplicates.

0xean commented 10 months ago

tldr: We should reward unique information presented to a sponsor.

If we all agree (which I think we do) that a solo unique H has the most sponsor value, then it stands to reason that more duplicates, if they are in fact 100% duplicates, degrade the value of that report.

However, partial findings are typically marked as such because they didn't add the same value (or contain the same unique information) as a true (100%) duplicate and therefore the original H should be worth more to the sponsor because it represented unique information not present in other reports.

I understand concerns about the attack vector it opens up, but I do think there is probably a solution in the middle that might make this more fair.

dontonka commented 10 months ago

tldr: We should reward unique information presented to a sponsor.

If we all agree (which I think we do) that a solo unique H has the most sponsor value, then it stands to reason that more duplicates, if they are in fact 100% duplicates, degrade the value of that report.

However, partial findings are typically marked as such because they didn't add the same value (or contain the same unique information) as a true (100%) duplicate and therefore the original H should be worth more to the sponsor because it represented unique information not present in other reports.

I understand concerns about the attack vector it opens up, but I do think there is probably a solution in the middle that might make this more fair. @0xean The feedback of disappointment in having a partial award reduce the rewards of your finding is legitimate; the suggested fix offers more downside than the current situation @GalloDaSballo

Well said, and exactly rephrasing my thoughts. Here are the properties/requirements the award should aim to achieve:

So my actual solution would be to adjust the math to reflect those requirements. What @0xEVom is proposing seems a bit more fancy, and I would need to give it more thought; I was thinking of a simpler solution which simply transfers the pie lost by the partials to the primary.

Keep in mind that today the current formula already penalizes the primary that has dups (on top of the issue I'm raising), as it reduces the finding pie (pie * (0.9 ^ (split - 1)) / split) as if the dup were a full duplicate, which is already a negative impact for the primary. One could argue that, in the above formula, a partial dup should count for less than 1 in the split; maybe this is where @0xEVom is going.

So to put this into a requirement, there would be this additional point:

This additional requirement would make sense to me, as for the sponsor, it reflects reality.

EDIT:

OK, I did the math to include the additional requirement, and it seems to work perfectly (I modified my Google sheet from my original post).

Finding's pie would be calculated as follows, severity being 10 for High and 3 for Medium:

(0.3 * (severity * (0.9 ^ ((<#-split-partial-100> - 1)*1 + <#-split-partial-75>*0.75 + <#-split-partial-50>*0.5 + <#-split-partial-25>*0.25)) / (<#-split-partial-100>*1 + <#-split-partial-75>*0.75 + <#-split-partial-50>*0.5 + <#-split-partial-25>*0.25))) + (severity * (0.9 ^ ((<#-split-partial-100> - 1)*1 + <#-split-partial-75>*0.75 + <#-split-partial-50>*0.5 + <#-split-partial-25>*0.25)))

Warden's finding pie would be as follows if selected for report:

(0.3 * (severity * (0.9 ^ ((<#-split-partial-100> - 1)*1 + <#-split-partial-75>*0.75 + <#-split-partial-50>*0.5 + <#-split-partial-25>*0.25)) / (<#-split-partial-100>*1 + <#-split-partial-75>*0.75 + <#-split-partial-50>*0.5 + <#-split-partial-25>*0.25))) + (severity * (0.9 ^ ((<#-split-partial-100> - 1)*1 + <#-split-partial-75>*0.75 + <#-split-partial-50>*0.5 + <#-split-partial-25>*0.25)) / (<#-split-partial-100>*1 + <#-split-partial-75>*0.75 + <#-split-partial-50>*0.5 + <#-split-partial-25>*0.25))

Which gives the following formulas in my current sheet for High-191:

Finding's pie: =(0.3*(10*(0.9 ^ ((D18-1)*1+E18*0.75+F18*0.5+G18*0.25)) / (D18*1+E18*0.75+F18*0.5+G18*0.25)))+(10*(0.9 ^ (E18*0.75+F18*0.5+G18*0.25)))

Warden's pie (mine, selected for report): =(0.3*(10*(0.9 ^ ((D18-1)*1+E18*0.75+F18*0.5+G18*0.25)) / (D18*1+E18*0.75+F18*0.5+G18*0.25)))+(10*(0.9 ^ ((D18-1)*1+E18*0.75+F18*0.5+G18*0.25)) / (D18*1+E18*0.75+F18*0.5+G18*0.25))


So this would reflect the 3 requirements indicated. Now let's try to rationalize the numbers:

Finding's pie:

Warden's pie (for the warden selected for report):

Do those make sense? I think so. I think this represents exactly how it should be.

So the formula in the doc would change to the following:

Med Risk Slice: 3 * (0.9 ^ ( ((split_100 - 1)*1) + (split_75*0.75) + (split_50*0.5) + (split_25*0.25) )) / ( (split_100*1) + (split_75*0.75) + (split_50*0.5) + (split_25*0.25) )
High Risk Slice: 10 * (0.9 ^ ( ((split_100 - 1)*1) + (split_75*0.75) + (split_50*0.5) + (split_25*0.25) )) / ( (split_100*1) + (split_75*0.75) + (split_50*0.5) + (split_25*0.25) )
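
As a sanity check, here is a minimal Python sketch of the proposed slice formula (report bonus left out; I am also assuming, per the labels, that a partial-25 dup receives 25% of the full-credit slice):

    def proposed_slice(severity, split_100, split_75=0, split_50=0, split_25=0):
        # partial dups count fractionally in both the decrementer and the split
        decrement = (split_100 - 1) + split_75 * 0.75 + split_50 * 0.5 + split_25 * 0.25
        weighted_split = split_100 + split_75 * 0.75 + split_50 * 0.5 + split_25 * 0.25
        return severity * (0.9 ** decrement) / weighted_split

    full = proposed_slice(10, split_100=1, split_25=2)
    print(full, full * 0.25)  # ~6.32 for the full-credit warden, ~1.58 per partial-25

Under the current formula, the same full-credit warden would get only 2.7 shares (before the report bonus).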
0xA5DF commented 10 months ago

I agree this should be changed; I brought this up about a year ago, and iinm the response to my help request was that this is the intended design.

To put it simply, I think the rationale behind C4's formula (when no partials are involved) is to first dilute the pot of the bug by 0.9^(n-1) and then split that pot between all wardens who found the bug. The problem with the partials is that a partial dupe reduces the pot of the bug by more than 10%.
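To make that concrete (my numbers, plugging into the documented formula and leaving out the report bonus): a solo High has a pot of 10; adding a single partial-25 dupe should dilute the pot to 10 * 0.9 = 9, but the bug's finders actually receive much less:

    severity, n = 10, 2                   # primary + one partial-25 dupe
    base = severity * 0.9 ** (n - 1) / n  # 4.5 shares per full-credit finding
    primary = base                        # 4.5
    partial = base * 0.25                 # 1.125
    # the bug's pot drops from 9 to 5.625 (-37.5%), far more than the 10% decrementer;
    # the missing 3.375 shares flow back to the overall HM pool
    print(primary + partial)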

I agree that the fix should be the formula suggested above by @dontonka; there are 2 points here:

dontonka commented 10 months ago

I agree this should be changed; I brought this up about a year ago, and iinm the response to my help request was that this is the intended design.

Interesting. What was the resolution of the ticket exactly @0xA5DF ?

0xA5DF commented 10 months ago

I can't find the ticket in my email for some reason, but as I said, the help team said this is not a bug. I should've opened an org issue back then but I failed to do so.

PS I've updated my estimator script to reflect this change after I received that response.

dontonka commented 10 months ago

Yes, this is really annoying; so many wardens didn't receive their proper share of awards. But hey, things are not perfect; we need to learn from mistakes, but first recognize our mistakes/errors, fix them, and improve as an organisation.

0xEVom commented 10 months ago

Yup that's exactly what I meant! Then the slices just need to be scaled by the partial credit to get the final number of shares for a given finding.

@0xA5DF note that there's also a partial-75 label now

0xA5DF commented 10 months ago

so many wardens didn't receive their proper share of awards

I don't think there's any wrongdoing on C4's end here; an org issue is the right way to change this kind of stuff.

@0xA5DF note that there's also a partial-75 label now

True, I need to update the script 🙂

dontonka commented 10 months ago

I don't think there's any wrongdoing on C4's end here; an org issue is the right way to change this kind of stuff.

With all respect, I have to partially disagree. This is actually a bug in the current formula: we all agree that the current formula reducing the finding's pie doesn't make any sense, should not have been there in the first place, and is not aligned with the interpretation wardens would take from the documentation. Either it was not audited or it was not understood by the C4 developer in charge of the implementation. Since this tool is a black box, I personally do think there is wrongdoing (not tested properly, not audited, etc.) on C4's part for the negative impact this has generated for the affected wardens, since it has been present since partials were implemented.

I do agree that the right way to change it is by involving the community with this post, which is in progress.

sockdrawermoney commented 10 months ago

Since this conversation is now happening in two places, I'm going to bring it all here.

dontonka:

As I just posted on the issue, I do believe C4 should be accountable in a way for this; you guys provide the infra for the service, but unfortunately this was clearly a bug in the award tool when accounting for partials, not a documentation bug. You cannot just hide behind the fact that it has been there for a while and nobody complained, so it's fine.

Why would we ask you to bring this discussion here to discuss it if we wanted to hide something? There are simply tradeoffs. At the point of implementation of the award calc for partials, there was the choice of how the awardcalc was written.

The title of this issue is exactly how it was written intentionally—to decrease the value of the identified bug based on others spotting the root cause, without compensating the duplicate equally based on not having fully rationalized the severity/impact.

I get that there are folks who don't like how it was written and that's a completely fair and reasonable viewpoint. And my viewpoint may be a dated one that is good to revisit which is why the team asked you to open this issue and why we thanked you for doing so.

But "pie" and "slice" are anachronisms anyway which remain based on the way the award calc script was originally written. There are other areas which equally do not fit within the analogy of pies and slices: what does a "30% bonus" mean in the pie and slice context, for example?

The awardcalc script used to be open source. It was completely refactored a year or so ago and was intended to be made open. It will be again.

dontonka:

In a bounty this would be a High severity. Granted that moving forward, this can be changed. Don't get me wrong, C4 brings a huge net positive, but I don't think this should simply be a quick change in the award tool after the next supreme court kind of thing before simply moving on. Personally, since I raised the alarm before the ZetaChain awards were sent, but C4 staff continued the process as usual,

In fact multiple members of staff looked deeply into this and engaged with you on the matter before moving ahead. I took it seriously when you raised it with me, pulled in multiple staff members, and this is where the analysis came to, along with the conclusion that if it makes sense to revisit the tradeoff, we will do so.

I think I should receive some retribution, and not from the wardens that have been paid, but from the C4 DAO revenue, period; at least the amount that I should have received from the ZetaChain contest (not accounting for anything I might have deserved for bringing this to light, and that's fine, we don't do it all for money; at least I will have contributed a bit to C4's success)

I'm not sure how you can argue that you should receive retribution without arguing that everyone else impacted by this tradeoff should as well.

However if there were to be sufficient consensus among @code-423n4/judges that this is a bug and not an intentional tradeoff, then I think characterizing it as a vulnerability is perfectly reasonable and in such case the original identifier (@0xA5DF) should be awarded.

dontonka commented 10 months ago

However if there were to be sufficient consensus among @code-423n4/judges that this is a bug and not an intentional tradeoff, then I think characterizing it as a vulnerability is perfectly reasonable and in such case the original identifier (@0xA5DF) should be awarded.

Well, that would be a duplicate finding; mine would actually be the primary, and his a dup (probably partial-50), as his attempt didn't result in any changes.

Anyway thanks for replying, I did say my part, I have nothing more to add.

I'm not sure how you can argue that you should receive retribution without arguing that everyone else impacted by this tradeoff should as well.

That would be ideal, but unfortunately unrealistic, hence why I haven't proposed it. That being said, giving retribution to both @0xA5DF and myself would definitely be adequate too. Now if you would disqualify me because I'm being too rude / honest / straight to the point, that's fine.

sockdrawermoney commented 10 months ago

I don't feel like you've been rude at all. I expected you to bring your position here—it's what was asked and what you've been thanked for multiple times—and of course it's rational to defend your opinion.

dontonka commented 10 months ago

However if there were to be sufficient consensus among @code-423n4/judges that this is a bug and not an intentional tradeoff, then I think characterizing it as a vulnerability is perfectly reasonable and in such case the original identifier (@0xA5DF) should be awarded.

Personally I don't understand the intentionality in the current behavior, and no judge in his right mind would either, and neither would any warden. And this is confirmed in all the posts by them above: everyone agrees with my point that reducing the finding's pie IS wrong.

I challenge any judge/warden who thinks the contrary to explain to me, with clear arguments, where the intentionality in the current behavior would come from and why it would exist: for what purpose would it have been implemented intentionally that way, and to achieve which goals? It is clearly unfair for the primary warden (but it does the proper job for the partial dups, penalizing them, at least XD), that is crystal clear, and the sponsor ends up paying that specific warden, who would have deserved it, less, instead diluting this among all the other wardens, which doesn't seem fair for the sponsor either. Of course that's a minimal impact; he pays the same pot in the end.

Minh-Trng commented 10 months ago

However if there were to be sufficient consensus among @code-423n4/judges that this is a bug and not an intentional tradeoff, then I think characterizing it as a vulnerability is perfectly reasonable and in such case the original identifier (@0xA5DF) should be awarded.

Personally I don't understand the intentionality in the current behavior, and no judge in his right mind would either, and neither would any warden. And this is confirmed in all the posts by them above: everyone agrees with my point that reducing the finding's pie IS wrong.

I challenge any judge/warden who thinks the contrary to explain to me, with clear arguments, where the intentionality in the current behavior would come from and why it would exist: for what purpose would it have been implemented intentionally that way, and to achieve which goals? It is clearly unfair for the primary warden (but it does the proper job for the partial dups, penalizing them, at least XD), that is crystal clear, and the sponsor ends up paying that specific warden, who would have deserved it, less, instead diluting this among all the other wardens, which doesn't seem fair for the sponsor either. Of course that's a minimal impact; he pays the same pot in the end.

I did provide reasoning above for why the current solution is not illogical, which has been reiterated by sockdrawer. Even though I do agree that your suggestion is better, you shouldn't talk from a point of absolute truth about subjective matters.

I also find it a bit audacious to try to argue for a retrospective reimbursement just because you made a suggestion for an improvement of a current solution.

sockdrawermoney commented 10 months ago

Recognize that before this implementation of awardcalc, what we now call partials would get full credit if a judge assessed them as a functional duplicate!

You can likely find many cases from 18 months ago where a warden got awarded a full high or med as a duplicate without rationalizing the severity or describing the impact.

So this current implementation has to be understood as a transition away from that and toward something that did not overly award partials.

The assumption always was that partial duplicates were indeed duplicates per the decrementer (otherwise they presumably wouldn't be called duplicates at all?)

I think it's very productive to have a conversation about

  1. whether that assumption is correct (plenty of good arguments here that it should not be the assumption) and
  2. what the algo should be going forward, given we have a ton more experience and precedent with awarding duplicates than when we first added them.
dontonka commented 10 months ago

@Minh-Trng

I did provide reasoning above why the current solution is not illogical, which has been reiterated by sockdrawer. Even though I do agree that your suggestion is better, you shouldnt talk from a point of absolute truth about subjective matters.

I read it, but it's not clear, as you were mainly replying to another warden. My proposal is not to dedup and separate into different issues, which is what Sherlock does btw, I believe; I much prefer the partial approach, but done correctly XD.

In software, a product owner writes requirement A, which is given to the engineer, who implements code to reflect the requirement; afterward a demo is done to prove to the product owner that the product really does A, and then the feature is deemed accepted and rolled out to production. Explain to me like I'm 5 years old: what is the requirement that the current behavior of reducing the finding's pie is achieving? And if there was an intentional trade-off, what was the constraint that introduced such a trade-off?

I also find it a bit audacious to trying to argue for a retrospective reimbursement just because you made a suggestion for an improvement of a current solution

Definitely, let's not focus on that though, that's another topic, I am sorry. The root problem is that I don't feel I'm suggesting something; I'm fixing something broken, a bug.

dontonka commented 10 months ago

The assumption always was that partial duplicates were indeed duplicates per the decrementer (otherwise they presumably wouldn't be called duplicates at all?)

Yes, agreed, this is where the suggestion comes in (see point 2). So there are 2 items in play here:

  1. The bug: the finding weight (aka pie, or whatever you call it) being reduced when some of a finding's dups are partials. This is what this post is mostly about, and I consider it a bug, not a suggestion. The solution here would be to transfer that reduction evenly among those at 100%. This definitely complexifies the calculation a bit, but brings overall much fairer awards for wardens.

  2. Suggestion: consider partial dup weight when establishing the finding's weight. This makes sense to me, but it is more of a suggestion that needs to be discussed, and it would end up with the following. The high-level idea is to challenge the assumption you stated (partial dups being full decrementers): they are duplicates, yes, but partial duplicates, so consequently they should weigh less. Said differently (and to echo what @0xean stated), such a finding would be more valuable for the sponsor compared to another finding with the same amount of dups (but without partials).

    Med Risk Slice: 3 * (0.9 ^ ( ((split_100 - 1)*1) + (split_75*0.75) + (split_50*0.5) + (split_25*0.25) )) / ( (split_100*1) + (split_75*0.75) + (split_50*0.5) + (split_25*0.25) )
    High Risk Slice: 10 * (0.9 ^ ( ((split_100 - 1)*1) + (split_75*0.75) + (split_50*0.5) + (split_25*0.25) )) / ( (split_100*1) + (split_75*0.75) + (split_50*0.5) + (split_25*0.25) )
dontonka commented 10 months ago

Recognize that before this implementation of awardcalc, what we now call partials would get full credit if a judge assessed them as a functional duplicate!

Yes, that's clear, there was a before and after, and it's fine, it's an evolution. Now once we evolved with partials, it seems like it was understood one way and behaved a different way for an edge case (the primary warden being negatively impacted as the pie is reduced), and since the tool is a black box and wardens fully trusted C4, and since results mostly looked good, this went unnoticed for this long.

The part that you might not understand @sockdrawermoney is that right now partials are only doing their job for the dups with the partial, penalizing them; but for the primary (those at 100%), their slice remains the same as before the partial world was introduced! So for them partials don't bring any plus value, which is The bug, but you guys say it's an intentional trade-off.

dontonka commented 10 months ago

to decrease the value of the identified bug based on others spotting the root cause, without compensating the duplicate equally based on not having fully rationalized the severity/impact

Yes, and it does that well for the partial dups. While it's true that they compensate the primary more than the partial dups, where is the intentionality that a finding's weight is also decreased because some dups are partial? Yes, it is decreased as it has more duplicates, but having partial duplicates should not reduce the finding's weight further (which is what happens today, the bug); it should at least remain constant, or even better (the suggestion), increase.

Minh-Trng commented 10 months ago

In software, a product owner writes requirement A, which is given to the engineer, who implements code to reflect the requirement; afterward a demo is done to prove to the product owner that the product really does A, and then the feature is deemed accepted and rolled out to production.

Well, that reference to software engineering really doesn't help your argument. You have the actual product owner here with you, telling you that the implementation does what it was supposed to do (and even how the requirement came to be) and that it is in fact not a bug.

Explain to me like I'm 5 years old: what is the requirement that the current behavior of reducing the finding's pie is achieving?

all explained clearly by sockdrawer above

dontonka commented 10 months ago

all explained clearly by sockdrawer above

The only thing I see is the following, which unfortunately is not clear to me. That doesn't explain why there is a reduction in the finding's pie (at least I don't see it), what the tradeoffs are, and what the choices were. That is what I'm asking: 1) Requirement: what was the initial requirement, and was it intentional to reduce the finding's pie as it behaves today? That is a yes-or-no question. 2) If it was intentional, why, and what is the reasoning behind it? Such behavior brings only half of the work of partials to the table (penalizing the partial dup, but missing the other half, which would be to redistribute this reduction to the full dups evenly).

There are simply tradeoffs. At the point of implementation of the award calc for partials, there was the choice of how the awardcalc was written. The title of this issue is exactly how it was written intentionally—to decrease the value of the identified bug based on others spotting the root cause, without compensating the duplicate equally based on not having fully rationalized the severity/impact.

Minh-Trng commented 10 months ago

The title of this issue is exactly how it was written intentionally—to decrease the value of the identified bug based on others spotting the root cause, without compensating the duplicate equally based on not having fully rationalized the severity/impact.

The assumption always was that partial duplicates were indeed duplicates per the decrementer (otherwise they presumably wouldn't be called duplicates at all?)

This answers the question regarding the initial requirement very well.

Such behavior brings only half of the work of partials to the table (penalizing the partial dup, but missing the other half, which would be to redistribute this reduction to the full dups evenly)

That's just you designing different requirements according to what you would like the behavior to be. It was not part of what was originally needed; can you please stop pretending as if it were? The whole concept of a finding's pie that needs to remain equally large for the same amount of dups is just something you thought up to argue for a change. It never was part of the original specification.

Most people (including me) already agree that your suggestion would be an improvement on what is currently done; isn't that enough? For some reason you seem very stubbornly focused on declaring it a bug, even if the product owner (who knows best) tells you it isn't.

dontonka commented 10 months ago

Alright, I'm definitely stubborn, sorry. That will be my last post regarding this argument with you @Minh-Trng, as proving whether it's a bug or not is not the end goal; the goal is to improve the award situation for the sake of all C4 wardens, and I think we all agree with that.

The way I understand it is the following. Let's break it down. Scenario: a High with 3 duplicates (two partial-25), as shown in the documentation and as I had in my contest.

Requirement: to decrease the value of the identified bug based on others spotting the root cause, without compensating the duplicate equally based on not having fully rationalized the severity/impact. The assumption always was that partial duplicates were indeed duplicates per the decrementer (otherwise they presumably wouldn't be called duplicates at all?)

What I understand from this is that the partials are duplicates like any other, so they should act as full decrementers. That's not what they are doing! They act as full decrementers on steroids, as they decrease the finding's weight more than normal duplicates (from 8.91 to 4.86 in the current scenario, which is almost a 50% cut of the finding, so your High becomes almost a Medium!). Expected? Why? That's what I want to clarify with this argumentation.

The current calculation otherwise achieves the remaining part of the requirement, without compensating the duplicate equally based on not having fully rationalized the severity/impact, as you can see the primary gets a higher slice than the partial-25 dups.

I don't have anything more to add; I have argued and spent enough time explaining my position on this issue as a whole.

I wish all of the participants of this sensitive discussion a wonderful weekend!

0xA5DF commented 10 months ago

They act as full decrementers on steroids, as they decrease the finding's weight more than normal duplicates

They indeed decrease the finding's weight, but at the end of the day the warden with full credit gets the same share as if the partial dupes were dupes with full credit. So it makes sense to say that the partial label is just a penalty on the partial submission for not fully identifying the severity, and the funds from that penalty go back to the HM pool rather than to this specific bug's pool.

It indeed makes more sense for the funds to go back to the bug's pool, but it's not as if the current state doesn't make sense at all. I don't think you can argue it's so unreasonable that we should make retroactive changes.

dontonka commented 10 months ago

They act as full decrementers on steroids, as they decrease the finding's weight more than normal duplicates

They indeed decrease the finding's weight, but at the end of the day the warden with full credit gets the same share as if the partial dupes were dupes with full credit. So it makes sense to say that the partial label is just a penalty on the partial submission for not fully identifying the severity, and the funds from that penalty go back to the HM pool rather than to this specific bug's pool.

It indeed makes more sense for the funds to go back to the bug's pool, but it's not as if the current state doesn't make sense at all. I don't think you can argue it's so unreasonable that we should make retroactive changes.

Well said @0xA5DF, and exactly to my point, finally. I think we all agree that it makes more sense to have those funds go back to the bug's pool and not the entire HM pool (even @Minh-Trng), as by doing otherwise you are effectively diminishing the finding's value; and for the sponsor this finding is not less valuable, in fact it is actually more valuable, as the dups are not even full duplicates. This is the basic reason why I think this is a bug; unfortunately, your arguments are not changing my opinion in that respect.

Going back again to my original question (I'm sure @Minh-Trng will love this XD): was this intentional OR not? If intentional, to achieve what objective exactly? To keep the primary warden at the same share? Why should this even matter? The warden's share is an output of the finding's value and should not be driving the finding's value. The invariants seem clear:

  1. The finding's value MUST remain constant (The bug - part 1). Otherwise what you are saying is that this finding is less important for the sponsor, and that is clearly false. This was not happening before partials were introduced, which is a code smell (bug).
  2. Partial duplicates must have their share penalized on that specific finding (that works today, great).
  3. If we all agree on 1 and 2, then those shares need to go back to the primary(ies) on the finding (The bug - part 2); why would they go to anyone else?

The problem with the partials is that a partial dupe reduces the pot of the bug by more than 10%.

@0xA5DF this is relative to how many duplicates there are, how many have partials, and which partial, with partial-25 doing the most damage. In the case presented above, it's almost a 50% reduction!

I don't think you can argue it's so unreasonable that we should make retroactive changes.

I personally think so, but it's fine; accountability and opinion are subjective matters, so let's not focus on that, as it's much less important than the changes this issue will bring in the future. This topic was in a private thread and was brought into the public discussion here (without my consent, but it's fine, I'm not hiding anything), which was not really needed (and caused mostly noise), but that doesn't matter in the end.

Minh-Trng commented 10 months ago

That will be my last post regarding this argument with you @Minh-Trng

And here I was hoping you meant it. But fine, I will bite :)

They act as full decrementers on steroids

No they don't; they are applied to the formula equally: 3 dups yield the same intermediate value for each submission (2.70 shares), regardless of partials. That meets the 1st part of the requirement, which is to decrease the value of the identified bug based on others spotting the root cause.

Then the shares of the partial findings get slashed, which meets the 2nd part of the requirement which is without compensating the duplicate equally based on not having fully rationalized the severity/impact

That's it. You see, the reasoning is very simple. Everything else you are mentioning has never been part of the requirement and is just some form of mental gymnastics to fit a narrative.

Also, you seem to conflate total shares of a finding with value to sponsor, which is not the same, but that is a rabbit hole I am not interested in going down for now

dontonka commented 10 months ago

And here I was hoping you meant it. But fine, I will bite :)

My whole post was just to see if you'd bite further hahahaha :P, good discussion @Minh-Trng. That is what discussion is: we need to be able to debate while holding opposing opinions. At least we both agree that this needs to be improved, and that is what matters in the end, not why it was implemented in such a way.

dontonka commented 10 months ago

Let me break this down in the simplest terms I can, with money rather than pie, split, or other complicated terms, so that any warden can quickly understand what's going on. Scenario: a High worth 15000 USD with 3 findings (the primary and two partial-25 dups). Let's not consider the 30% bonus for selected report, to keep this simple.

Before partials were introduced

They split the 15000 USD evenly, so everyone gets 5000 USD. Easy.

With partials (now; well, since it was introduced, maybe 1-2 years ago, idk)

What happens today is the following:

  • the primary warden gets 5000 USD (same as before partials, shame)
  • the partial-25 wardens receive 1250 USD each, so that's 2500 USD total (partials doing their job here, great)

Hey, but that's worth only 7500 USD; where is the other 7500 USD going???

Well, that is what I'm claiming is a bug that needs to be fixed ASAP: this 7500 USD today goes back to ALL the other wardens participating in the contest, and @sockdrawermoney @Minh-Trng @0xA5DF you guys are trying to convince me that this is reasonable and was intentional? What, are we auditing on charity here? Let's do something then: let's assume I'm your boss, and this morning I enter your office to announce that starting today I'm going to cut your salary by 50% and distribute the other half to all the other employees. Is that reasonable? Could it at least be the warden's call to give his share away if he wants to, instead of the system imposing such behavior? Man, I need this drug too. Also, if the product owner in charge really intentionally designed this, he needs to be fired ASAP lol.

On a more serious note, I don't think that is the case; the problem here is that this is a bug that wasn't intentional, and it is hard for an external warden to identify, as the award tool is not open source, so only C4 employees have access to the source, and it took a stubborn warden to bring this to light. It is almost hilarious that I need to fight so hard to get my message across; communication is definitely not my strongest skill, to say the least.
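In code, the dollar arithmetic above looks like this (a sketch assuming the finding's pool is 15000 USD and awards scale linearly with shares):

    pool = 15000                        # USD attributed to this High before partials
    full_share = pool / 3               # 5000 USD each for 3 full dups
    primary = full_share                # 5000 USD, unchanged by the partial labels
    partials = 2 * full_share * 0.25    # 2 x 1250 USD = 2500 USD
    leaked = pool - primary - partials  # 7500 USD diluted across the whole contest
    print(primary, partials, leaked)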

Minh-Trng commented 10 months ago

fight so hard to get my message across; communication is definitely not my strongest skill, to say the least.

While I agree with this, you can rest assured that everyone understood your message the first time already; you really don't need to repeat/rephrase it 10 times.

The problem is that you are so tunnel-visioned and frustrated about your Zetachain results, thinking only about this particular example and how you "should have gotten more money", that you are refusing to look at the issue from a different angle. The goal of the current solution is to make sure worse reports don't get as much money as you do.

Let me give you an example this time to explain under which assumptions the current solution is perfectly reasonable:

3 people find a rounding issue that causes loss of yield. Medium severity, so they get:

Person A -> 0.81 shares
Person B -> 0.81 shares
Person C -> 0.81 shares

Turns out person A finds out that you can use it to steal funds. So the rounding issue that was found by 3 people now has high severity implications, so people get:

Person A -> 2.7 shares
Person B -> 2.7 shares
Person C -> 2.7 shares

So person A now gets 3.3 times more rewards compared to before for figuring out that this is a High, not a Medium! But wait, the other 2 didn't see that. So let's introduce partial-50 to make sure they don't get as much as A:

Person A -> 2.7 shares
Person B -> 2.7 * 0.5 = 1.35 shares
Person C -> 2.7 * 0.5 = 1.35 shares

Question: which of these reports provided the most value to the sponsor? Answer: they are all equal, because in this case the sponsor only needs to know that there is a rounding issue that needs to be fixed. That's why A doesn't get more shares than a 3-duplicate High. But he does get the full 2.7 shares for recognizing the High impact, while B and C get a penalty. The incentive model still works, just not as much as you would like it to.
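These numbers fall straight out of the documented slice formula (a minimal sketch; the report bonus is left out, as in the example):

    def risk_slice(severity, n):
        # documented C4 rule: severity * 0.9^(n - 1) / n per full-credit dup
        return severity * 0.9 ** (n - 1) / n

    print(risk_slice(3, 3))         # 0.81 shares each while judged a Medium
    print(risk_slice(10, 3))        # 2.70 shares each once upgraded to a High
    print(risk_slice(10, 3) * 0.5)  # 1.35 shares for B and C under partial-50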

So the current solution was designed under the assumption that finding the root cause is what matters most in defining the uniqueness/value of an issue, and that the duplication penalty should be applied as such. Different people have provided valid reasons in this thread why there are scenarios where this assumption doesn't hold. That doesn't mean the current solution is unreasonable or that the implementation is bugged.

It also doesn't mean, as you like to put it, that no one in their right mind would understand this, or that you need to take drugs to come up with this, or that the product owner needs to be fired ASAP lol.

It merely means that its an imperfect design that needs reconsideration.

Simon-Busch commented 10 months ago

@dontonka ,

I've already laid out our point of view and updated the documentation to help you understand our position within C4 and our current implementation.

The introduction of "Partials" addresses instances where the judge recognizes findings as duplicates within the same category, yet notes they contribute less value or are lacking in some aspect.

To address this, we've implemented a penalty ( Partials ) reducing the finding's share to 25%, 50%, or 75%, as stated in the documentation.

This ensures that partial findings within the duplicate group receive a reduced award. The "Partial" label is not intended to enhance the awards for other findings. A partial finding still qualifies as a valid finding in a group of duplicates but is penalized. In the docs:

Please note that findings with partial credit still count as 1 finding in the algorithm.

As a matter of fact, the satisfactory findings in a group of duplicates will be equally awarded, with or without partials in that group. See: https://docs.code4rena.com/awarding/incentive-model-and-awards#duplicates-getting-partial-credit

Alex has pointed out a significant concern, which, I believe, should be carefully considered:

The suggestion effectively would allow for exploitation:

  • Create ideal submission
  • Create a group of partial submissions

Partial submissions would steal total rewards from other findings and would credit them to the ideal submission. That's ultimately more gameable than what's available now. The feedback of disappointment in having a partial award reduce the rewards of your finding is legitimate; the suggested fix offers more downside than the current situation.

Thus, I don't consider this a 'bug' as you've suggested, since a bug implies unintended behavior, whereas the current function aligns with our design. Your feedback proposes a change aiming for a different desired behavior from the algorithm, which differs from a bug.

I find your proposal to reallocate the reduced reward from the partial findings confusing for the following reasons:

  • Those marked as partial (with labels such as 25-50-75) will already suffer due to their partial status.
  • The redistributed amounts would then benefit other findings.

As a result, submissions tagged as partial would be at a double disadvantage. Consequently, the criteria and calculations needed for a partial finding to be upgraded to satisfactory would no longer be achievable, and the requirement won't be met anymore. In the docs:

25%, 50%, or 75% of the shares of a satisfactory submission in the duplicate set



if some dups have partials, that SHOULD NOT reduce its weight; it doesn't make sense, the finding has the same value for the sponsor

Regarding your earlier message about the importance of findings to the sponsor: it's not about the sponsors valuing findings solely based on their monetary value; they consider the merit of all selected issues. Let's clear things up here: a sponsor doesn't rely on the pie and/or award to assess the value of a finding. The sponsor cares about identified vulnerabilities, whether they are solo, groups of 20 duplicates, or groups with or without partials.

sockdrawermoney commented 10 months ago

if the product owner in charge really intentionally designed this, he needs to be fired ASAP lol

Man I left this thread alone for two days and came back to find out I got fired.


dontonka commented 10 months ago

@Minh-Trng

The problem is that you are so tunnel-visioned and frustrated about your Zetachain results, thinking only about this particular example and how you "should have gotten more money", that you are refusing to look at the issue from a different angle. The goal of the current solution is to make sure worse reports don't get as much money as you do.

That doesnt mean the current solution is unreasonable or that the implementation is bugged.

That is clear :), but I'm trying hard to keep my mind open, and your last post actually opens the door. Congratulations, and thank you for helping me understand better with a clear example, finally something with substance instead of spending your time trying to put names on my behavior (stubborn, audacious, too confident, etc). I do agree in such a case that there is a possibility for this to be intentional, but it's borderline reasonable.

Different people have provided valid reasons in this thread why there are scenarios where this assumption doesn't hold.

Now, let's try to move on (that's more on me lol, I know) and focus on whether it should remain as it is or not, shall we.

@Simon-Busch

I find your proposal to reallocate the reduced reward from the partial findings confusing for the following reasons:

  • Those marked as partial (with labels such as 25-50-75) will already suffer due to their partial status.
  • The redistributed amounts would then benefit other findings.

As a result, submissions tagged as partial would be at a double disadvantage. Consequently, the criteria and calculations needed for a partial finding to be upgraded to satisfactory would no longer be achievable, and the requirement won't be met anymore.

I'm not really following, tbh. Those partials are indeed penalized, as they should be right now. Redistributing their lost share to other findings (which could benefit them, granted) is not that much the point; if their initial issue was a Medium and got upgraded to a High because of the main warden, we can argue they get an increase there too. I think the main question is the following:

Should the lost finding's shares be redistributed evenly to the warden(s) with no partial on that specific finding, OR be redistributed to all other findings (as it is today)?

We need to evaluate which of those two gives the best outcome. You already know what I think, right :).

@sockdrawermoney

Man I left this thread alone for two days and came back to find out I got fired.

hahaha, you should have stayed in bed this morning :P. I retract it, no worries. As stated, I think it does make some sense in the end, but it is borderline from my perspective.

sockdrawermoney commented 10 months ago

@dontonka I thank you humbly for your generosity. I will call Wendy's and tell them to disregard my application.

dontonka commented 10 months ago

@dontonka I thank you humbly for your generosity. I will call Wendy's and tell them to disregard my application.

Cool, now that you are back as C4's CEO, can you please revisit the current issue and evaluate with your team whether the discussion/concerns raised here are in vain, or whether it makes sense to change the current partial behavior?

GalloDaSballo commented 10 months ago

Can we please see the math if the finding with partials did not have any partials? What would be the pie? What would be the slice?

dontonka commented 10 months ago

Can we please see the math if the finding with partials did not have any partials? What would be the pie? What would be the slice?

This is well described in the updated documentation

The first table is the finding without partials, and the second table is the same finding but with 2 partial-25 dups:

[Tables from the docs: without partials, the pie is 8.91 with slices 3.51 / 2.7 / 2.7; with two partial-25 dups, the pie drops to 4.86 with slices 3.51 / 0.675 / 0.675.]

GalloDaSballo commented 10 months ago

I'd like to see the math for a unique finding in the same example: what would be the Pie and Slice if the finding is unique?

dontonka commented 10 months ago

what would be the Pie and Slice if the finding is unique?

For a unique High, both Pie and Slice will be the same, which is 13: the full award (10) plus the 30% bonus for being selected for report.

dontonka commented 10 months ago

You see how the current behavior is kind of frustrating: having a High with 2 partial-25 dups means it's almost a unique High in a way (depending on why the partial-25 labels were assigned; I understand this can be contextual to the issue and debatable whether those dups identified the root cause or not), but this gives the primary warden a resulting slice of 3.51 instead of 13 (if it had been truly unique). There should probably be a middle ground.

But the most fundamental property, the one I can't reason about, is why this would decrease the finding's pie at all; it's simply not a property that partials should be affecting, only the slices within the finding.

H4X0R1NG commented 10 months ago

I totally agree with the concerns @dontonka raised here, and I totally understand how it feels to lose a chunk of the rewards you should get after sleepless nights of work as a result of this bug, because I myself was affected by this same exact issue.

I strongly believe this is a bug. I have yet to see anybody provide a reasonable justification for treating this as anything but a bug that needs to be fixed.

If I report a High bug that has two 25-partial duplicates, and @dontonka reports another High with two duplicates which have no partials, why in the world should @dontonka get the same reward share as me? That makes absolutely zero sense.

@Minh-Trng

The problem is that you are so tunnel-visioned and frustrated about your Zetachain results, thinking only about this particular example and how you "should have gotten more money"

What he is raising is a legit issue that needs to be fixed. That's a fact. Him being frustrated about this is 1. his right, and 2. it doesn't change the fact that this needs to be fixed. I'm not sure what you're arguing about here. I'm also pretty sure if you were directly affected by this you wouldn't have said something like this.

The goal of the current solution is to make sure worse reports don't get as much money as you do.

The proposed solution will make sure that this is the outcome as well.

If a high severity bug is reported with 2 non-partial duplicates, the finding pot would be 8.1, and each finding would get 2.7 shares. That's 8.1 shares total (2.7 x 3), excluding the best report bonus.

If a high severity bug is reported with 2x 25-partial duplicates, the finding pot should remain 8.1, not get downgraded to 4.86. The main finding should get 6.75, and the two 25-partial duplicates should get 0.675 each (6.75 + 0.675 x 2 = 8.1). That should still be 8.1 shares total.
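A minimal sketch of that redistribution (my reading of the proposal: partials keep their penalized slice and whatever they forfeit goes back to the full-credit finder of the same bug, not to the HM pool):

    severity, n = 10, 3
    base = severity * 0.9 ** (n - 1) / n  # 2.7 shares per head, pot 8.1
    partial = base * 0.25                 # 0.675 per partial-25 dup
    forfeited = 2 * (base - partial)      # 4.05 shares given up by the two partials
    primary = base + forfeited            # 6.75 shares for the main finding
    print(primary, 2 * partial, primary + 2 * partial)  # pot stays at 8.1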

So worse reports will surely not get as much money as the primary report, and the primary report gets the treatment it deserves. Issue fixed.

dontonka commented 10 months ago

I totally agree with the concerns @dontonka raised here, and I totally understand how it feels to lose a chunk of the rewards you should get after sleepless nights of work as a result of this bug, because I myself was affected by this same exact issue.

I strongly believe this is a bug. I have yet to see anybody provide a reasonable justification for treating this as anything but a bug that needs to be fixed.

If I report a High bug that has two 25-partial duplicates, and @dontonka reports another High with two duplicates which have no partials, why in the world should @dontonka get the same reward share as me? That makes absolutely zero sense.

Thanks, that means a lot @H4X0R1NG. Chainlight lost 18k USD in zkSync, if my calculations are accurate. Personally, the only way I can ensure this doesn't happen to me again is simple: not participating in any more C4 contests (I understand that the wardens here could not care less and might actually be happy about that) until this is changed, period. The C4 organization can do whatever they think is best, and I hope it will be aligned with my concerns (for their good and mine), but if not, too bad.

Minh-Trng commented 10 months ago

I'm not sure what you're arguing about here

Since you are new to this convo, I think you may not have caught up on all the context necessary to understand.

The proposed solution will make sure that this is the outcome as well.

That reaffirms what I just said about missing context; the line you quoted is not even remotely used as an argument against the proposed solution.

I'm also pretty sure if you were directly affected by this you wouldn't have said something like this.

I am sure I would have, because it's based on reason, not personal bias. If you look further up, you can see that I have already confirmed multiple times (again, context...) that I believe a change would be an improvement. So if anything, I should argue for it, no?

In fact, there is a very good chance that I have been affected by this before.

dontonka commented 10 months ago

OK, I think wardens' participation has been enough already to raise the concerns to the C4 DAO, so let's not waste any more warden energy arguing with each other, having ego battles, etc.; it's not constructive.

Let's all grab a 🍿 and see what the C4 DAO will do with all of this.

Have a wonderful day ❤️ !

Simon-Busch commented 10 months ago

@dontonka

I'm not really following, tbh. Those partials are indeed penalized, as they should be right now. Redistributing their lost share to other findings (which could benefit them, granted) is not that much the point; if their initial issue was a Medium and got upgraded to a High because of the main warden, we can argue they get an increase there too. I think the main question is the following:

Should the lost finding's shares be redistributed evenly to the warden(s) with no partial on that specific finding, OR be redistributed to all other findings (as it is today)?

"Redistrubuting their lost share to others findings[...]is not that much the point" In reality, this seems to be a central aspect for you.

Given that partial findings already undergo an award reduction through the partial label attribution system (25%, 50%, 75%) based on the existing formula, theoretically redistributing the reward further inflates the value for others. This results in an even more pronounced disparity between a standard satisfactory finding and a partial finding. In essence, a partial finding's value is diminished twice through:

  1. The application of the partial percentage.
  2. The theoretical award re-distribution to other findings (proportionally re-decreasing the value of the partial finding).

This potential double penalty means that the requirements of the documentation would no longer be met, specifically the guideline that states, "25%, 50%, or 75% of the shares of a satisfactory submission in the duplicate set."

It's important to clarify a possible misconception that a partial finding is nearly inconsequential and of minimal quality. On the contrary, were a finding of such low quality, it would be classified as "unsatisfactory." Thus, a partial finding is, in fact, a valid contribution with certain limitations; it still identifies a vulnerability, and it is still counted as 1 valid finding in a group of duplicates.

@GalloDaSballo

I'd like to see the math for a unique finding in the same example: what would be the Pie and Slice if the finding is unique?

In the case of a solo finding, the slice is equivalent to the pie, as it's shared between 1 finding.

High Risk Slice: 10 * (0.9 ^ (split - 1)) / split
=> 10*(0.9^(1-1))/1 = 10

To which we add the bonus for being selected for report (30%) = 13

@H4X0R1NG

why in the world should @dontonka get the same reward share as me? That makes absolutely zero sense.

This issue has been addressed and explained in detail here: https://github.com/code-423n4/org/issues/145#issuecomment-1926520606