polkastarter / polkastarter-lottery

MIT License
5 stars 5 forks source link

Luck factor being computed many times for every participant #3

Closed CodeWithMichal closed 2 years ago

CodeWithMichal commented 2 years ago

Hi, I placed detailed explanation on canny, regarding my concerns: https://polkastarter.canny.io/bug-reports/p/in-depth-analysis-of-lottery-algorithm

To summarize: According to https://utopia.duth.gr/~pefraimi/research/data/2007EncOfAlg.pdf luck factor should be computed only once for participant.

Whats wrong in current implementation: -Luck factor computed many times for every participant what makes it much less random -shuffled_eligible_participants method returns not unique list of participants, meaning some users are omitted

CodeWithMichal commented 2 years ago

@tiagom87 @miguelcma any update?

CodeWithMichal commented 2 years ago

@tiagom87 @miguelcma 27 days without any response

CodeWithMichal commented 2 years ago

@tiagom87 @miguelcma so you decided to remove whole bug section from canny instead of answering. 👍

miguelcma commented 2 years ago

Hey @LabuzzMichal , I'm really sorry for our delay. I didn't notice this issue somehow. Thank you for your review in detail into code.

If you take a look at LotteryService#calculate_winners you'll notice that uniqueness is ensured by the line lib/services/lottery_service.rb:86: winners.uniq.first(max_winners)

We're aware that this is not the most efficient algorithm (because it is creating a huge array in memory), but it's 100% correct. It was also audited by external entities/people more than once (both the code and the results) and all of them conclude the same correctness. A much more efficient way would be to use a ruby Set instead of Array, but we'll want a much more efficient algorithm: we're working on refactoring to a more mathematical approach instead of using Sets or Array. However, take into account that this refactor will only affect efficiency, not the accuracy of it, because it is correct already.

Thank again you for your thoughts

CodeWithMichal commented 2 years ago

Hi @miguelcma,

thank you for the update! I hope this time answering won't take that long :)

It was also audited by external entities/people more than once (both the code and the results) and all of them conclude the same correctness.

And now it was audited once again by myself and it failed. Please refer to my findings and answer them instead of saying "it was audited". And guys PLEASE treat it with due attention! This kind of bug might potentially lead to financial losses for those who act based on wrong assumptions

If you take a look at LotteryService#calculate_winners you'll notice that uniqueness is ensured by the line lib/services/lottery_service.rb:86: winners.uniq.first(max_winners)

Somehow from the whole wall of text on canny, you decided to refer to uniqueness, which I already said that IN THE END it's unique but it's still an issue! What about all other concerns I have?

@miguelcma Could you please reopen this issue and keep it open until we both agree on whether it's an issue or not? Let's discuss!

Let me repeat it again, this time in a much more technical manner. I want to start with lottery algorithm assumptions, then discuss a correct version of the algorithm for such a problem and at the end, I will show what exactly is wrong with the current implementation

Assumptions

Weigted Random Sampling (WRS) algorithm

Mathematically this problem might be simplified to sampling K items from a population of N weighted items This algorithm is taken directly from this publication: link

image

possible ruby implementation:

  winners += weighted_random_sample
  winners.first(max_winners)

  def weighted_random_sample(participants)
    sorted = participants.sort_by do |participant|
      rand ** (1.0 / participant.tickets)
    end
    sorted.reverse
  end

Human-friendly code explanation:

Your implementation

  winners += shuffled_eligible_participants
  winners.uniq.first(max_winners)

  def weighted_random_sample(participants)
    participants.max_by do |participant|
      rand ** (1.0 / participant.tickets)
    end
  end

  def shuffled_eligible_participants
    (max_winners * 5).times.map do
      weighted_random_sample(participants)
    end
  end

Human-friendly code explanation:

Comparision and conclusions

Final thoughts

You might say that this analysis is unfounded without real tests. Fine! Give me a few days to add PR with comparison in code. @miguelcma @tiagom87 kindly address the above

Edit: post updated

miguelcma commented 2 years ago

Thank you so much for your closer look on this. I'm reopening the issue as you suggest, to allow the discussion to happen. Let us analyse this in detail with the rest of the team, and also invite them to participate in the discussion.

CodeWithMichal commented 2 years ago

Hi @miguelcma ,

I was checking once again simulations I did and you were right - results are fine and your algorithm is working correctly. It is a little bit overcomplicated but results are fine. I made a mistake in my code, when I was checking it.

I feel ashamed and I am sorry for bothering you. Issue could be closed.

miguelcma commented 2 years ago

No worries at all @LabuzzMichal !

Discussion is great and having input from the community is crucial for quality, that we take very seriously. Not only to be able to improve the code, but also to make sure that everything is accurate. So, your input was very important 🙌 thank you

tiagom87 commented 2 years ago

Thank you for the proactivity @LabuzzMichal !