Open moladeyal opened 5 months ago
Thank you for pointing out this potential issue! We are looking into this!
Sorry for the late response. I could not reproduce the randomness on my end. Can you give a more concrete example, such as the values in the three fields and the output you get each time? Thank you!
I tried recreating this as well, and I think the output is deterministic, so I don't think there is any bug on the DB side.
I think what @moladeyal saw was that whenever the "Order quantity" for multiple items is the same, their relative ordering is random, which is not a bug.
However, this example led me to notice another bug/weakness in the evaluation data. "0.json" says that the correct answer is "Quest Lumaflex\u2122 Band", while there seem to be three acceptable answers: 1) Dash Digital Watch (order quantity == 3 in Q1 2022), 2) Quest Lumaflex™ Band (order quantity == 3 in Q1 2022), or 3) both, since their order quantities are the same.
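To make the tie issue concrete, here is a minimal sketch of a tie-aware check, assuming the evaluator can see per-product order quantities. The function names (`top_sellers`, `is_correct`) and the third product row are hypothetical and only for illustration; this is not the benchmark's actual evaluation code.

```python
from collections import defaultdict

def top_sellers(order_rows):
    """Return the set of product names tied for the highest total quantity."""
    totals = defaultdict(int)
    for name, qty in order_rows:
        totals[name] += qty
    if not totals:
        return set()
    best = max(totals.values())
    return {name for name, qty in totals.items() if qty == best}

def is_correct(predicted, order_rows):
    """Accept any single tied product, or exactly the full tied set."""
    accepted = top_sellers(order_rows)
    if isinstance(predicted, str):
        return predicted in accepted
    return set(predicted) == accepted

# Quantities for the two tied products come from this thread (3 each);
# the third row is a made-up placeholder, not real report data.
example_rows = [
    ("Dash Digital Watch", 3),
    ("Quest Lumaflex™ Band", 3),
    ("Hypothetical Other Product", 1),
]

print(is_correct("Dash Digital Watch", example_rows))
print(is_correct(["Quest Lumaflex™ Band", "Dash Digital Watch"], example_rows))
```

With a check like this, the displayed order of tied rows would no longer affect whether an answer is marked correct.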
There's nondeterminism in the best sellers report of the shopping-admin website. To reproduce, visit this bestsellers report url and click "show report" several times. Each time it is clicked, a different, seemingly random set of products is shown.
This means that for the tasks with ids 0-6 (e.g. "What is the top-1 best-selling product type in Quarter 1 2022"), there's a very high chance of getting a different answer on each execution, which will fail the current eval more often than not.
A potential fix is to change the Magento configuration to allow more than 5 rows per aggregate in the bestsellers report, so that all items are shown. See this link.
Could you please advise? Thanks, Eyal