ppy / osu-performance

Calculates user performance aggregates from scores
GNU Affero General Public License v3.0

Statistical analysis of imbalance between speed star rating and aim star rating: aim is overweighted. #77

Open grumd opened 5 years ago

grumd commented 5 years ago

Introduction

I used my own dataset from https://grumd.github.io/osu-pps/ to see if I can find anything interesting.

This dataset relies on one assumption: if a map is overweighted (easier to get PP) then it will more often be one of the top pp plays for many different players. So I gathered statistics of tens of thousands of players and aggregated their top plays to see which maps are the most popular PP sources. Additionally, maps that are often a top 1 play receive more points than maps that are a top 5 play.
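A minimal sketch of that aggregation, assuming each player's top plays are available as a list of map ids ordered best first. The weights in `RANK_WEIGHTS` and the function names are hypothetical, chosen only to illustrate that a top 1 play counts more than a top 5 play; they are not the values the website uses.

```python
from collections import defaultdict

# Hypothetical weights: slot 1 of a player's top plays counts the most.
RANK_WEIGHTS = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}


def overweightness(players_top_plays):
    """Sum rank-based points per map over all players.

    players_top_plays: iterable of lists of map ids, best play first.
    """
    scores = defaultdict(int)
    for top_plays in players_top_plays:
        for rank, map_id in enumerate(top_plays, start=1):
            # Plays outside the weighted slots contribute nothing.
            scores[map_id] += RANK_WEIGHTS.get(rank, 0)
    return dict(scores)


def rank_maps(scores):
    """Return map ids sorted from most to least overweighted."""
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    players = [["a", "b"], ["a", "c"], ["b", "a"]]
    scores = overweightness(players)
    print(rank_maps(scores))
```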

Dataset

I ended up with a dataset of ~75000 maps. I sorted it from most overweighted to least overweighted. I took 1000 maps from the dataset (every 75th map). I used oppai on all of them. I calculated star rating: speed stars and aim stars. I also calculated pp, including aim pp and speed pp. I then divided aim values by speed values to get a ratio. If this ratio is higher than 1, it means this map is more aim-based, gives more pp for aim, has more difficulty for aim.
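The sampling and ratio steps could look roughly like this. It assumes the aim and speed values have already been obtained from oppai for each map; `sample_every` and `aim_speed_ratio` are illustrative names, not part of oppai.

```python
def sample_every(sorted_maps, step=75):
    """Take every `step`-th map from a list sorted by overweightness."""
    return sorted_maps[::step]


def aim_speed_ratio(aim, speed):
    """Ratio above 1 means the map is more aim-based, below 1 more speed-based."""
    return aim / speed


if __name__ == "__main__":
    maps = list(range(75000))       # stand-in for the sorted dataset
    sample = sample_every(maps)     # every 75th map
    print(len(sample))
    print(aim_speed_ratio(4.0, 2.0))
```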

I made a scatter plot showing how these ratios between aim difficulty and speed difficulty correlate with overweightness of a map.

Results

Star rating ratio:
(left - more overweighted; right - less overweighted) (Y axis = aim stars / speed stars)

PP values ratio:
(left - more overweighted; right - less overweighted) (Y axis = aim pp / speed pp)

The website I used for the scatter plots builds a trend line automatically. It shows that on average the most overweighted maps have 5% more aim stars than speed stars, and the least overweighted maps have 5% less aim difficulty than speed difficulty. In terms of pp, the most overweighted maps give 25% more pp for aim than for speed, while the least overweighted maps give 5% less pp for speed than for aim.

Conclusion

On average, the best pp plays of most players have 5% more aim difficulty than speed difficulty. Plays that have 5% more speed difficulty are a lot rarer. This in turn makes it so the most overweighted maps give 25% more pp for aim than for speed. 25% makes a 600pp play into a 750pp play.
Keep in mind that on average all maps have a 1:1 ratio of aim to speed difficulty (star rating).

What should we do? We should just buff speed star difficulty by 5%. Or nerf aim difficulty by 5%.
This would make it so on average ALL maps are 1:1 ratio between aim diff and speed diff. Even currently overweighted maps and currently underweighted maps.

I would love to implement this change locally and show you guys how it would affect some maps and some players, but I don't know how to do that. For now I'm only starting a discussion.

VINXIS commented 5 years ago

While i completely agree (and was planning the exact same thing) i think it's important to wait on this a bit until we've finished SR recalc and flow calc as well because just nerfing aim in general will cause a ton of maps that are underweighted in the first place to become underweighted even more.

I tested a few days ago on nerfing short angle jumps, and aim in general, and both did not seem to give satisfying results yet, once those 2 things (sr recalc and flow calc) are done tho i think it would be a good choice to try nerfing jumps

grumd commented 5 years ago

@VINXIS I'll recalculate this stuff as soon as SR changes are applied and oppai is updated then!

VINXIS commented 5 years ago

sounds g

just realized by sr recalc i mean the idea xexxar proposed to use per object data instead of chunks. both sr and flow calc would come after the changes that are being pushed soon are live, but it would still b a good idea to check with the new sr calcs with angles and speed

grumd commented 5 years ago

@VINXIS do you have links for these things you're talking about?

VINXIS commented 5 years ago

no there aren't any yet xexxar is planning to start working on those after afaik

grumd commented 5 years ago

Oh okay. Do you know if there's a way to track whether changes that are pushed live soon are actually live?

VINXIS commented 5 years ago

the dev discord has a ton of discussion daily regarding stuff otherwise i dunno

kk1995 commented 5 years ago

Thanks for the data. From this result, I can see that a section with 50 aim pp is easier for players than a section with 50 speed pp (50 is an example value). However, I see two possible explanations for this.

One is that aim is overweighted. The other is that players trained themselves for aim more than speed, leading to some sort of feedback loop. Both are possible, so I cannot draw a conclusion here.

VINXIS commented 5 years ago

I'd say the 2nd is what can cause aim to be overweighted as well. It can be overweighted because players train themselves for aim more than speed, which I would anecdotally state to be highly possible. Regardless, because of that, aim can as a result become overweighted.

The pp system can only be good for an interval of time as it has to be reactionary to the playing environment. If aim is what the easiest pp maps are, then aim should be nerfed, and others should be buffed (or a similar method). If the playing environment changes, then the pp system also needs to change as a result.

The problem currently is that by reducing aim or any type of jump at the current stage of the pp system (even with angles considered), u can still end up nerfing maps that are what the community deems underweighted. best idea rn would be to wait a bit until other stuff is implemented, proly dont even need to nerf aim by that point

PowerChaos125 commented 5 years ago

I am questioning the validity of your result. From what I see, the 2 graphs are exactly the same. To my understanding this cannot be the case, as pp values scale with the third power of the strain value (i.e. y = x^3), which in turn scales linearly with star rating. Can you provide a dump (~100) of the map details used in your result?
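To make the objection concrete, here is a small sketch that takes the cubic relationship at face value. It assumes pp is purely proportional to the cube of the star value and ignores every other term in the real pp formula; `pp_ratio_from_star_ratio` is a hypothetical helper, not part of any calculator.

```python
def pp_ratio_from_star_ratio(star_ratio, exponent=3):
    """Under a pure power-law model, the aim/speed pp ratio is the
    aim/speed star ratio raised to `exponent`."""
    return star_ratio ** exponent


if __name__ == "__main__":
    for star_ratio in (0.95, 1.0, 1.05):
        print(star_ratio, round(pp_ratio_from_star_ratio(star_ratio), 3))
```

Under this simplified model a star ratio away from 1.0 always maps to a pp ratio further away from 1.0, so the two scatter plots should not look identical.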

grumd commented 5 years ago

@PowerChaos125 oh fuck, I used wrong image file xD

I edited my post to use the correct one.

Feuerholz commented 5 years ago

While the general tendency is obvious regardless, the visual representation is slightly misleading as the absolute value of the aim/speed ratio scales faster for high aim values than for high speed values. (Idk how to phrase this, basically if aim=4 and speed=2, it's 2.0, but if speed=4 and aim=2, it's 0.5, which makes it look like the first case is twice the difference compared to an even ratio than the second, even though in reality both should be the same distance from 1.0).

There's probably some way to fix this by applying a different scale but I'm not sure how rn, however I think it should be done to draw a better conclusion in regards to the severity of this.

EDIT: Could use aim/speed - speed/aim for the y axis, still wouldn't be linear but the distortion would be equal for both directions.

grumd commented 5 years ago

@Feuerholz the fix for that is called a logarithmic scale, but we don't really need it. The only really meaningful thing in this graph is that the trend line isn't flat at 1.0: its coefficient is 1.05 for star rating and 1.24 for pp values. You shouldn't pay any attention to the fact that dots go higher above the line than below it. It doesn't really matter.
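For reference, a short sketch comparing the three ways of plotting the same pair of values discussed here: the raw ratio, the aim/speed - speed/aim difference, and the logarithm of the ratio. The function names are illustrative, and base 2 is an arbitrary choice; any base gives the same symmetry.

```python
import math


def raw_ratio(aim, speed):
    return aim / speed


def ratio_difference(aim, speed):
    # The alternative suggested in the EDIT above: symmetric around 0, not linear.
    return aim / speed - speed / aim


def log_ratio(aim, speed):
    # Symmetric around 0: swapping aim and speed only flips the sign.
    return math.log2(aim / speed)


if __name__ == "__main__":
    for aim, speed in ((4.0, 2.0), (2.0, 4.0)):
        print(aim, speed,
              raw_ratio(aim, speed),
              ratio_difference(aim, speed),
              log_ratio(aim, speed))
```

With aim=4, speed=2 and the swapped case, the raw ratio gives 2.0 and 0.5 (unequal distances from 1.0), while the other two give values of equal magnitude and opposite sign.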

Francesco149 commented 5 years ago

@grumd since you used oppai, if you wanna test the upcoming pp changes with this method I have a branch that implements everything, I tested a few scores and it seems to match the xexxar-all site pretty closely but there might still be bugs https://github.com/Francesco149/oppai-ng/pull/40

you can find windows binaries in the appveyor artifacts if you don't wanna compile it: https://ci.appveyor.com/project/Francesco149/oppai-ng/builds/21409045/artifacts

Feuerholz commented 5 years ago

@grumd Oh log works for that? Of course it does thanks for not functioning brain

The reason I think it might be relevant is that if there is a larger amount of dots below than above the 1 line, but closer to it on average so as to make the line have the tendency that it does, the issues may be more complex than just "aim is overrated", i.e. would mean that heavily aim biased maps are more likely to be overrated, but the average map (not even farm map) is slightly speed biased.

More than that it'd also help visualize the extremity of outliers to both extremes. Both could be easily inferred from tables too, though, but a normalized scale would still be helpful for outliers.

grumd commented 5 years ago

@Francesco149 thanks for this! Using your oppai binaries, here's the graph for aim stars / speed stars.

This is using the exact same maps that the previous graphs used. We can see that underweighted maps have their speed values buffed, but aim values are buffed overall too. Now the average aim/speed ratio is 1.12-1.13 instead of ranging from 1.05 for overweighted maps to 0.95 for underweighted maps.

One important thing: just using the new oppai isn't enough, obviously. I need to recalculate my overweightness values. I still need to wait for the osu main website to update, so I can use the API to recalculate everything on this website first: https://grumd.github.io/osu-pps/#/maps. Then I'll be able to tell which maps are the most overweighted and least overweighted after the changes, and only then will I build a new graph that makes sense. The graph above means nothing so far. After that we probably have to wait a few months for people to find which maps are overweighted with the new pp values and start playing them more. After that my website will get updated to mirror this.