[osu!std] Approximate amount of misses + sliderbreaks in a play

ppy / osu-performance

Calculates user performance aggregates from scores

GNU Affero General Public License v3.0

[osu!std] Approximate amount of misses + sliderbreaks in a play #131

Closed stanriders closed 3 years ago

stanriders commented 3 years ago

This is a port of the misses + sliderbreaks approximation used in delta_t's pp rework. It guesses the number of sliderbreaks based on the achieved combo. This change lacks delta's continuous miss curve (I'm flooring the miss count here) because the value isn't used in any calculation that requires more than an integer.

Example miss-per-achieved-combo graph for a beatmap with 1000 max combo and 200 sliders

Part of https://github.com/ppy/osu/pull/11514

abraker95 commented 3 years ago

The algorithm is probably not doing what you think it is doing.

Let's keep in mind that scoring in osu! counts slider breaks as a 100 score, and not as a miss. Let's use your example of 1000 max combo and 200 sliders, and suppose a player gets 400 combo with the score showing 1 miss. You run this algorithm and it gives you 2 slider breaks. While that is possible, I don't believe that is intended, because it implies that there are 3 points in the play that may have reset combo: 2 that are known from the given information (1 miss, 1 slider break) and 1 that is uncertain (1 slider break). That one slider break is uncertain because it may or may not have happened: you can still get 400 combo and 1 miss if you have 2 slider breaks in a row, for example, but also if you have only 1 slider break in the play.

Combo contains information about misses and slider breaks, and since this approximation uses just combo, it gives you an approximated number of misses and slider breaks. Misses contain information about misses only and not slider breaks. So to get slider breaks you do sliderbreaks = comboBasedMissCount - _numMisses. I believe this was intended to calculate the minimum number of slider breaks possible.

Xexxar commented 3 years ago

> The algorithm is probably not doing what you think it is doing.
>
> Let's keep in mind that scoring in osu! counts slider breaks as a 100 score, and not as a miss. Let's use your example of 1000 max combo and 200 sliders, and suppose a player gets 400 combo with the score showing 1 miss. You run this algorithm and it gives you 2 slider breaks. While that is possible, I don't believe that is intended, because it implies that there are 3 points in the play that may have reset combo: 2 that are known from the given information (1 miss, 1 slider break) and 1 that is uncertain (1 slider break). That one slider break is uncertain because it may or may not have happened: you can still get 400 combo and 1 miss if you have 2 slider breaks in a row, for example, but also if you have only 1 slider break in the play.
>
> Combo contains information about misses and slider breaks, and since this approximation uses just combo, it gives you an approximated number of misses and slider breaks. Misses contain information about misses only and not slider breaks. So to get slider breaks you do sliderbreaks = comboBasedMissCount - _numMisses. I believe this was intended to calculate the minimum number of slider breaks possible.

It runs a max between miss count and slider count, so in your example it'd report only 2 misses, not 3. I thought the same thing at first till I checked the code.

abraker95 commented 3 years ago

> It runs a max between miss count and slider count, so in your example it'd report only 2 misses, not 3. I thought the same thing at first till I checked the code.

My point is that if it's calculating slider breaks, it's not doing it properly. Misses are not slider breaks.

_effectiveMissCount = std::max(_numMiss, static_cast<s32>(std::floor(comboBasedMissCount)));

This runs a max between the miss count and the approximated miss count, not the slider count; beatmap.NumSliders() is the slider count. In fact, according to what I said, comboBasedMissCount should always be greater than or equal to _numMiss, so taking a max between the two makes no sense.

I have written up a doc that derives the formula for calculating slider breaks here

joseph-ireland commented 3 years ago

The PR is approximately right, but it could be better documented, and if you're discretizing it, there's no need for the std::pow stuff that gets floored to 0.

abraker, you've made some incorrect assumptions and mistakes in your calculations.

1) The code isn't calculating slider break count, it's calculating the minimum possible number of combo breaks (including both misses and slider breaks), with a bit of leniency for dropped slider ends.

2) Of course you can have more than the minimum possible for a given max combo; for example, you could have 10 misses at the end of a map.

3) Didn't read the whole paper, but the end result is wrong: you've got 60% combo = 0 combo breaks in your desmos link.

The "theoretical" result if you want minimum possible breaks for given max combo (ignoring slider ends) is

std::floor(beatmapMaxCombo/(scoreMaxCombo+1))

To see why, consider that each combo has a miss at the end (except maybe the last). One case with minimal miss count would be if every miss is after a combo with length maxCombo, with a smaller combo at the end to finish the map. The number of misses/combo breaks is then the number of (maxCombo+1) sections that fit in the map, i.e. the result above.
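
As a rough illustration of the formula above, here is a minimal C++ sketch. The function names minComboBreaks and effectiveMissCount are made up for this example and are not taken from the PR; the PR's own comboBasedMissCount also applies some leniency for dropped slider ends, which this sketch ignores.

```cpp
#include <algorithm>

// Hypothetical helper (not from the PR): lower bound on the number of combo
// breaks (misses + slider breaks) implied by the best combo alone, ignoring
// slider ends: floor(beatmapMaxCombo / (scoreMaxCombo + 1)).
int minComboBreaks(int beatmapMaxCombo, int scoreMaxCombo) {
    // Integer division already floors for non-negative operands.
    return beatmapMaxCombo / (std::max(scoreMaxCombo, 0) + 1);
}

// Hypothetical helper: the recorded miss count can exceed the lower bound
// (e.g. many misses at the end of the map), so keep the larger of the two.
int effectiveMissCount(int numMiss, int beatmapMaxCombo, int scoreMaxCombo) {
    return std::max(numMiss, minComboBreaks(beatmapMaxCombo, scoreMaxCombo));
}
```

With the 1000-combo example, minComboBreaks(1000, 400) is 2, so a score with 1 recorded miss gets an effective miss count of 2. A score with a 990 combo followed by 10 misses has a lower bound of only 1, and the max keeps the recorded 10.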

bdach commented 3 years ago

I think @joseph-ireland is correct in saying that @abraker95's formula is relying on a faulty assumption. After reading through the attached doc I think the error is not including the misses in each combo streak section. In fact, if I correct @abraker95's initial assumption to include the fact that a play with N segments can have either N-1 or N breaks, I get two results, the lower bound of which is given by @joseph-ireland.

[Screenshot: 2020-12-19-134647_733x251_scrot]

For an example of the worst case, consider a play on an 18-hitobject map made up of 2 streaks of 8 hits, each followed by a miss.
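
The worst-case example can be checked with a small simulation. This is only a sketch under a simplifying assumption: every hitobject is either hit (adding 1 to combo) or missed (resetting combo), with no slider ticks or slider ends. PlaySummary, summarize and worstCasePlay are names invented for this illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical summary type for this sketch.
struct PlaySummary {
    int maxCombo;     // longest run of consecutive hits
    int comboBreaks;  // number of missed objects
};

// Walk a play encoded as one bool per object: true = hit, false = miss.
PlaySummary summarize(const std::vector<bool>& hits) {
    PlaySummary s{0, 0};
    int current = 0;
    for (bool hit : hits) {
        if (hit) {
            ++current;
            s.maxCombo = std::max(s.maxCombo, current);
        } else {
            ++s.comboBreaks;
            current = 0;
        }
    }
    return s;
}

// Build the example play: two streaks of 8 hits, each followed by a miss.
std::vector<bool> worstCasePlay() {
    std::vector<bool> hits;
    for (int streak = 0; streak < 2; ++streak) {
        hits.insert(hits.end(), 8, true);
        hits.push_back(false);
    }
    return hits;
}
```

The play has 18 objects, a max combo of 8 and 2 combo breaks: 2 segments with 2 breaks, which also matches floor(18/(8+1)) = 2.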

joseph-ireland commented 3 years ago

BTW, the continuous version in delta_t's rework can be justified like this:

Assume optimistically that their best combo is their average performance on the map, i.e. they miss a fraction 1/(maxCombo+1) of the notes. Then on average they'd miss beatmapMaxCombo/(maxCombo+1) notes over the whole map, the same as above without the floor in there.

This gives fairly similar results, but it avoids the discrete jumps at certain combo values, which seem a bit weird.

It does (usually) give you a non-integer estimate of the number of misses though, which could also seem weird.
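
To make the comparison concrete, here is a small sketch of both variants side by side. The function names are invented for this example, and this is not delta_t's actual code, only the expression described above with and without the floor.

```cpp
// Hypothetical helper: continuous estimate, beatmapMaxCombo / (scoreMaxCombo + 1).
// Assumes scoreMaxCombo >= 0.
double continuousMissEstimate(int beatmapMaxCombo, int scoreMaxCombo) {
    return static_cast<double>(beatmapMaxCombo) / (scoreMaxCombo + 1);
}

// Hypothetical helper: the same expression floored to an integer.
// Integer division floors for non-negative operands.
int flooredMissEstimate(int beatmapMaxCombo, int scoreMaxCombo) {
    return beatmapMaxCombo / (scoreMaxCombo + 1);
}
```

On a 1000-combo map, the floored version jumps from 2 to 3 when the best combo drops from 333 to 332, while the continuous version moves smoothly through 3 at that point. For a 400 combo the continuous estimate is 1000/401, roughly 2.49, where the floored one gives 2.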

abraker95 commented 3 years ago

@joseph-ireland

> The code isn't calculating slider break count, it's calculating the minimum possible number of combo breaks (including both misses and slider breaks), with a bit of leniency for dropped slider ends.

See, I thought it was supposed to calculate slider break count because @stanriders wrote:

> This is a port of sliderbreak approximation that's being used in delta_t's pp rework. It guesses amount of sliderbreaks based on achieved combo.

But judging by the comment in the code, I think you are right and it's supposed to be misses + slider breaks. In that case the "Combo breaks to slider breaks" section in my doc is irrelevant to this.

@joseph-ireland

> Didn't read the whole paper, but the end result is wrong, you've got 60% combo = 0 combo breaks in your desmos link. The "theoretical" result if you want minimum possible breaks for given max combo (ignoring slider ends) is
>
> std::floor(beatmapMaxCombo/(scoreMaxCombo+1))
>
> To see why, consider that each combo has a miss at the end (except maybe the last). One case with minimal miss count would be if every miss is after a combo with length maxCombo, with a smaller combo at the end to finish the map. The number of misses/combo breaks is then the number of (maxCombo+1) sections that fit in the map, i.e. the result above.

This took a bit to wrap my head around, but I see what you mean. I made the mistake of assuming that beatmapMaxCombo is perfectly divisible by scoreMaxCombo, which is rarely the case. Basically I was working with

instead of

@bdach

> I think @joseph-ireland is correct in saying that @abraker95's formula is relying on a faulty assumption. After reading through the attached doc I think the error is not including the misses in each combo streak section. In fact, if I correct @abraker95's initial assumption to include the fact that a play with N segments can have either N-1 or N breaks

Right, I assumed that if there is a combo break then the segment must split, which turns out to be wrong in the case of the first and last segments.

Since the doc is a good starting point for documentation, I'll make corrections to it. I'll also enable the ability for others to make suggestions in it in case other mistakes are found.

joseph-ireland commented 3 years ago

Sorry, maybe the last comment wasn't too clear. I read through your corrected document, and I'm not really following the reasoning in there. You still seem to be assuming that the sum of all combos has to equal beatmapMaxCombo, which isn't true (misses take away from potential combo), and I'm not following the reasoning after that point.

In my example above, the +1 comes from the note that gets missed. For a given combo of length C with a miss at the end, the actual number of notes involved is C+1.

For min number of misses you can assume all combos are length scoreMaxCombo (except the last with no miss at the end, with 0 <= length <= scoreMaxCombo). This means there are a number of (scoreMaxCombo+1) length sections in there, with scoreMaxCombo hits followed by a miss. For your picture, I think the X needs to have a bit of width to show it takes up some space. Once you've got that far, I think it's pretty clear that the number of those sections with a miss at the end is beatmapMaxCombo/(scoreMaxCombo+1), and you round down because any remainder corresponds to the final smaller combo, which won't have a miss at the end.
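
The counting argument above can be written out as a short derivation. The symbols are shorthand introduced here, not taken from the thread: M is beatmapMaxCombo, C is scoreMaxCombo, N is the number of combo breaks and r is the length of the final combo that has no miss after it. Slider ends are ignored, as in the text.

```latex
M = N\,(C + 1) + r, \qquad 0 \le r \le C
\quad\Longrightarrow\quad
N = \left\lfloor \frac{M}{C + 1} \right\rfloor
```

Since r is strictly smaller than C + 1, it is exactly the remainder of dividing M by C + 1, which is why the quotient is rounded down.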

FWIW, for the formulation you created with the sum of all combos being a constant, I think your previous document wasn't far off; I think you needed a ceil rather than a floor, i.e. ceil(beatmapMaxCombo/scoreMaxCombo)-1, which ends up being pretty similar.

In that case you are just calculating (number of combos - 1); you need to round up since the final smaller combo still counts.
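
For comparison, here is a sketch of the two formulations side by side. The function names are invented for this example, and both assume scoreMaxCombo is at least 1.

```cpp
#include <algorithm>

// Hypothetical helper: "sum of all combos is constant" model,
// ceil(beatmapMaxCombo / scoreMaxCombo) - 1, i.e. (number of combos - 1).
int breaksFromComboSum(int beatmapMaxCombo, int scoreMaxCombo) {
    const int c = std::max(scoreMaxCombo, 1);
    // (a + b - 1) / b is the integer ceiling of a / b for positive operands.
    return (beatmapMaxCombo + c - 1) / c - 1;
}

// Hypothetical helper: model where each missed note occupies one object,
// floor(beatmapMaxCombo / (scoreMaxCombo + 1)).
int breaksWithMissedNotes(int beatmapMaxCombo, int scoreMaxCombo) {
    return beatmapMaxCombo / (std::max(scoreMaxCombo, 1) + 1);
}
```

Both give 2 for a 400 combo on a 1000-combo map, but they can differ: with beatmapMaxCombo = 10 and scoreMaxCombo = 3 the first gives 3 and the second gives 2, because 3 hits, a miss, 3 hits, a miss and 2 hits already fill all 10 objects.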

abraker95 commented 3 years ago

> Sorry, maybe the last comment wasn't too clear. I read through your corrected document, and I'm not really following the reasoning in there. You still seem to be assuming that the sum of all combos has to equal beatmapMaxCombo, which isn't true (misses take away from potential combo), and I'm not following the reasoning after that point.
>
> In my example above, the +1 comes from the note that gets missed. For a given combo of length C with a miss at the end, the actual number of notes involved is C+1.

Ohhh, ok. I totally missed that the combo number doesn't include the missed note.

> You still seem to be assuming that the sum of all combos has to equal beatmapMaxCombo, which isn't true (misses take away from potential combo), and I'm not following the reasoning after that point.

Yea, so it's not really combo I thought I was dealing with. If a section is defined to be notes in combo + missed note, then what I wrote would sorta make sense.

> For min number of misses you can assume all combos are length scoreMaxCombo (except the last with no miss at the end, with 0 <= length <= scoreMaxCombo).

As I went over things again, I found that assuming all combos to be length scoreMaxCombo only works if beatmapMaxCombo is perfectly divisible by scoreMaxCombo. Otherwise you have a section at the end of length mod(beatmapMaxCombo, scoreMaxCombo). I have detailed how to account for that in the doc.