ppy / osu-performance

Calculates user performance aggregates from scores
GNU Affero General Public License v3.0

Taiko Star Rating/Performance System Suggestions #61

Open Alchyr opened 6 years ago

Alchyr commented 6 years ago

Sorry if my wording is a bit awkward, I don't really write long formal explanations like this often.

Alright, so first of all, the issues with taiko star rating/pp system right now -

So, I'll do a quick breakdown of what causes most of these issues. Taiko star rating is calculated through strain, similarly to standard. Each object decays in strain based on the distance from the previous object (this results in higher star rating on faster maps), adds a fixed amount of strain, and then other bonuses, of which there are two - color and rhythm, each of which has some problems.
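As a rough illustration of the strain model described above, here is a minimal Python sketch. It assumes the decay is exponential in the time since the previous object; the function name, the decay base, and the base strain value are made up for illustration and are not the constants used by the actual calculator.

```python
def strain_series(hit_times_ms, decay_per_second=0.3, base_strain=1.0, bonus_fn=None):
    """Return one strain value per object.

    decay_per_second, base_strain and bonus_fn are illustrative assumptions,
    not values taken from the real difficulty calculator.
    """
    strains = []
    current = 0.0
    previous_time = None
    for index, time_ms in enumerate(hit_times_ms):
        if previous_time is not None:
            elapsed_s = (time_ms - previous_time) / 1000.0
            # Assumed exponential decay: the longer the gap, the more strain is lost.
            current *= decay_per_second ** elapsed_s
        # Every object adds a fixed amount of strain...
        current += base_strain
        # ...plus any color/rhythm bonus supplied by the caller.
        if bonus_fn is not None:
            current += bonus_fn(index)
        strains.append(current)
        previous_time = time_ms
    return strains
```

With these assumed values, objects spaced 100 ms apart end up with higher strain than objects spaced 400 ms apart, which is the "faster maps get higher star rating" effect mentioned above.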

So, first, rhythm. The rhythm bonus occurs any time the gap between two objects differs from the previous gap by more than a certain margin. If the gap changes by a factor that is not a power of two (1/2, 2x, 1/4, 4x, etc.), a fixed bonus is added to strain. This results in doubles being overweighted, since the bonus applies to every single note if they're spaced properly, and in 1/6+1/4 being overweighted, since the bonus applies every time the pattern switches between the two.
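To make the current rhythm rule concrete, here is a small Python sketch of it as I read the description above. The function name, the tolerance, and the bonus value are assumptions, and checking the ratio in log space is just one convenient way to test for "1/2, 2x, 1/4, 4x, etc."

```python
import math


def current_rhythm_bonus(previous_gap_ms, current_gap_ms, tolerance=0.05, bonus=1.0):
    """Fixed bonus when the gap changes by a factor that is not 1/2, 2x, 1/4, 4x, ...

    tolerance and bonus are illustrative assumptions.
    """
    if previous_gap_ms <= 0 or current_gap_ms <= 0:
        return 0.0
    # log2 of the ratio is 0 for an unchanged gap and a whole number
    # for halving, doubling, quartering, and so on.
    log_ratio = math.log2(current_gap_ms / previous_gap_ms)
    distance_to_whole = abs(log_ratio - round(log_ratio))
    if distance_to_whole <= tolerance:
        return 0.0
    return bonus
```

In a run of doubles the gap alternates between, say, 100 ms and 300 ms, so the ratio is 3x or 1/3 on every note and this function returns the bonus every time, which is the overweighting described above.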

Second, color. The color bonus occurs any time a map goes from an even number of one color to an odd number of the other color, or vice versa. Like the rhythm bonus, the color bonus is a fixed value, which results in patterns that swap color in this way being worth much more than patterns that do not. This is what causes the overweighting of patterns such as kddkddk, and it also contributes to the overweighting of 1/6+1/4 patterns, which often switch between even and odd numbers of objects because of the way they are usually constructed.
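Here is a simplified Python sketch of the current color rule described above. It works on a string of "k" and "d" characters, compares the parity of each run of one color with the run before it, and adds a fixed bonus whenever the parity flips. The function name and bonus value are assumptions, and it ignores timing entirely.

```python
from itertools import groupby


def current_color_bonus_total(colors, bonus=1.0):
    """Total fixed bonus for a color string such as "kddkddk".

    bonus is an illustrative assumption.
    """
    # Length of each run of consecutive same-colored objects.
    run_lengths = [len(list(group)) for _, group in groupby(colors)]
    total = 0.0
    for previous, current in zip(run_lengths, run_lengths[1:]):
        # Even -> odd or odd -> even between neighboring runs earns the bonus.
        if previous % 2 != current % 2:
            total += bonus
    return total
```

Under this sketch kddkddk earns the bonus on every color change, while kdkdkdk and kkddkkdd earn nothing.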

The way strain was calculated was fine at first, since most older taiko maps were mapped in a similar style to the official taiko games (I think? Don't quote me on this.), in which the kinds of patterns that would result in excessive boosting were relatively uncommon. It would probably also be fine for the actual taiko game, where such patterns would likely be much more difficult. But with the way osu!taiko has developed, it's just not really accurate enough anymore.

So, proposals as to how to adjust these - For rhythm, rather than a fixed bonus regardless of change, different fixed degrees of change result in different fixed bonuses. Instead of speedup and slowdown being treated equally, slowdown will generally be worth less.
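A minimal Python sketch of the rhythm proposal above could look like this. Every ratio and bonus value in the table is hypothetical; the only properties taken from the text are that different degrees of change get different fixed bonuses and that a slowdown is worth less than the matching speedup.

```python
# Hypothetical bonus table keyed by gap ratio (current gap / previous gap).
# Ratios below 1 are speedups, ratios above 1 are slowdowns.
RHYTHM_BONUSES = {
    2 / 3: 0.8,  # e.g. 1/4 -> 1/6 speedup
    3 / 2: 0.4,  # e.g. 1/6 -> 1/4 slowdown, worth less than the speedup
    1 / 3: 0.6,  # larger speedup
    3 / 1: 0.3,  # larger slowdown
}


def proposed_rhythm_bonus(previous_gap_ms, current_gap_ms, tolerance=0.05):
    """Look up a fixed bonus for the observed change in gap.

    The table values and the relative tolerance are illustrative assumptions.
    """
    if previous_gap_ms <= 0 or current_gap_ms <= 0:
        return 0.0
    ratio = current_gap_ms / previous_gap_ms
    for known_ratio, bonus in RHYTHM_BONUSES.items():
        # Accept ratios within a small relative margin of a table entry.
        if abs(ratio - known_ratio) <= tolerance * known_ratio:
            return bonus
    return 0.0
```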

Color is a bit more complicated. Instead of only giving a bonus when the number of objects changes from odd to even or even to odd, the bonus occurs each time the color switches. It is based on the number of objects of the same color and reduced by a few factors: even numbers of objects are worth less, and if the same number of objects is repeated multiple times (tracked separately for each color), the value is decreased for each repetition.
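The color proposal above can be sketched roughly as follows in Python. The text does not say how the bonus depends on the run length, so this sketch assumes it grows with length up to a cap; that choice, the function name, and all the factors are assumptions. For simplicity every run gets a value here, including the first and last one.

```python
from itertools import groupby


def proposed_color_bonuses(colors, per_object=0.2, max_bonus=1.0,
                           even_factor=0.5, repeat_factor=0.7):
    """Return one bonus per run of same-colored objects.

    All numeric parameters are illustrative assumptions.
    """
    bonuses = []
    last_length = {}   # color -> length of that color's previous run
    repeat_count = {}  # color -> how many times in a row that length has repeated
    for color, group in groupby(colors):
        length = len(list(group))
        # Assumed: bonus grows with run length, up to a cap.
        bonus = min(max_bonus, per_object * length)
        # Even-length runs are worth less.
        if length % 2 == 0:
            bonus *= even_factor
        # Repeating the same run length for this color decays the bonus further.
        if last_length.get(color) == length:
            repeat_count[color] = repeat_count.get(color, 0) + 1
        else:
            repeat_count[color] = 0
        bonus *= repeat_factor ** repeat_count[color]
        last_length[color] = length
        bonuses.append(bonus)
    return bonuses
```

For kddkddk, each repeated "k" and each repeated "dd" is worth less than the one before it, so the pattern no longer collects a full bonus on every switch.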

For this, instead of going from start to end, it goes in order from the highest strain object to the lowest strain object in the entire map. For each object, it "dequalifies" all objects within a fixed distance from that object, so after getting the highest strain object at a point, all the nearby objects with similar but slightly lower strain would not be counted when it reaches them.
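The selection step described above can be sketched in Python like this. It assumes the "fixed distance" is measured in time; the function name and the window size are assumptions, and the nested loop is written for clarity rather than speed.

```python
def select_strain_peaks(times_ms, strains, window_ms=1000.0):
    """Pick objects from highest to lowest strain, skipping disqualified neighbors.

    window_ms is an illustrative assumption. Returns the indices of the
    objects that were counted, in the order they were picked.
    """
    # Visit objects from highest strain to lowest strain.
    order = sorted(range(len(strains)), key=lambda i: strains[i], reverse=True)
    disqualified = [False] * len(strains)
    peaks = []
    for i in order:
        if disqualified[i]:
            continue
        peaks.append(i)
        # Disqualify every object within the window around the chosen one
        # (this also marks the chosen object itself).
        for j in range(len(times_ms)):
            if abs(times_ms[j] - times_ms[i]) <= window_ms:
                disqualified[j] = True
    return peaks
```

For example, with times `[0, 500, 1000, 5000, 5400]` and strains `[1, 3, 2, 2.5, 2.4]`, only the objects at 500 ms and 5000 ms are counted, since their neighbors have similar but slightly lower strain.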

That's pretty much it for star rating, so next is some comments on how pp value of a map is calculated.

I haven't gone nearly as in-depth into how this is calculated, but I do have some things that I would suggest adjusting.

~- OD is worth way too much. Right now, the difference in value between plays is affected far too severely by the OD of the map. As an example: a 7-star, 2500-object play with 99% acc at OD 7 = 385 pp, while a 7-star, 2500-object play with 99% acc at OD 10 (or 9.8, which is what you get with OD 7 + HR) = 456 pp. While the higher OD certainly does make it much more difficult to get such good accuracy, the difference right now is just a bit TOO high. As a much more extreme example: a 3-star, 2500-object SS at OD 4 is 167 pp, which is probably still reasonable. But the same map at OD 10 (and yes, I know a 3-star map with OD 10 would never get ranked) is worth 300 pp, which is way too much for a 3-star map. Overall, it just needs a bit of toning down.~

Alright, after some thought, OD changes are probably unnecessary to make right away. With the way OD's effect on pp is calculated, the other changes would already have a pretty large effect, so we can see later whether OD is still a problem.

Explanation of overall results of these changes - 1/6+1/4 maps have their star rating decreased. It will still be relatively high, but this will be balanced out by the weighted object count which means that maps that only have a few short bursts of 1/6 inflating the star rating will be worth much less, while maps that use a large amount and are therefore more challenging should still be worth a reasonable amount.

Some extreme examples of technical maps will still be underweighted, and some probably more than before (ex. https://osu.ppy.sh/s/742538), but it's hard to avoid this.

Speed maps are generally higher in star rating due to multiple factors (with bonus from rhythm and color reduced to some degree, more of the total value of maps comes from the base of just speed)

Due to reduction of bonuses from rhythm, certain maps where easier diffs have a higher star rating than the difficulty above them are mostly fixed. (ex. https://osu.ppy.sh/s/138886 where futsuu > muzu)

Probably some other effects, but I can't think of anything else important off the top of my head.

I'd appreciate it if you would comment thoughts, or any ideas of other improvements.

If you want to see an example of implementation - GitHub repository of the Visual Studio project - https://github.com/Alchyr/taiko This is coded in Visual Basic, mostly for convenience of the data display, but the method of implementation should be compatible with how star rating is currently coded; it would just need to be ported to the correct language (and cleaned up a bit). While the star rating can be treated as relatively balanced, the pp value is mostly just experimental.

Direct download of tool - https://drive.google.com/file/d/1TefHv5g1UCYuFt0U09wHzMlS7e1hPJDz/view?usp=sharing This is just the compiled .exe from that project. It calculates both the old and new star rating as well as some other values for each taiko map in your osu!/songs folder. If you don't trust this, just use the other link and compile it yourself after checking the code if you want.

Well, thank you for taking the time to read this. Hopefully my explanations made sense to you and weren't too poorly worded or confusing.

Tom94 commented 6 years ago

Thanks a lot for the detailed write-up and experimentation! Your points make a lot of sense and I think they make the current system better. :)

If you are willing to implement your algorithm within lazer, let it run over all beatmaps (I believe @smoogipoo made a tool allowing you to do just that), and then run osu-performance using that difficulty data I'd be happy to replace the current algorithm with yours.

pmpmjones commented 6 years ago

Nice work, I definitely think this is an improvement over what we currently have in the live client. I honestly feel like longer maps should get devalued when they mainly consist of filler with a hard section in the middle. OD is also heavily broken. A 3-star map with OD 10 can award 300 pp like you suggested, but it's even more absurd having an OD 10 + DT 1-star map giving you 500 pp. The SR changes are for the better and I really think improving how pp is awarded is a must. If this is planned to be implemented into the game, will converts still be a thing?

Raidencio commented 6 years ago

wow, props to you

TooruAkagi commented 6 years ago

I'm Toorun12, an osu!taiko player, and I want to share my opinion. I'm not in a strong position to argue, so I will refer to the top 10 players' records. I don't think devaluing high OD is a good idea. Look at this:

  1. shinchikuhome / HD:0 HR:3 EZ:2
  2. GNKait / HD:91 HR:58 EZ:7
  3. n1doking / HD:73 HR:44 EZ:5
  4. ekumea1123 / HD:61 HR:12 EZ:14
  5. kiyozi11 / HD:29 HR:1 EZ:4
  6. applerss / HD:68 HR:0 EZ:0
  7. tasuke912 / HD:73 HR:0 EZ:38
  8. shakeitdance / HD:4 HR:0 EZ:3
  9. kei821 / HD:27 HR:12 EZ:8
  10. _Rise / HD:99 HR:18 EZ:12

This is a count of how many HD, HR, and EZ plays each of the top 10 players has listed under "Best Performance". Clearly HD is used more than HR; only the 1st player, shinchikuhome, is an exception. Players ranked 11th to 100th also tend to use HD more than HR, and EZ is often used more than HR. This clearly indicates that high OD is not an effective way to earn pp. Please, please reconsider, and give high OD preferential treatment over HD instead.

I hope we can create a good environment for earning pp in which a wider variety of mods is used.

That's all. I am not a strong player, but I am an osu!taiko player, and I'd be glad if you can use this as a reference.

ARGENTINE-DREAM commented 6 years ago

Everything about SR is better than the current system. But there's a major problem with the high BPM buff: converts, which are not designed for the game mode and have extreme stream patterns that are easily performed with special methods (TL beating, etc.), will be extremely overweighted as well. So now that we are talking about PP as well, you should include a rule stating that maps whose gamemode is equal to "0" have their PP value nerfed for Taiko.

moai199733 commented 6 years ago

Considering it's a music game, further reducing the weight of accuracy seems unreasonable. Additionally, changing SR and OD at the same time could cause a terrible collapse of pp balance.

pmpmjones commented 6 years ago

Top players using HR less than HD doesn't justify buffing OD. HR isn't hard because of the OD; it's hard, and used less than HD, because harder maps have higher BPM and higher SV, which makes HR not viable, especially when high-SV HD is basically the same as playing nomod. Most of the top 10 players' plays are on high-BPM maps, which translates to higher SV. I believe OD just needs to be toned down a little.

ekumea1123 commented 6 years ago

In the current state, I think that HD is easier than HR. Despite this, decreasing the pp that can be earned with HR is very unreasonable.

nyanmi-1828 commented 6 years ago

Regardless of difficulty (not only at high difficulty), HR is definitely at a disadvantage, so HR should receive additional points based on SV. In addition, if you are improving the current pp system, high-density maps should be given more pp from now on.

nulltarou commented 6 years ago

If the influence of OD on pp is reduced, players will no longer play HR. Furthermore, osu!taiko is a rhythm game. In other words, competing on accuracy is fundamental. If you reduce the impact of OD on pp, accuracy will affect pp even less, and it is no longer a rhythm game. In general, HR is a mod that makes both SV and OD harder. However, in the current pp system only OD affects pp. That is, SV is discarded. Rather, I think that a pp bonus for higher SV needs to be added.

mario1393 commented 6 years ago

It's a problem that EZHDDT gets more pp than DT only, so I think EZ should be nerfed.

Darginn commented 6 years ago

I had an interesting thought myself. Since sliders, spinners, and finishers aren't required, I don't think they should be counted toward star rating and such. However, maybe they could give a small pp bonus for hitting them: nothing major, just a small reward other than score for hitting the optional notes.

A past comment noted that SV should be taken into account in terms of map difficulty. I agree; however, the problem is how much you want to weigh it, given that some people cheat using a very hard-to-detect method called dualscreening. While most players who do this are blatantly obvious, there are likely a select few who know how to use this method effectively without getting caught. Another conflicting factor is the comfort zone of players: some can only read HR and some cannot read nomod, and don't even get me started on EZ. It would have to be scaled too, since slow SV is at times extremely hard in itself. I would say that if any of the variables were based on SV, it would have to scale up in two directions from a single point and also take object count into consideration. My thoughts are: slow/fast SV with low object density is less straining, while slow/fast SV with higher object count is more straining. The base or null SV would simply be the base 1.4x SV provided by the BPM. The final factor that could be considered is mid-map SV changes, which would likely have to be handled individually.
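One way to picture "scaling up in two directions from a single point" is the Python sketch below. It treats 1.4x as the neutral SV, measures the deviation from it in log space so that slower and faster SV are penalized symmetrically, and lets object density amplify the effect. The function name, both weights, and the use of log space are assumptions, not something specified in this thread.

```python
import math


def sv_strain_factor(sv_multiplier, objects_per_second, base_sv=1.4,
                     sv_weight=0.1, density_weight=0.05):
    """Multiplier that is 1.0 at the base SV and grows for slower or faster SV.

    sv_weight and density_weight are illustrative assumptions.
    """
    # Distance from the base SV in log space: 0.7x and 2.8x are equally far from 1.4x.
    deviation = abs(math.log2(sv_multiplier / base_sv))
    # Higher object density makes unusual SV more straining.
    return 1.0 + sv_weight * deviation * (1.0 + density_weight * objects_per_second)
```

With these assumed weights, a map at 1.4x SV gets no adjustment, 0.7x and 2.8x get the same adjustment, and the adjustment is larger when more objects per second have to be read.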

Personally I'd ask to make it so that if mode=0 then pp=0, but that's just me hating converts.

My final comment is on OD. In terms of PP calculation, I don't think it really needs much there. My opinion on OD is that the hit range itself needs a buff so that the massive bonuses are rightfully rewarded.

I hope some of my thoughts help, overall I applaud the work you're putting into this and hope it turns out well. I don't want to beat in the speed meta issue more than it already has, but it is currently something that might need a bit of tweaking still.

michael-reyfman commented 6 years ago

I think that in some cases OD should have an impact on star rating. One example of this is smashing various 1/3+1/6+1/4 stream combinations as a single stream without putting a significant accent on rhythm. With low OD, you can do that easily with a minimal number of 100s, like in HELIX, where it's relatively easy to smash the 1/6 doubles like an ordinary 1/4 stream with NM, EZ, DT or EZDT, but it's much harder to get all 300s with HR because high OD narrows the hit windows for 300s and 100s. Therefore, HELIX should remain 5.5 with nomod and become ~5 with EZ, ~6-6.5 with HR, ~6.5-7 with DT, ~6.5 with EZDT and ~7.5-8 with DTHR. This can solve the problem of EZHDDT being worth more than DT only (like @mario1393 said).

About the effect of SV, I think that if we want to implement this, we should also take memorization into account, so that a hypothetical map with SV 1000.00 (impossible for a human to read) doesn't reward more with FL/HDFL, because it demands the same memorization skills as a map with an extremely low SV (0-0.1). The same goes for maps with extreme SV changes (like donkama2000), so we need to calculate a so-called "memorization index" that varies from 0 to 1 with such points:

Having this index can help us calculate PP more precisely for gimmick maps, even when including HD/FL/HDFL. Let me show an example for an arbitrary 180bpm map:

SV 1.4: 5.20    SV 1.4+FL: 5.65    SV 1.4+HDFL: 5.90*
SV 2.8: 5.65    SV 2.8+FL: 5.90    SV 2.8+HDFL: 5.91*
SV 10:  5.92    SV 10+FL:  5.94    SV 10+HDFL:  5.95*

I hope that what I described above helps us improve the system.