ppy / osu-tools

command line tools to get stuff done
MIT License
105 stars 78 forks source link

Reconsider mods filtering in `PerformanceCalculator` looking forward to Lazer #216

Closed minisbett closed 1 month ago

minisbett commented 1 month ago

Currently, the legacy mods are masked to only include "relevant" mods. By the current code, "relevant" is defined as in having an affect on the star rating. This causes trouble because some mods, whilst not changing the star rating, affect PP. Mods like hidden, who are crucial to include, are simply ignored.

This currently causes trouble with usage in combination with the website https://pp.huismetbenen.nl/ that is a key element in PP development. The maintainer currently runs a "fixed" version locally, although managing it is troublesome. Helix has tried PRing this before in #209 but I decided to pick it up in a separate PR to address it properly.

Instead, I decided to remove it completely since there's no hurt in inputting mods that don't affect PP into the calculator. It'd also mean that if changes happen, eg. HD affecting star rating through the introduction of a proper reading skill, there's no extra effort in having to keep osu-tools up with it.


Here's a simple example to see the change (example score is mrekk's save me, which is 1665pp):

Before

dotnet run -- simulate osu 1256809 -m HD -m DT -c 4292 -G 85 -M 1 → 1,577pp dotnet run -- simulate osu 1256809 -m DT -c 4292 -G 85 -M 1 → 1,577pp

After

dotnet run -- simulate osu 1256809 -m HD -m DT -c 4292 -G 85 -M 1 → 1,644pp dotnet run -- simulate osu 1256809 -m DT -c 4292 -G 85 -M 1 → 1,577pp

You can safely ignore the 1pp difference, that's normal due to the 1665pp being calculated through osu-performance and having tiny decimal differences on some scores.

minisbett commented 1 month ago

If there's any actual reason for the masking I'm overseeing (everything works, so I'm not sure what I could oversee) we can of course re-evaluate how this could be implemented. I saw smoogi suggesting some extra parameter, but I'm not sure how methodically correct it is to filter out mods like HD in the first place if they affect PP, as that value is much rather a focus than the SR is.

smoogipoo commented 1 month ago

We should instead filter mods by Mod.Ranked.

Implementation should be exactly the same as https://github.com/ppy/osu-queue-score-statistics/blob/02e1373442c394c9ca42c4c0a3520bfe3bdb18ce/osu.Server.Queues.ScoreStatisticsProcessor/Processors/ScorePerformanceProcessor.cs#L171-L183

Also, the mapping from DC -> HT should be able to be removed.

minisbett commented 1 month ago

I think it might be more handy to not filter the mods instead of only including ranked mods / mod combinations.

osu-tools is by far most used for PP development, and whenever a new mod(-configuration) is considered for development it'd mean you need to explicitly add exceptions for that.

I still think it's methodically more correct to just throw all mods into it and have it handle whatever mods the PP code already considers for, instead of constraining it to only include ranked mod combinations/configurations.

smoogipoo commented 1 month ago

Hmmm... I'm a little bit skeptical because this is also used when you're batching data, like via profile, where scores may be coming in that are totally irrelevant to the mods you're developing for. I'll step aside for now and assume that, because the scores are unranked and thus 0pp, they are unlikely to appear near the top of a user's leaderboard to affect the actual development process.

Just to hint at my thought process, I was initially going to respond with:

If you're developing PP for mods, then wouldn't you also be able to set Ranked = true at the same time? Because you'd be working with a local checkout of osu! anyway. Otherwise, maybe we can have a --assume-ranked option for the CLI that bypasses it?

smoogipoo commented 1 month ago

The other part of my comment still needs to be addressed though. This isn't enough to support lazer-first scores for as long as they're being remapped through LegacyMods.

Likely what you should do is restore the deleted method and keep the mapping only for what looks to be the single path using online attributes (and thus can be implemented locally):: https://github.com/ppy/osu-tools/blob/7c639757aeb8f3e35581e00a6b46c7d1022b26b2/PerformanceCalculator/Performance/ScorePerformanceCommand.cs#L49-L50

minisbett commented 1 month ago

Likely what you should do is restore the deleted method and keep the mapping only for what looks to be the single path using online attributes (and thus can be implemented locally)::

In that case, I think the masking would still need adaption to specifically include HD when FL is disabled too, as it is still being filtered under the idea that it does not affect star rating without FL.

Besides, shouldn't the filtering be removed here? I don't see why Lazer mods should be ignored completely on an online score. https://github.com/ppy/osu-tools/blob/7c639757aeb8f3e35581e00a6b46c7d1022b26b2/PerformanceCalculator/Performance/ScorePerformanceCommand.cs#L55

In general, I don't quite get why there's filtering happening in a lot of places when it no longer should, since some Lazer mods are now already considered by PP calculation. For example here, there's an argument to include/exclude CL, but I don't think this makes sense for the simulate command and instead CL should be able to be specified the same way as other mods, with -m CL. Right now, it seems like "No Classic Mod" means no filtering should happen, but the behavior of this feels weird to me.

https://github.com/ppy/osu-tools/blob/master/PerformanceCalculator/Simulate/SimulateCommand.cs#L60

My thoughts is:

  1. Remove the --no-classic parameter from the simulate command as classic should be able to be specified via -m CL
  2. Remove the filtering of mods for legacy-only mods on the simulate command as Lazer mods should be able to be specified similarily
  3. Remove filtering from the profile command
  4. I'm not sure about the practicality of the --online-attributes in the score command. Why would you want to assume live difficulty attributes instead of whatever happens locally (It'd possibly mix old difficulty attributes with a new PP calc code)? What if the difficulty attributes are not compatible due to the addition of new ones? Is that parameter necessary/what was the intention of implementation? My thought is that if specified, you would need to filter for legacy mods as the API only provides difficulty attributes for legacy mods. But for local difficulty attributes, I'd get rid of any filtering too.
minisbett commented 1 month ago

I've gone ahead and implemented my previously mentioned thoughts because I think it is easier to think about them when you see them in action/in code.

I've roughly tested everything and it works as expected, except for the legacy-score-attributes command where I either can't figure out what I'm seeing or it returns wrong/confusing values. But it does return the same before and after this change.

I can't seem to figure out why it returns 9.1 million for the combo score.

C:\Users\mini\source\repos\minisbett\osu-tools\PerformanceCalculator>dotnet run -- legacy-score-attributes 4801028
ruleset: osu

╔═══════════════════════════════════════════════════╤════╤══════════════╤═══════════╤═════════════════╤══════════════╗
║Beatmap                                            │Mods│Accuracy score│Combo score│Bonus score ratio│Mod multiplier║
╟───────────────────────────────────────────────────┼────┼──────────────┼───────────┼─────────────────┼──────────────╢
║4801028 - nekodex - circles! (Affirmation) [Insane]│    │       164.200│  9.140.592│         0,000000│        1,0000║
╚═══════════════════════════════════════════════════╧════╧══════════════╧═══════════╧═════════════════╧══════════════╝
smoogipoo commented 1 month ago

I've applied the changes to reduce the back and forth. Basically, the only path that matters is the one I mentioned, which was useful in the past to replicate the exact calculations done by the online pp processor. Of note, until https://github.com/ppy/osu-queue-score-statistics/pull/274 is merged, the processor will still only be using databased attributes.

In that case, I think the masking would still need adaption to specifically include HD when FL is disabled too, as it is still being filtered under the idea that it does not affect star rating without FL.

Be aware that until https://github.com/ppy/osu-queue-score-statistics/pull/274 is merged HD+(non-FL) combinations will also be filtered out. I.e. the following will still be in effect:

image

If that's an issue (and you expect HD on its own to have difficulty attributes in the next deploy) then we will have to fix that.

smoogipoo commented 1 month ago

I can't seem to figure out why it returns 9.1 million for the combo score.

It's expected, that's how much is attributed to the SV1 score by combo. NM SS on that beatmap is 9,304,792.

Those two commands are a little bit misleading - they're computing total score not pp.

minisbett commented 1 month ago

Be aware that until ppy/osu-queue-score-statistics#274 is merged HD+(non-FL) combinations will also be filtered out. I.e. the following will still be in effect

That seems alright for now, as the filtering now only happens for the API request, instead of also passing those filtered mods to the performance calculator. HD won't have difficulty attributes in the next deployment and it'll probably be some time until it might.