llllllllll / slider

Utilities for working with osu! files and data
https://llllllllll.github.io/slider/index.html
GNU Lesser General Public License v3.0

update star/PP calculations for Feb 2019 update #39

Open llllllllll opened 5 years ago

llllllllll commented 5 years ago

The PP and star calculations were updated in February 2019. Slider needs to be updated to account for these changes.

source: https://osu.ppy.sh/home/news/2019-02-05-new-changes-to-star-rating-performance-points

blimmo commented 5 years ago

Would it perhaps be a good idea to use pyttanko here instead?

It seems to get updated after changes pretty quickly so would stop us having to write them into slider.

I'll happily write a PR for it if you think that's a good idea.

llllllllll commented 5 years ago

I hadn't seen that library. After giving it a quick glance, I would prefer to keep our implementation, but maybe add tests that compare its results against pyttanko's. That code doesn't really appear to be designed for use as a library: there is no way to specify a beatmap as an in-memory object; you have to pass the beatmap as a file-like object. Slider doesn't always have access to a beatmap file as text, for example when we download a new beatmap without save=True, so we couldn't call into pyttanko there.

The other main issue is that it is just pure Python iterative code without any use of numpy. slider was designed to support https://github.com/llllllllll/lain and other medium- to large-scale processing tasks. While our parser is currently very simple and slow (and should be replaced), the PP calculations are pretty fast. Importantly, it is also fast and convenient to compute the performance points for many different results at the same time. For example, https://github.com/llllllllll/lain/blob/master/lain/error_model.py#L699-L751 generates 1000 candidate plays by drawing from a probability distribution of results on each hit object. Each length-1000 vector of 300, 100, and 50 counts is then passed to the performance_points method, which returns a length-1000 vector of the corresponding PP for each play. Importantly, this isn't just looping over the PP function 1000 times with different parameters; it re-uses any common work (like the star calculation) or sub-computation that is constant with respect to the actual counts.
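The vectorized pattern described here can be sketched with numpy. To be clear, the formula below is a made-up toy, not osu!'s actual PP algorithm or slider's implementation; it only illustrates how one scalar of constant work (the star rating) is computed once and broadcast across a whole vector of candidate plays.

```python
import numpy as np

def performance_points(count_300, count_100, count_50, stars=5.0):
    """Toy vectorized PP: the weights and scaling are invented.

    ``stars`` stands in for work that is constant with respect to the
    hit counts: it is computed once and broadcast across all plays.
    """
    accuracy = (count_300 + count_100 / 3 + count_50 / 6) / (
        count_300 + count_100 + count_50
    )
    return stars ** 2 * 20 * accuracy  # arbitrary toy scaling

rng = np.random.default_rng(0)
# 1000 candidate plays: each row holds (300s, 100s, 50s) counts.
counts = rng.integers(1, 500, size=(1000, 3))

# One call produces one PP value per play, with no Python-level loop.
pp = performance_points(counts[:, 0], counts[:, 1], counts[:, 2])
assert pp.shape == (1000,)
```

The point is the shape of the computation, not the numbers: anything that doesn't depend on the counts stays scalar, and numpy broadcasting fans it out over all 1000 plays at once.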

blimmo commented 5 years ago

Good point wrt the numpy usage, but I think you could pass a list of line strings (so just the contents of the file .split('\n')ed) instead of a file object, since it doesn't call open.
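If the parser only iterates over its input and never calls open, a beatmap already held in memory can be adapted without touching disk. This is a generic stdlib sketch, not confirmed pyttanko API; `osu_text` is placeholder content:

```python
import io

# Hypothetical in-memory beatmap text (placeholder, not a real map).
osu_text = "osu file format v14\n\n[General]\nAudioFilename: song.mp3\n"

# The list of line strings discussed above...
lines = osu_text.split('\n')

# ...or, for a consumer that iterates over a file handle, StringIO
# presents the same string as a file-like object.
file_like = io.StringIO(osu_text)
first_line = next(file_like).rstrip('\n')
assert first_line == lines[0]
```

Either form would let slider feed a freshly downloaded beatmap (no save=True) to an external calculator without writing a temporary file.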

Also, looking at its tests, it seems you can reuse beatmap-level calculations like star rating by reusing the beatmap and stars objects. So if we cached the pyttanko beatmap and stars objects on our own beatmap object, we'd only have to calculate them once.
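The caching idea could look roughly like this. The `_pyttanko_beatmap` and `_pyttanko_stars` names are hypothetical and the parse/difficulty calls are stubbed out, since the exact pyttanko API isn't pinned down in this thread; the sketch only shows the compute-once-per-instance pattern (`functools.cached_property`, Python 3.8+):

```python
from functools import cached_property

class Beatmap:
    """Sketch: cache derived objects on slider's beatmap object."""

    def __init__(self, raw_text):
        self.raw_text = raw_text

    @cached_property
    def _pyttanko_beatmap(self):
        # Stand-in for parsing with pyttanko; runs once, result cached.
        return {'parsed': self.raw_text}

    @cached_property
    def _pyttanko_stars(self):
        # Star rating depends only on the beatmap, so it is cached too.
        return len(self._pyttanko_beatmap['parsed']) * 0.1

bm = Beatmap('osu file format v14')
# cached_property stores the result on the instance: both accesses
# return the very same object, so the parse work happens only once.
assert bm._pyttanko_beatmap is bm._pyttanko_beatmap
```

PP for different hit counts on the same map would then only pay for the count-dependent part of the computation, matching the re-use behaviour described earlier in the thread.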

The library is written kinda weirdly though (because the author mostly programs in C, I think), so it's probably fair enough to keep the current PP implementation in slider, especially if we can optimise it with numpy.

jxu commented 5 years ago

It should be documented which version of the PP algorithm is implemented, since I can't keep track of all the updates. I support the idea of delegating PP calculation to a more frequently updated library, but if it doesn't integrate well we should keep this one.

jxu commented 4 years ago

Is it simply https://github.com/ppy/osu-performance/blob/master/src/performance/osu/OsuScore.cpp?