UltraStar-Deluxe / USDX

The free and open source karaoke singing game UltraStar Deluxe, inspired by Sony SingStarâ„¢
https://usdx.eu
GNU General Public License v2.0
856 stars 161 forks source link

[Feature Request] randomness not random enough #186

Closed basisbit closed 3 months ago

basisbit commented 7 years ago

There were several reports about the randomness of the usdx random song selection supposedly not being close to an even frequency distribution. People also reported that certain songs repeated again and again. In this issue I want to check if these reports are just random reoccurring of numbers (a random selection between 1 and 5 can also produce 5 times a 2 in a row, that is still random - but quite unprobable).

You can help by opening your song selection screen, pressing r and then writing down what song numbers were "randomly" selected. Repeat that a couple of times and post your collected numbers here as a reply. Thanks!

basisbit commented 7 years ago

some numbers that I just got from usdx 1.3.5 with 730 songs loaded: 608, 469, 178, 135, 338, 444, 592, 220, 111, 703, 514, 142, 157, 138, 336, 273, 282, 537, 99, 276, 677, 381, 72, 337, 74, 431, 486, 329, 476, 115, 296, 619, 516, 594, 527, 458, 399, 670, 183, 49, 479, 673, 394, 204, 90, 628, 163, 307, 618 -> 49 numbers. No duplicates.

RattleSN4K3 commented 7 years ago

I can confirm the not-much-random-random algorithm but only in 1.1.0. Haven't played much lately using the random feature. The problem can be noticed when using the random category feature (less items).

The question I have when thinking of random, what kind of mode is preferred. 1 of ALL consecutively (may result into dupliates), or 1 of (ALL-{PREVIOUS}) which may be unwanted when good random songs are not randomly appearing. For instance, Winamp has a slider for the intensity of the randonmess having you choose how you prefer it. A stricter search may be more time consuming with a database of thousands of songs.

shaula commented 7 years ago

Randomness in song selection was not an issue for me within UltraStar Deluxe 1.3, but always has been in team duells. The algorithm often tends to pass the mic between 2 persons of a 3-player-team, often limiting the third member to only one turn. I'd suggest using some solution where the number of turns affects the randomness. This should ensure that all members of a team got a similar number of turns. Maybe its also fine to make the player selection deterministic (and dropping the randomness there).

s09bQ5 commented 6 years ago

Btw., Randomize on Linux uses a low resolution clock to set randseed. So since b0156df49ef9d25d6aca23df5517913641c99b92 'R' will always take you to the same song if you press the key more than once in a second.

basisbit commented 6 years ago

So since b0156df 'R' will always take you to the same song if you press the key more than once in a second.

Definitely not a bug in usdx

pb-programmer commented 5 years ago

Sorry for raising this thread again, but I experienced the poor random behaviour myself. After digging into the code I actually think b0156df49ef9d25d6aca23df5517913641c99b92 made it worse and should be reverted. Repeatedly seeding the PRNG lowers the actual randomnes of returned values. From the FPC wiki:

FPC's random number generator has to be initialized with a single call of the randomize function

This is already done in the TScreenSong.Create method.

Background: FPC uses a mersenne twister as PRNG and while MTs are state of the art and feature a very long period, properly seeding them is a problem. They require 19937 bits of randomness to be properly seeded. Most implementations use some sort of unix timestamp mangling to fill the state bits which obviously delivers far less randomness. Properly used that's okay since MTs "recover" from poor seeding over time (every random number generated increases the randomness of the next one) but it becomes a problem if the MT state bits are constantly reseeded with poor quality randomness.

plodocus commented 5 years ago

@pb-programmer Thanks for looking into this! This bad randomness has been bugging me, too. You could make a pull request so that the maintainers of this repo are alerted.

pb-programmer commented 5 years ago

@DanBenHa I'm still not confident enough I have a proper fix to make a pull request. I'm currently looking into this problem and tinker around with the code a little bit. But be aware: I have no experience with object pascal whatsoever and started digging into usdx just a couple of weeks ago, so I have a steep learning curve ahead of me. Don't know if I can fix this anytime soon, it's more of a trial and error process ;)

I'm pretty sure the underlying cause of the problem is a lot deeper than the one line fix I mentioned. I played USDX at a party this weekend with the fix applied and the "real life randomness" still seemed buggy . On the other hand repeatedly hitting 'R' to test things seemed perfectly fine, so it's probably some other part of the code interfering with the PRNG.

s09bQ5 commented 5 years ago

I also believe that periodic reseeding produces worse random numbers.

The German Wikipedia links to a paper that has evaluated (in section 7) how long some random number generators are biased if their state is initialized with a single set bit. But even if we set system.randseed to 1 it will not be that bad, because Free Pascal uses the initialization function proposed by the authors of the algorithm, which fills the 19968 state bits from 32 bits of input using a (modified) LCG PRNG.

To get rid of the bias we should maybe follow Wikipedia's advice and discard the first ~700000 random numbers instead of repeatedly calling Randomize.

plodocus commented 5 years ago

Yes I guess seeding only once per game would be more elegant than the solution in my pull request #485 that uses /dev/random in UNIX systems. So we need Randomize during start up of the game and get rid of all other calls to it? I'll look into that :)

pb-programmer commented 5 years ago

Yeah, I tried that as well. Removed all Randomize; in all files and added it once at startup (UMain.pas) and it seems fine.

But after contemplating all of this a bit: I think the real problem is we actually don't want random song selection at all. It's similar to the birthday paradox where repeatedly polling random numbers out of a limited set will yield duplicates way earlier than the human mind thinks is appropriate.

In my opinion the desired behaviour is:

When I'm playing with other people they always want to see what is in the library first. They random until they find a song they like. If they don't know a song they want something completely different and if they random an artist they like, they'll search around a bit what other titles there are.

So in my opinion the perfect random function would be more of a quasirandom-sequence and songs that have been "randomed away" or that have already been sung should not appear a second time. That would require a lot of tinkering in the background and I'm not sure if this is the desired behaviour for others as well and worth implementing. But with a "hidden playlist" that is randomized with a quasirandom sequence during startup it should be relatively easy to implement. What are your thoughts on this? A path worth exploring or a wild-goose chase?

plodocus commented 5 years ago

@pb-programmer Your analysis and suggestion are spot-on! I agree that even if we have true randomness we might not really like it.

Would this be an algorithm in line with what you propose?

  1. At game start create a "hidden playlist" using all songs randomly shuffled.
  2. Mask songs played in current session from that playlist.
  3. On 'R' press jump to first unmasked song of the list.
  4. If this is followed by another 'R' press mask the not-chosen song (or entire artist) from "3.". Maybe mask other songs in the view from "3." (if the view was a list).
  5. On complete masking of the list, unmask songs and reshuffle it.

A problem with this algorithm would be that it might not be desirable to completely mask a not-chosen song or the entire artist. Instead of masking we could also set low probabilities that "recover" over the course of the session.

I'd love to implement this but I don't really know Pascal. I can give it a shot, though.

irgendwr commented 5 months ago

I agree with @pb-programmer, and I think that something similar to the algorithm described by the Spotify engineers might be better than "real" randomness: https://engineering.atspotify.com/2014/02/how-to-shuffle-songs/