Badel2 / slime_seed_finder

https://badel2.github.io/slime_seed_finder
GNU General Public License v3.0
49 stars 4 forks source link

Brute force the upper 16 bits of the seed (in slime chunks mode), assuming Minecraft generated the seed? #1

Closed leijurv closed 5 years ago

leijurv commented 5 years ago

Here https://www.reddit.com/r/technicalminecraft/comments/7idgzx/seed_reverse_engineering_survey_of_approaches_and/dqzczmc/ you stated The problem is that the higher bits (48...29) need even more slime chunks, so we need to combine this strategy with structure finding, biomes, or other stuff, like the color of the sheep wool (not seriously, but that's a funny bug)..

This is true, you can't figure out the upper 16 bits if the seed is a random 64-bit number. For this to happen, you'd need to generate a truly random 64-bit number and set it as the seed. In that scenario, you would need to check all 65,536 possible seeds against the world as you see it, potentially just by generating that number of chunks and comparing them, it would be reasonably fast, but it can be faster!

If the seed itself was generated normally, by Minecraft itself, the generation of the seed (including the ignored upper 16 bits) was also done by util.Random, which also ignores the upper 16 bits of its input.

It's generated by (new Random()).nextLong(). Internally, this is two successive calls to next(32), returning an int, combined into a long. That function simply returns the middle 32 bits of the seed (as in, a call to next(32) does seed=seed*multiplier +addend; seed = seed & mask; return seed >>> 16). We know the lower 48 bits of the result of two successive calls to this. As in, we know the lower 16 bits of seed0 >>> 16 and we know all 32 bits of ((seed0 * multiplier + addend) & mask) >>> 16. Therefore, we can take the lower 32 bits of our partial seed, shift it left by 16 bits, and try all possibilities for those lowest 16 bits. For each one, we can check if it's plausible by inverting the random seed step function, taking us from(seed0 * multiplier + addend) & mask (not shifted right by 16!!) back to seed0 (or the lower 48 bits of it). Then, we need to take the middle 16 bits of the 48 bit seed0. Why? Because seed0 >>> 16 has the same lower 16 bits as the uppermost known 16 bits of our partial seed! As above, we know that our 48-bit partial seed's uppermost 16 bits are the lower 16 bits masked out of our first call to next(32). And if they are truly equal, we've found a candidate for the correct upper 16 bits (there's some randomness, but we are guaranteed to find at least one if this seed was truly generated by Random.nextLong).

You can invert (seed0 * multiplier + addend) & mask (more or less) like so: ((seed - 11) * 246154705703781L) & MASK_LOWER_48_BITS source (information is lost in this multiplication but since the lower 48 bits are masked both ways, this does actually invert it properly)

Tying it all together, if (partialSeed >> 32) & MASK_LOWER_16_BITS == (((((((partialSeed & MASK_LOWER_32_BITS) << 16) | candidateHighestSixteenBits) - 11) * 246154705703781L) & MASK_LOWER_48_BITS) >> 16) & MASK_LOWER_16_BITS, then candidateHighestSixteenBits << 48 | partialSeed is a possible seed.

On average it will find just one, and it's guaranteed to find at least one if the seed was generated normally randomly, but it could find more, those will need to be checked manually.

The method is here https://github.com/pruby/slime-seed/blob/master/expand48to64random.c

leijurv commented 5 years ago

After some more analysis, I have found that the above is wrong, and that the C code from slime-seed is wrong half the time.

The issue is in Java itself:

    public long nextLong() {
        // it's okay that the bottom word remains signed.
        return ((long)(next(32)) << 32) + next(32);
    }

It's NOT okay that the bottom word remains signed. I found that when the second call to next(32); provides a negative integer, then the 48 to 64 expander in slime-seed is incorrect.

Here is the current code that only succeeds if second >= 0

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            long seed = System.nanoTime();
            long scrambled = (seed ^ 0x5DEECE66DL) & MASK_LOWER_48_BITS;
            scrambled = (scrambled * 0x5DEECE66DL + 11) & MASK_LOWER_48_BITS;
            int first = (int) (scrambled >>> 16);
            scrambled = (scrambled * 0x5DEECE66DL + 11) & MASK_LOWER_48_BITS;
            int second = (int) (scrambled >>> 16);
            long prediction = ((long) (first) << 32) + second;
            long rand = (new Random(seed).nextLong());
            System.out.println(rand + " " + prediction + " " + (rand == prediction));
            System.out.println(first + " " + second);
            if (second < 0) {
                continue;
            }
            long partial = rand & MASK_LOWER_48_BITS;
            System.out.println((rand >> 48) & MASK_LOWER_16_BITS);
            List<Long> reversed = reverse(partial);
            System.out.println(rand + " " + reversed);
            if (!reversed.contains(rand)) {
                throw new IllegalStateException("Unable to invert " + rand);
            }
        }
    }
    public static final long MASK_LOWER_48_BITS = (1L << 48) - 1;
    public static final long MASK_LOWER_32_BITS = (1L << 32) - 1;
    public static final long MASK_LOWER_16_BITS = (1L << 16) - 1;

    public static List<Long> reverse(long partial) {
        List<Long> found = new ArrayList<>();
        long low = (partial >> 32) & MASK_LOWER_16_BITS;
        long high = partial & MASK_LOWER_32_BITS;
        for (int i = 0; i < (1 << 16); i++) {
            long prev = invert((high << 16) | i);
            long lastOutput = prev >> 16;
            if (low == (lastOutput & MASK_LOWER_16_BITS)) {
                System.out.println(i);
                found.add(((long) lastOutput << 32) | (high));
            }
        }
        return found;
    }

    public static long invert(long l) {
        return ((l - 11) * 246154705703781L) & MASK_LOWER_48_BITS;
    }
leijurv commented 5 years ago

Okay, the effect of the second partial integer being negative is that half the time, the middle 16 bits from the 48 bit partial seed is one lower than it should be (for reconstruction purposes).

We simply need to add a check like so:

            if (low == masked) {
                found.add(((long) lastOutput << 32) | (high));
            }
            if (low + 1 == masked) {
                lastOutput--;
                found.add((((long) lastOutput << 32) | (high)));
            }

However, even more stress testing found that this makes it work much better than 50% of the time, but not quite 100% of the time. It actually fails on one in every 65,536 seeds since low + 1 can be 2^16 while masked was masked to zero, so the second check actually needs to be if (((low + 1) & MASK_LOWER_16_BITS) == masked) {.

After some cleanup and refactoring, the final working code is this:

    public static void main(String[] args) {
        int lengthTotal = 0;
        for (int i = 0; i <= 1000000; i++) {
            long rand = new Random().nextLong();
            long partial = rand & MASK_LOWER_48_BITS;
            List<Long> reversed = reverse(partial);
            if (!reversed.contains(rand)) {
                throw new IllegalStateException("Unable to invert " + rand);
            }
            lengthTotal += reversed.size();
            if (i % 1000 == 0) {
                System.out.println(i + " Average size: " + (lengthTotal / (float) i));
            }
        }
    }
    public static final long MASK_LOWER_48_BITS = (1L << 48) - 1;
    public static final long MASK_LOWER_32_BITS = (1L << 32) - 1;
    public static final long MASK_LOWER_16_BITS = (1L << 16) - 1;

    public static List<Long> reverse(long partial) {
        List<Long> found = new ArrayList<>();
        long upperNormal = partial >> 32 & MASK_LOWER_16_BITS;
        long upperOffset = upperNormal + 1 & MASK_LOWER_16_BITS;
        long lower = partial & MASK_LOWER_32_BITS;
        long middle = lower << 16;
        for (int i = 0; i < 1 << 16; i++) {
            long lastOutput = (((middle | i) - 11) * 246154705703781L & MASK_LOWER_48_BITS) >> 16;
            long masked = lastOutput & MASK_LOWER_16_BITS;
            if (upperNormal == masked) {
                found.add(lastOutput << 32 | lower);
            }
            if (upperOffset == masked) {
                lastOutput--;
                found.add(lastOutput << 32 | lower);
            }
        }
        return found;
    }

I've tested it fully up to a million seeds, and it is able to successfully reconstruct the upper 16 bits every time.

On average, it finds ~2.210 possible values for the upper 16 bits. (across 1,000,000 runs it found 2,210,421 seeds).

leijurv commented 5 years ago

Just had a breakthrough way to improve it. If the second int is negative, obviously this means that the 31st bit of your partial seed will be turned on, because of two's complement, and how an int is sign extended when added to a long. We can extract the 31st bit from partial, and avoid the two seeds mess above.

    public static final long MASK_LOWER_48_BITS = (1L << 48) - 1;

    public static void main(String[] args) {
        int lengthTotal = 0;
        for (int i = 0; i <= 1000000; i++) {
            long rand = new Random().nextLong();
            long partial = rand & MASK_LOWER_48_BITS;
            List<Long> reversed = reverse(partial);
            if (!reversed.contains(rand)) {
                throw new IllegalStateException("Unable to invert " + rand);
            }
            lengthTotal += reversed.size();
            if (i % 1000 == 0) {
                System.out.println(i + " Average size: " + (lengthTotal / (float) i));
            }
        }
    }
    public static final long MASK_LOWER_32_BITS = (1L << 32) - 1;
    public static final long MASK_LOWER_16_BITS = (1L << 16) - 1;

    public static List<Long> reverse(long partial) {
        List<Long> found = new ArrayList<>();
        long offset = (partial & 1L << 31) >> 31;
        long upper = (partial >> 32) + offset & MASK_LOWER_16_BITS;
        long lower = partial & MASK_LOWER_32_BITS;
        long middle = lower << 16;
        for (int i = 0; i < 1 << 16; i++) {
            long lastOutput = ((middle | i) - 11) * 246154705703781L >> 16;
            long masked = lastOutput & MASK_LOWER_16_BITS;
            if (upper == masked) {
                found.add(lastOutput - offset << 32 | lower);
            }
        }
        return found;
    }

It's also SIGNIFICANTLY faster, more than 2x. I also removed the 48-bit mask in lastOutput, it's not needed since it will be masked to 16 bits below, and when it's used in the potential 64-bit expansion, it's leftshifted by 32 so the upper bits disappear anyway.

It finds ~1.23 possible values for the upper 16 bits (across 1,000,000 runs it found 1,226,418 valid 64-bit expansions).

Badel2 commented 5 years ago

Hi, thanks for your interest!

If I understand correctly, you are looking for a way to expand a 48-bit seed into a 64-bit seed by assuming that the seed was generated by Random.nextLong. I have already implemented that optimization here:

https://github.com/Badel2/slime_seed_finder/blob/b2035f78fefb6858e542909f7d6670d6637f66d5/src/java_rng.rs#L288-290

And it is even a dedicated field in the web demo: https://badel2.github.io/slime_seed_finder/

(scroll down, enter the 48-bit seed into the box below "Once you have", and it will show the 64-bit seeds in the textarea below, requires webassembly support in your web browser).

You may also be interested in a way to reverse the state of the Rng without bruteforce, see here:

https://github.com/Badel2/slime_seed_finder/blob/b2035f78fefb6858e542909f7d6670d6637f66d5/src/java_rng.rs#L165-L167

I should probably document this somewhere, maybe in a blog post explaining how it works, but I didn't have enough time for my projects lately.

leijurv commented 5 years ago

Ah, fantastic!

Did you run into the same sign issue as I did with the 31st bit of the 48-bit partial seed? I was using slime-seed's expander as a reference, however it doesn't work properly when the 31st bit is on, that's why I've made a pull request fixing it.

I found that it's always possible to expand a 48-bit seed to 64 bits (assuming it was generated by nextLong), but that slime-seed's implementation would only find it half the time.

I'm not adept at reading Rust, but from what I can see, that looks like the same as slime-seed's method?

I do reverse the state of the Rng without bruteforce, that's (x - 11) * 246154705703781L.

Badel2 commented 5 years ago

I remember that extracting the two ints from a long was a bit tricky so I created helper methods:

pub fn i1_from_long(l: i64) -> i32 {
    l as i32
}

pub fn i0_from_long(l: i64) -> i32 {
    ((l >> 32) + ((l >> 31) & 1)) as i32
}

Can you give me an example of two seeds, one that works with slime-seed's method and one that doesn't? This way I can check if my algorithm is correct.

leijurv commented 5 years ago

Sure! The columns are full seed, truncated 48-bit seed, what slime-seed finds, and what my thing finds:

4400149443144113101 132607203138509 [4400149443144113101] [4400149443144113101]
6895687433209288129 113453751637441 [4565074626045056449] [6895687433209288129, 955720999684314561]
3640896845709787754 18021957452394 [3640896845709787754] [3640896845709787754]
-1095369317440944327 131291916928825 [2513984308919797561] [-1095369317440944327]
-166384059012830051 249127199878301 [5773582374512143517, -166384059012830051] [5773582374512143517, -166384059012830051]
3353398099420370609 186701866325681 [] [3353398099420370609, -2586568334104602959]

Turns out in exactly half of those, slime-seed doesn't find the original seed.

Badel2 commented 5 years ago

Thanks, I have added that as a test in #2 and it works (my algorithm is correct).

leijurv commented 5 years ago

Fantastic! Thanks for taking the time to add a test for this.