tinygo-org / tinygo

Go compiler for small places. Microcontrollers, WebAssembly (WASM/WASI), and command-line tools. Based on LLVM.
https://tinygo.org
Other
15.32k stars 905 forks source link

tinygo does not quite use a random offset for map iteration? #3726

Open dkegel-fastly opened 1 year ago

dkegel-fastly commented 1 year ago

Go for some time has used a random offset for map iteration. I think Tinygo tries to, but since fastrand() is always initialized with a seed of 1, every time you run the program, map iteration is in the same order. This confused me quite a bit, and it might bite other programmers, too.

Presumably it would be easy to fix this by tiptoeing closer to upstream, e.g. making each runtime provide a cputicks() that can be called safely very early, and using that in runtime/algorithm.go to initialize xorshift32State and xorshift64State?

machine.GetRNG() and getentropy() sound tempting, too, if available, available early enough, and cheap.

aykevl commented 1 year ago

Yeah we could do that, if needed.

I should point out that the Go specification does not require any randomization, it just says the order is undefined. So we're perfectly fine spec-wise. But we could indeed use some random numbers on supported hardware.

Of course, the randomness doesn't need to be particularly strong. It's really just there to avoid the common mistake of relying on the order of map iteration (I know I have made that mistake several times, and randomness makes finding such bugs much easier).

dkegel-fastly commented 1 year ago

Yeah. wasi does provide getentropy(). If I get inspired, maybe I'll do a PR to explore the idea on a couple platforms.

dgryski commented 1 year ago

I think the impact of initializing the fastrand seed at program startup is pretty small. It ends up making things a bit less deterministic but overall I still think that's an improvement.

dgryski commented 5 months ago

The map code uses randomess to seed the hash function but not for the starting point in the iteration code. The random-starting-point for map iteration is tricky, but doable (if we need to).

dkegel-fastly commented 5 months ago

It might avoid some mysterious problems (and would let me get rid of some workarounds in my app...)

aykevl commented 5 months ago

(and would let me get rid of some workarounds in my app...)

What kind of workarounds? (I don't see how a spec-compliant program would fail with non-random map indices? Because all the spec says is that the iteration order is undefined, not that it is random).

dgryski commented 5 months ago

For the record, the way I would implement this (and the way the Big Go runtime handles it) is to randomly select a bucket and a starting index and store those in the interator structure, then proceed as normal, wrapping around when we reach the end of the list of bucket, and then stopping once we hit the same bucket and index.

dkegel-fastly commented 5 months ago

Example problem that I have to work around: given a list of five things, of which I want to use one at random, I iterate through the list and pick the first one that is legal at the moment.
This works great in go, not so much in tinygo.

dgryski commented 5 months ago

@dkegel-fastly As @aykevl points out, the iteration order is undefined, not random. When I've needed to do that sort of thing in the past, I've maintained a second slice of keys and chosen randomly from that slice. (I had my own type that made sure the slice and map keys were in sync.)

dkegel-fastly commented 5 months ago

There's the standard, and then there's the reference implementation. Most people test against the reference implementation, not the standard. And I suspect 'go vet' etc. aren't going to be able to notice this particular issue.

So, I'm happy to Do Things Right in my code, but I did want to share that example of a program that worked fine with go but had a mysterious problem with tinygo.

inliquid commented 5 months ago

The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next.

Randomization needed to make sure no one relies on specified order. It's to forcefully break any non-compliant to spec program, so makes perfect sense. I can imagine someone start relying on specific order when iterating over map.

aykevl commented 5 months ago

There's the standard, and then there's the reference implementation. Most people test against the reference implementation, not the standard. And I suspect 'go vet' etc. aren't going to be able to notice this particular issue.

True... but we can't promise to have that level of compatibility. There are many ways tinygo is subtly different, this is just one such case (but sadly is one that's hard to detect with tools and doesn't clearly flag itself as unsafe for example).

We could fix this, but I'm not sure we should.

dkegel-fastly commented 5 months ago

There is a risk that people will write code that depends on the order used by Tinygo, and then get upset when they try also building on Go (or vice versa).

It's a tossup: do we want to say "you should have read the spec!", or do we want people's code to work the same in both environments?

(I wish there were a flag in "go test" and "tinygo test" to run tests twice, once with that randomization on, and once with it off. That would catch the problem.)

dkegel-fastly commented 1 month ago

We just got bitten by this again, sigh.

dgryski commented 1 month ago

I'll see if I can look at this "soon".