golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: math/rand or crypto/rand: add random strings generators #53447

Closed kushuh closed 2 years ago

kushuh commented 2 years ago

The issue

There are use-cases where we need or could want to generate a random string:

For now, neither math/rand nor crypto/rand provides a straightforward solution.

Proposal

Add a String function to either or both of the rand packages, taking only a length parameter, to generate a random string of any length.

package main

import (
  "math/rand"
)

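func main() {
  id := rand.String(10) // proposed API (does not exist yet), e.g. "djfrtyusao"
  _ = id                // avoids the "declared and not used" compile error
}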

The function would take only a size parameter, which would determine the length of the final string (and possibly panic if this size is negative or too large?).

I think it should also generate a URL-safe string, or even only alphanumeric or Latin-alphabet characters.
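As a rough illustration of what such a function might look like, here is a minimal sketch built on the existing math/rand API. The name randomString, the urlSafe alphabet, and the panic on negative sizes are assumptions made for the example; the proposal does not fix any of them.

```go
package main

import (
	"fmt"
	"math/rand"
)

// urlSafe is an assumed default alphabet: ASCII letters, digits, '-' and '_'.
const urlSafe = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_"

// randomString is a hypothetical stand-in for the proposed rand.String.
// It returns n characters drawn uniformly from urlSafe and panics if n < 0.
func randomString(n int) string {
	if n < 0 {
		panic("randomString: negative length")
	}
	b := make([]byte, n)
	for i := range b {
		b[i] = urlSafe[rand.Intn(len(urlSafe))]
	}
	return string(b)
}

func main() {
	fmt.Println(randomString(10))
}
```

Because it draws from math/rand, this sketch is not suitable for secrets such as tokens or passwords.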

StringAlphabet

For greater coverage, we could also include a more sophisticated function that accepts an "alphabet" (a user-supplied set of runes to pick from). Maybe it could be named StringAlphabet? (I'm not the best at naming xD)

package main

import (
  "math/rand"
)

const alphabet = "1234567890" // here it would only generate a numeric string, but whatever

func main() {
  id := rand.StringAlphabet(alphabet, 10) // 2834753819
  _ = id
}

Here the alphabet argument is a string of allowed runes to pick from. It could be used, for example, to generate UUIDs, where the alphabet would be "0123456789abcdef".

package main

import (
  "fmt"
  "math/rand"
)

const alphabet = "0123456789abcdef" // for uuids

func GenerateUUID() string {
  a := rand.StringAlphabet(alphabet, 8)
  b := rand.StringAlphabet(alphabet, 4)
  c := rand.StringAlphabet(alphabet, 4)
  d := rand.StringAlphabet(alphabet, 4)
  e := rand.StringAlphabet(alphabet, 12)

  return fmt.Sprintf("%s-%s-%s-%s-%s", a, b, c, d, e)
}
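
For readers who want to try the idea today, the following is a minimal sketch of how a StringAlphabet-style helper could be written with the existing math/rand API. The name stringAlphabet and its panic behavior are assumptions for illustration, not part of any standard package.

```go
package main

import (
	"fmt"
	"math/rand"
)

// stringAlphabet is a hypothetical stand-in for the proposed
// rand.StringAlphabet. It picks n runes from alphabet, with replacement.
func stringAlphabet(alphabet string, n int) string {
	runes := []rune(alphabet) // split into code points, not bytes
	if len(runes) == 0 || n < 0 {
		panic("stringAlphabet: empty alphabet or negative length")
	}
	out := make([]rune, n)
	for i := range out {
		out[i] = runes[rand.Intn(len(runes))]
	}
	return string(out)
}

func main() {
	fmt.Println(stringAlphabet("1234567890", 10))
	fmt.Println(stringAlphabet("0123456789abcdef", 8))
}
```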

Conclusion

I already use working versions of these functions in my own packages, so the concept seems to work. However, I don't know anything about pseudo-random generators, and my solution may be far from optimized.

For the String function at least, I found this Stack Overflow post that provides a very pleasant solution.

kushuh commented 2 years ago

I even thought of another nice addition while writing this proposal, but it might be extra complicated for nothing (which is why I kept it out of the main message).

The idea is to not be limited to single runes, but to allow picking whole words (this could be used for name generators or user-friendly URLs?).

package main

import (
  "math/rand"
)

func main() {
  // dic is assumed to be a word collection defined elsewhere.
  id := rand.StringDictionary(dic, 3) // fish-joystick-robot
  _ = id
}

I believe using dictionaries may only prove interesting if you have a very large collection, so just using a Go variable may be harmless in this case. Maybe make Dictionary an interface that returns a numerically indexed element?

type Dictionary interface {
  Size() int64
  Pick(pos int64) string
}

That way, knowing the total size of the dictionary, the rand function could pick a random position and ask for the element located there. Dictionary could wrap anything, such as a database where all the elements are stored. I also don't think this particular addition needs a crypto/rand implementation; math/rand may be sufficient.
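
To make the interface idea concrete, here is a small sketch with an in-memory implementation. The sliceDictionary type, the stringDictionary function, and the "-" separator are assumptions for illustration; a database-backed Dictionary would satisfy the same interface.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// Dictionary mirrors the interface suggested in the comment above.
type Dictionary interface {
	Size() int64
	Pick(pos int64) string
}

// sliceDictionary is a hypothetical in-memory Dictionary backed by a slice.
type sliceDictionary []string

func (d sliceDictionary) Size() int64           { return int64(len(d)) }
func (d sliceDictionary) Pick(pos int64) string { return d[pos] }

// stringDictionary is a hypothetical stand-in for rand.StringDictionary.
// It joins n randomly picked words with "-".
// rand.Int63n panics if the dictionary is empty.
func stringDictionary(d Dictionary, n int) string {
	words := make([]string, n)
	for i := range words {
		words[i] = d.Pick(rand.Int63n(d.Size()))
	}
	return strings.Join(words, "-")
}

func main() {
	dic := sliceDictionary{"fish", "joystick", "robot", "lamp"}
	fmt.Println(stringDictionary(dic, 3))
}
```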

Jorropo commented 2 years ago

Nitpicking, but your UUID implementation is not up to RFC 4122.

Example in the github.com/google/uuid lib: https://github.com/google/uuid/blob/44b5fee7c49cf3bcdf723f106b36d56ef13ccc88/version4.go#L53-L54
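
For context, a version 4 UUID under RFC 4122 is 16 random bytes with the version nibble and the variant bits overwritten. The sketch below shows that bit-setting using only the standard library; the function name newUUIDv4 is an assumption for this example, and real code would normally use an established UUID library instead.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 is a hypothetical helper that builds a random (version 4) UUID.
func newUUIDv4() (string, error) {
	var u [16]byte
	if _, err := rand.Read(u[:]); err != nil {
		return "", err
	}
	u[6] = (u[6] & 0x0f) | 0x40 // high nibble of byte 6 = version 4
	u[8] = (u[8] & 0x3f) | 0x80 // top two bits of byte 8 = RFC 4122 variant (10)
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16]), nil
}

func main() {
	id, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

This is why picking all 32 hex digits independently from "0123456789abcdef" does not produce a conforming UUID: the 13th digit must be 4 and the 17th must be one of 8, 9, a, or b.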

kushuh commented 2 years ago

Nitpicking, but your UUID implementation is not up to RFC 4122.

Yes, I gave it as a (probably bad) example of what using custom alphabets could look like.

I think the String method alone is sufficient to cover most use-cases, but having the possibility to build from a custom set of runes sounds nice.

seankhliao commented 2 years ago

StringAlphabet looks questionable: what happens if I put a multi-codepoint character in there, like an emoji requiring joiners?

As for simple strings, as pointed out above, UUIDs actually have required set bits, while in other cases it's simple enough to pass the output of rand.Read through a hex/base32/base64 encoding to get a "safe" string
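
A minimal sketch of that approach, using crypto/rand with the standard encoding packages, could look like this. The helper names randomHex and randomBase64URL are assumptions for the example.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// randomHex returns 2*n hex characters encoding n random bytes.
func randomHex(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// randomBase64URL returns the unpadded, URL-safe base64 encoding of
// n random bytes.
func randomBase64URL(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

func main() {
	h, _ := randomHex(8)
	u, _ := randomBase64URL(12)
	fmt.Println(h, u)
}
```

The output length follows from the byte count: hex yields two characters per byte, and unpadded base64 yields four characters for every three bytes.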

kushuh commented 2 years ago

@seankhliao Maybe StringAlphabet could accept a []byte or []rune argument instead of a string. This would prevent multi-codepoint characters, and strings can be easily converted:

package main

import (
  "math/rand"
)

var alphabet = []rune("1234567890")

func main() {
  id := rand.StringAlphabet(alphabet, 10) // 2834753819
  _ = id
}

kushuh commented 2 years ago

it's simple enough to pass the output of rand.Read through a hex/base32/base64 encoding to get a "safe" string

I did not know about this solution, although it seems less flexible and straightforward than having a string function where you explicitly control the size of the output.

Jorropo commented 2 years ago

I did not know about this solution, although it seems less flexible and straightforward than having a string function where you explicitly control the size of the output.

There are plenty of solutions already:

rittneje commented 2 years ago

Perhaps to be really flexible:

// Choice randomly picks items from input with replacement and copies them to output.
func Choice[T any](input, output []T) {
   ...
}

Then you can pass a []byte/[]rune to generate a random string, or a []string to generate a random "sentence" (and use the existing strings.Join on the result), or so on.

I called it Choice because of numpy's random.choice but a different name would also be fine as long as it is clear.
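
The comment leaves the function body elided. One possible way to fill it in, shown purely as an assumed sketch on top of math/rand, is:

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// Choice randomly picks items from input with replacement and copies them
// to output. This body is an assumed implementation, not the commenter's.
func Choice[T any](input, output []T) {
	if len(input) == 0 {
		if len(output) > 0 {
			panic("Choice: empty input")
		}
		return
	}
	for i := range output {
		output[i] = input[rand.Intn(len(input))]
	}
}

func main() {
	// A random string from a rune alphabet.
	letters := make([]rune, 8)
	Choice([]rune("abcdef"), letters)
	fmt.Println(string(letters))

	// A random "sentence" from a word list.
	words := make([]string, 3)
	Choice([]string{"fish", "joystick", "robot"}, words)
	fmt.Println(strings.Join(words, "-"))
}
```

The caller controls the output length through the length of the output slice, which keeps the signature free of a separate size parameter.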

rsc commented 2 years ago

It seems like this could be done in a separate package outside the standard library. Being able to say exactly what kind of random string you want is a lot of API, since different use cases will want different kinds of strings.

rsc commented 2 years ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

dottedmag commented 2 years ago

A data point: a middle-sized system (200k LOC Go) that talks to dozens of other systems. There is a function in a shared private library similar to the one proposed here. It is used 3 times, and around 20 other variants of getting a random string are spread around the codebase, as the requirements for these strings are peculiar and can't be easily expressed declaratively. We're going to remove the library function.

rsc commented 2 years ago

Based on the discussion above, this proposal seems like a likely decline. — rsc for the proposal review group

rsc commented 2 years ago

No change in consensus, so declined. — rsc for the proposal review group

adamluzsi commented 2 years ago

@kushuh, if you still need this functionality, try out my random package:

It is made to have deterministic randoms during tests, but the package itself doesn't depend on the testing package. It can also be used with a Crypto random seed.

package main

import (
    "math/rand"
    "time"

    "github.com/adamluzsi/testcase/random"
)

func main() {
    rnd := random.New(rand.NewSource(time.Now().Unix())) // or random.New(random.CryptoSeed{})

    _ = rnd.StringNC(42, random.CharsetAlpha())
}