google / guava

Google core libraries for Java
Apache License 2.0

Random string generation API #2113

Open lvxiang opened 9 years ago

lvxiang commented 9 years ago

Hi, I’ve searched through the Guava API looking for a random string generator, but found none. I’ve also looked at this thread http://stackoverflow.com/questions/20782919/does-guava-have-a-method-to-generate-random-strings which suggests using BaseEncoding for this purpose. I’m not a big fan of that solution. For one thing, you have to know Base64 very well to avoid mistakes, such as forgetting to omit the padding. Moreover, it’s not the kind of ready-made solution most APIs offer: you have to think twice to come up with the idea. On the other hand, the solution is too specific to allow flexibility. Consider the following cases:

  • What if some chars are not supposed to appear in the string? I might want to exclude ‘I’ and ‘l’ because they look very much alike.
  • There is no way to generate a Unicode string.
  • What if I want all letters in their capital forms?

You have to write your own code to cover the cases above. There are other issues to consider:

  • There’s definitely a solution with better performance than using BaseEncoding.
  • There’s no formal proof of the randomness of the generated strings.

Personally, I often come across requirements for random strings in various formats, and I see a good reason for Guava to provide dedicated APIs for generating random strings. A fluent-style API would fit this situation perfectly. Please let me know your thoughts on this issue.

thiagokronig commented 9 years ago

See http://stackoverflow.com/questions/41107/how-to-generate-a-random-alpha-numeric-string

Construct a BigInteger with N bits randomly obtained from a Random source and encode it in Base-32.

Cheers,

Thiago Kronig
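For readers who have not seen the BigInteger suggestion above, here is a minimal sketch of it. It assumes "Base-32" means BigInteger's radix-32 output (digits 0-9 and a-v), not RFC 4648 Base32; the class and method names are made up for illustration.

```java
import java.math.BigInteger;
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical helper; not part of Guava or the JDK.
final class Base32Token {
    private static final Random RNG = new SecureRandom();

    /** Returns numBits random bits written in radix 32 (characters 0-9 and a-v). */
    static String next(int numBits) {
        // BigInteger(int, Random) draws a uniform value in [0, 2^numBits - 1].
        // toString(32) drops leading zeros, so the length is at most
        // ceil(numBits / 5) characters but can be shorter.
        return new BigInteger(numBits, RNG).toString(32);
    }

    public static void main(String[] args) {
        System.out.println(next(130)); // at most 26 characters
    }
}
```

The fixed alphabet and the variable length are exactly the kind of limitation the original post objects to: excluding look-alike characters or forcing upper case needs extra code on top.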

lvxiang commented 9 years ago

@thiagokronig checked that before, still not solving all the problems

Maaartinus commented 9 years ago

This answer uses no BigInteger and is pretty general. It also seems to be optimal (except for using StringBuilder where char[] would do).
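A rough sketch of the kind of answer being described, assuming it draws each character from a fixed alphabet string (the constant the next comment calls "AB"). This version fills a char[] instead of a StringBuilder, as suggested above; the alphabet contents and names are illustrative, not quoted from the answer.

```java
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical helper; the alphabet below is an assumption for illustration.
final class AlphabetStrings {
    static final String AB =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    private static final Random RNG = new SecureRandom();

    /** Picks each character uniformly and independently from AB. */
    static String randomString(int length) {
        char[] out = new char[length];
        for (int i = 0; i < length; i++) {
            // nextInt(bound) is uniform over [0, bound), so every char is equally likely.
            out[i] = AB.charAt(RNG.nextInt(AB.length()));
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(randomString(16));
    }
}
```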

lvxiang commented 9 years ago

@Maaartinus I don't think it's general enough, as you have to redefine "AB" constantly. StringBuilder aside, it might not be optimal in some cases; e.g., if all you want is a string of decimal digits, the following code might be faster:

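// Wrapped in a method (the name is illustrative) so the snippet compiles.
static String randomDigits(int length) {
    Random rng = new Random();
    char[] str = new char[length];
    for (int i = 0; i < length; i++)
        str[i] = (char) ('0' + rng.nextInt(10)); // one digit in '0'..'9'
    return new String(str);
}

ogregoire commented 9 years ago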

I agree with @lvxiang in the sense that usually people have a set of constraints and want to generate strings according to those sets of constraints, without having to redefine them everywhere.

You can have something like a step-based mechanism in which you first define/compile constraints and then quickly generate strings based on those constraints.
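To make the two-step idea concrete, here is a minimal sketch: a builder collects the constraints, build() compiles them into a char array once, and generate() only indexes into that array. Every name in it (StringRule, Builder, range, without, length) is hypothetical and does not come from Guava, Passay, or any library mentioned in this thread. It handles only chars in the Basic Multilingual Plane.

```java
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical API sketch; none of these names exist in Guava.
final class StringRule {
    private final char[] alphabet;
    private final int length;

    private StringRule(char[] alphabet, int length) {
        this.alphabet = alphabet;
        this.length = length;
    }

    static Builder builder() {
        return new Builder();
    }

    /** Fast path: one nextInt call and one array read per character. */
    String generate(Random rng) {
        char[] out = new char[length];
        for (int i = 0; i < length; i++) {
            out[i] = alphabet[rng.nextInt(alphabet.length)];
        }
        return new String(out);
    }

    static final class Builder {
        private final StringBuilder allowed = new StringBuilder();
        private final StringBuilder excluded = new StringBuilder();
        private int length = 8;

        /** Allows every char from first to last, inclusive. */
        Builder range(char first, char last) {
            for (int c = first; c <= last; c++) {
                allowed.append((char) c);
            }
            return this;
        }

        /** Removes the given chars from the alphabet, e.g. look-alikes. */
        Builder without(String chars) {
            excluded.append(chars);
            return this;
        }

        Builder length(int length) {
            this.length = length;
            return this;
        }

        /** Compiles the constraints into a deduplicated char array. */
        StringRule build() {
            StringBuilder kept = new StringBuilder();
            for (int i = 0; i < allowed.length(); i++) {
                String c = String.valueOf(allowed.charAt(i));
                if (excluded.indexOf(c) < 0 && kept.indexOf(c) < 0) {
                    kept.append(c);
                }
            }
            if (kept.length() == 0 || length < 0) {
                throw new IllegalStateException("empty alphabet or negative length");
            }
            return new StringRule(kept.toString().toCharArray(), length);
        }
    }

    public static void main(String[] args) {
        StringRule rule = StringRule.builder()
            .range('A', 'Z')
            .range('2', '9')
            .without("IO") // drop letters that resemble 1 and 0
            .length(10)
            .build();
        System.out.println(rule.generate(new SecureRandom()));
    }
}
```

The rule object is immutable once built, so it can be defined in one place and reused wherever strings of that shape are needed.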

Answers saying that you can "simply" store your string of acceptable characters disregard the need for modularity, reusability, and ease of use. The "simply" is only the core of the issue, and a lot can be done around it to make it better for the developer. It's like reinventing the wheel all over again: we want to avoid that in development ;)

Therefore, all I can recommend today is to check the library Passay or, without trying to put the spotlight on it, my modest take on the problem.

With these libs, you can declare which generator you want and then generate as many strings as you want very easily.

lvxiang commented 9 years ago

@ogregoire I like your Rule and Ruler idea; it's almost the same as what I have in mind. You might consider adopting a fluent style by introducing something like a RuleBuilder.

ogregoire commented 9 years ago

@lvxiang Well, I'm glad you like it. If you have suggestions, please file an issue there and let's continue this conversation there as well.

This issue here should stay focused on the integration of your idea into Guava.

lvxiang commented 9 years ago

@ogregoire agreed

mrniko commented 9 years ago

+1

kpavlov commented 8 years ago

+1. I want to get rid of Apache's commons-lang3 but I need org.apache.commons.lang3.RandomStringUtils

bobbui commented 7 years ago

+1

gk5885 commented 7 years ago

FWIW, this all seems easy enough to do with streams. I threw together this little snippet to print 10 strings of 10 random ASCII characters (though you could choose whatever code points you want):

Random random = new Random();
Stream<String> randomStrings =
    Stream.generate(
        () ->
            random
                .ints('a', 'z')
                .limit(10)
                .collect(
                    StringBuilder::new,
                    (builder, codePoint) -> builder.appendCodePoint(codePoint),
                    StringBuilder::append)
                .toString());
randomStrings.limit(10).forEach(System.out::println);

Filtering, transforming to upper case, etc. are all easy enough to implement as further stream operations. Given that it's straightforward enough to get a random string from the APIs in the JDK, I'm having a hard time imagining that this is such a common problem as to warrant its own, specific API in Guava -- a specific API would be more readable, but probably too niche.
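As an illustration of the "further stream operations" mentioned above, here is one possible sketch that upper-cases the characters and drops look-alikes. The method name, the chosen code point range, and the list of excluded characters are assumptions made for this example.

```java
import java.util.Random;

// Hypothetical helper showing filter/map steps on a stream of code points.
final class StreamFiltering {
    static String readableToken(Random random, int length) {
        return random
            .ints('0', 'z' + 1)                        // the second argument is exclusive
            .filter(c -> Character.isLetterOrDigit(c)) // the range also holds punctuation
            .map(c -> Character.toUpperCase(c))        // upper-case before filtering letters
            .filter(c -> "IO01".indexOf(c) < 0)        // drop look-alike characters
            .limit(length)                             // stop after enough survivors
            .collect(
                StringBuilder::new,
                StringBuilder::appendCodePoint,
                StringBuilder::append)
            .toString();
    }

    public static void main(String[] args) {
        System.out.println(readableToken(new Random(), 12));
    }
}
```

Note that the result is not uniform over the output alphabet: each upper-case letter can come from two source code points while each digit comes from one, so letters are twice as likely. Subtleties like this are easy to miss when composing the pipeline by hand.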

ogregoire commented 7 years ago

I just searched "random string java" in Google and got 381k hits. Doesn't seem very niche to me.

gk5885 commented 7 years ago

We generally judge need based on evidence within Google's (very large) code base, not Google searches. By that metric, there is a much stronger need for an API to generate random cats given that "random cat java" produces 940k results. :)

ogregoire commented 7 years ago

Okay, you got me there. I'll try to find a good API to generate random cats then ;-)

More seriously, implementations have been submitted here, there is a request, and there is a real need from several of your users. All that's left for the Guava team is to review and accept one of the PRs. Leave it in beta for a few releases and see if it's used. If not, drop it.

ogregoire commented 7 years ago

Oh, by the way, @gk5885, have you seen any 'z' printed by your snippet?

There we go! The problem is more complex than one might expect, and errors happen so easily. Since you work on the Guava team, you're probably a good programmer, and it was rather easy for you to write that snippet. But check everywhere else: the problem is not a trivial one. At least not as trivial as you think. Working with randomness is hard because it's not easily testable, and bugs come quickly (yup, even for a talented programmer like you, as we just saw ;).

I'm kind of sad that the only metric for Guava is "we see it often in our codebase". Another good one would be "it seems simple, but it's an order of magnitude harder than that".
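The gotcha behind the 'z' question is that the upper bound of Random.ints(origin, bound) is exclusive, so ints('a', 'z') can never produce 'z'. A small sketch that makes this visible (the class and method names are made up for illustration):

```java
import java.util.Random;

// Hypothetical demo of the exclusive upper bound in Random.ints.
final class ExclusiveBoundDemo {
    /** Counts how often target appears among n draws from [origin, bound). */
    static long count(Random random, int n, int origin, int bound, int target) {
        return random.ints(n, origin, bound).filter(c -> c == target).count();
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed so runs are repeatable
        // Always 0: 'z' is the exclusive bound and is never generated.
        System.out.println(count(random, 100_000, 'a', 'z', 'z'));
        // With 'z' + 1 as the bound, 'z' is part of the range.
        System.out.println(count(random, 100_000, 'a', 'z' + 1, 'z'));
    }
}
```

Counting occurrences over many draws is also one simple way to test this kind of code: a character that should be reachable but never shows up in 100,000 draws points to an off-by-one in the bounds.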

jrtom commented 7 years ago

It's not the only metric, but it's an important one. Utility is the other primary consideration, and obviously the difficulty of implementing a correct solution is an aspect of that.

Maaartinus commented 7 years ago

@ogregoire I'd say that what you wrote is a generator for passwords rather than for arbitrary strings. At least that's how it would probably be used. But then plain Random must not be the default, as it's insecure.

I guess it could get nearly as popular as random java cats, except that there are many options you didn't cover. (*)

(*) And nobody can cover, as there are just too many strange wishes.
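One way to reconcile a secure default with testability is to offer two entry points: one that uses SecureRandom, and one that accepts any Random so tests can pass a seeded instance. The sketch below is an assumption about how such an API could look, not the design of any existing library; the class name and alphabet are hypothetical.

```java
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical helper: secure by default, injectable source for tests.
final class Tokens {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";
    private static final SecureRandom SECURE = new SecureRandom();

    /** Default entry point: draws from SecureRandom. */
    static String next(int length) {
        return next(length, SECURE);
    }

    /** Takes the source explicitly, e.g. a seeded Random in a unit test. */
    static String next(int length, Random rng) {
        char[] out = new char[length];
        for (int i = 0; i < length; i++) {
            out[i] = ALPHABET.charAt(rng.nextInt(ALPHABET.length()));
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(next(20));                 // unpredictable
        System.out.println(next(20, new Random(7)));  // repeatable for a given seed
    }
}
```

A seeded java.util.Random is predictable by design, so the two-argument form is meant for tests and non-sensitive identifiers only.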

ogregoire commented 7 years ago

@Maaartinus Yes, I know! I have a password generator lying around somewhere that can easily take those cases into account, but that's not the point. I tried to cover most of the use cases seen on Stack Overflow. And while the options you mention aren't unknown, they are much rarer (though next(Random) was always my personal favorite, it's a burden on the user).

But then, this is free software (as in speech), and discussion and improvements are welcome, I guess! I never said I hold the one truth, and if I overlooked some things, like SecureRandom as the default, I humbly ask to have them pointed out so I can suggest a better alternative.

@jrtom It's basically the only metric we have heard of in the past few years. Yes, there are others, but that aspect is rather opaque from a point of view external to Google.

Steve973 commented 1 year ago

Yes, this is old, but here's some better code for this:

    static final Function<Integer, String> randomNumeric = (lim) ->
            new Random()
                    .ints(lim, '0', '9' + 1)
                    .collect(
                            StringBuilder::new,
                            StringBuilder::appendCodePoint,
                            StringBuilder::append)
                    .toString();

    static final Function<Integer, String> randomAlphabetic = (lim) ->
            new Random()
                    .ints(lim, 'a', 'z' + 1)
                    .collect(
                            StringBuilder::new,
                            StringBuilder::appendCodePoint,
                            StringBuilder::append)
                    .toString();