openrewrite / rewrite

Automated mass refactoring of source code.
https://docs.openrewrite.org
Apache License 2.0
2.1k stars 314 forks source link

Replace `String.replaceAll` with a compiled `Regular expression` #1659

Open yeikel opened 2 years ago

yeikel commented 2 years ago

String replaceAll uses a compiled regular expression. In "hot" methods this often leads to performance problems due to the often need to GC.

In an internal library, replacing this pattern resulted in a considerable speed-up and I expect the same for many others

This recipe could do "harm" if the method is not "hot" as the compiled expression will take unnecessary memory space. We should warn users about this in the recipe description and let them review the diff manually


public String replaceAll(String regex, String replacement) {
        return Pattern.compile(regex).matcher(this).replaceAll(replacement);
    }

class A {
   public void replace(){
           String init = "Bob is a Bird... Bob is a Plane... Bob is Superman!";
            String changed = init.replaceAll("/[@]/g,", "_");
}
}

class A {

    private static final java.util.regex.Pattern p = Pattern.compile("/[@]/g,"); // compile it only once and reuse it

   public void replace(){
           String init = "Bob is a Bird... Bob is a Plane... Bob is Superman!";
           String changed = p.matcher(init).replaceAll("_");

}
}

Same applies to replaceFirst

sebastiankonieczek commented 1 year ago

Could you please assign it to me? I would attempt it in coming weeks...

timtebeek commented 1 year ago

Could you please assign it to me? I would attempt it in coming weeks...

I'd missed your comment earlier @sebastiankonieczek ; thank you for your offer to pick this up!

I do have some concerns about the do-no-harm aspect of this recipe; it could be challenging to determine what the "hot" code paths are, and applying this recipe everywhere might not improve memory use or code readability. Perhaps we should limit the recipe applicability in some way initially, for instance only where used in streams or loops? Open to other suggestions.