policeman-tools / forbidden-apis

Policeman's Forbidden API Checker
Apache License 2.0
321 stars 34 forks source link

Consider a bundleded signature set that would suggest replacing HashMap, HashSet, Collectors.toSet, Collectors.toMap with Linked alternatives for deterministic oder #199

Open vlsi opened 2 years ago

vlsi commented 2 years ago

HashSet, HashMap, Collectors.toSet(), and similar APIs do not keep the element order, and it might result in surprising and unexpected results.

What if error-prone could catch those types of errors and suggest replacing the calls via LinkedHash*?

See https://github.com/jlink/jqwik/pull/348 See https://github.com/google/error-prone/issues/3230

uschindler commented 2 years ago

Technically this is easy to do. With forbiddenapis 3.3 it is also now easier to forbid all overloads of a method. You can also forbid the whole class, but this may lead to problems with API signatures expecting those classes.

I am not sure if this should be shipped with forbiddenapis. Maintaining the existing signatures is painful already. If one would publish the signatures on Maven Central as artifacts, one can easily use them in Maven, there's direct support to link to them. In Gradle it also works but not as intuitive.

A signature file could look like:

@defaultMessage Consider using LinkedHashSet
java.util.HashSet#<init>(**)
java.util.stream.Collectors#toSet()

I'd suggest to put something like this into your own build.

vlsi commented 2 years ago

1) Unfortunately, it is not really easy to put JDK-related signatures to custom builds since forbidden-apis forbids "unknown signatures".

For instance, Java 10+ has Collectors#toUnmodifiableSet, so the signatures easily become version-dependent. Then, there's a non-trivial step to figure out the needed Java.

2) Even though the signatures are small, I believe they are generally useful. For instance, in jqwik case, I did found a couple of bugs caused by hidden reliance on HashSet. In other words, the test outcomes looked nice and shiny, however, the outcomes were just wrong, and HashSet masked the results.

3) Kotlin instantiates LinkedHashSet for mutableSetOf<K, V>() which, I believe, is nice to avoid accidental bugs. Of course, it might be a bit slower, however, I believe it makes sense to keep the order by default.

4) Stream#distinct() keeps order by default, and its javadoc explicitly mentions that users should opt-in with .unordered() if they do not care on the order of the elements.


Here's how I put the signatures for now, however, I would say, it took me significantly more time than I wanted, mainly because the signature file does not support things like "if (java10+)".

https://github.com/jlink/jqwik/pull/348/commits/0c3f049e3e8eea00d8ae18ef7259a32627149b53 (it is the part of the PR348 which I mention in the first comment)