The following guidelines target best/agreed practices for working with source code.
[[TOC]]
General
The following guidelines apply to all code.
These Guildelines Take Precedence over Counterexamples in the Code
Counterexamples to these guidelines that exist in the codebase are not justification for ignoring the guidelines. The guidelines were created after most of the code in the application was written. In addition, it's also not practical to expect every code review to catch every violation.
Follow the tenets of the 3rd edition of the book Effective Java... except in the following cases:
Item 19: Design and document for inheritance or else prohibit it -- while this is important advice for library designers, it's less important for application developers. You can definitely try following this guidance, but our ability to test code comes first (e.g. final classes may prevent the creation of mocks).
Item 48: Use caution when making streams parallel -- we have a stricter rule listed elsewhere in this guidance: (Almost) Never use Parallel Streams.
Item 56: Write doc comments for all exposed API elements -- commenting is important... when it adds values. The equation is a little different for library writers vs application writers. So, yes, write comments to make things clearer, but also don't write comments for methods where the behavior is clear or, worse yet, write comments that simply repeat the name of the method. This idea is reinforced elsewhere in this document.
Item 69: Use exceptions only for exceptional conditions; Item 70: Use checked exceptions for recoverable conditions and runtime exceptions for programming errors; and Item 71: Avoid unnecessary use of checked exceptions -- we have our own policy on this, covered in another section of this guide. It takes a stricter view on when checked exceptions are allowed.
Avoid Double Negatives in Conditional Expressions
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
if(object != null) { // This is a (single) negative
...
} else { // The else makes it a double-negative
...
}
// INSTEAD, DO THIS
if(object == null) { // This is a positive
...
} else { // The else makes it a (single) negative
...
}
Explanation
The key in the above example is the presence of the else branch, which automatically negates the if condition. If the if condition is stated in a negative way, then the else branch results in a double negative.
Double negatives are confusing to understand. E.g. "I don't not live in the United States" or "I haven't not gone to work early", both of which can be made much simpler by expressing them with the equal positive statements "I live in the United States" and "I have gone to work early (or on time)".
Use CamelCase for Acronyms and Initialisms
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
HTTPRequest hTTPRequest = ...
Map<Id, HTTPRequest> idToHTTPRequest = ...
HTTPXMLEntity hTTPXMLEntity = ...
HTTPXMLEntity httpXMLEntity = ...
// INSTEAD, DO THIS
HttpRequest httpRequest = ...
Map<Id, HttpRequest> idToHttpRequest = ...
HttpXmlEntity httpXmlEntity = ...
Explanation
Using camel case consistently helps readers figure out where one concept ends and another begins. Using all capital letter may feel okay for single and well-known acronyms/initialisms, but problems arise with multiple, less-well-known strung back-to-back.
Notes:
If you don't control the type (e.g. it comes from a 3rd party library) you may be stuck with HTTPRequest, HTTPXMLEntity, etc. as class names. However, you should still use camel case for the variable names (httpRequest, httpXmlEntity, etc.)
Use the xToYs Style When Naming Map Variables
Given a Map of type X to Y, name any such Map variable as xToYs.
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
Map<Bicycle, Part> parts = ...
Map<Bicycle, Part> bicycles = ...
Map<Bicycle, Part> bicycleParts = ...
// INSTEAD, DO THIS
Map<Bicycle, Part> bicycleToParts = ...
Explanation
The naming pattern for other collection class is well-understood: pluralize the item name (e.g. List<Bicycle> bicycles). Different people tend to use different strategies for Maps, though, which causes unnecessary confusion. We believe the xToYs pattern has the most advantages, which is why we want to standardize on it.
Notes:
The variable name doesn't have to restate the type exactly. The important part is the form: conectionToconcept. E.g. as with other collection classes
Abbreviations are still fair game -- e.g. if you want to use bike for conciseness, the bikeToPart would be okay.
Completely different words than the type are allowable -- e.g. if you had two maps, one for old bikes and one for new ones, oldToParts and newToParts would be okay.
Don't Write Long Methods
Longer methods are more difficult for readers to reason about. Readers have to maintain the context of the entire function in their head to understand it fully. The longer the method, the more difficult this is to do, and as the user modifies one section of the code, another section may be affected in an unknown way.
To avoid this issue, break larger methods down into smaller ones. Shoot for a maximum length of 20-30 lines but never more than 50. Also, don't be afraid to create a function of even only a few lines, or sometimes even one line, if you think it makes the method more readable.
Here are some good heuristics for when to move code to a new method:
Code blocks that have been vertically separated from other parts of the code
Code blocks that have comments explaining what they do. If you move them to a new, well-named function, you may even be able to get rid of the comment.
Minimize Indentation in Methods (to a Max of 2 Levels)
Example
try{ // Level 0
if(...) { // Level 1
for(...) { // Level 2
while(...) { // Level 3 -- DON'T DO THIS. THIS IS AN ANTI-PATTERN.
}
}
}
}
Explanation
Each level of indentation adds complexity. All of the details of every level are exposed to the reader. The reader mentally has to maintain the context of every previous level while creating a new context when encountering the next level. Do the contexts for the outer levels mix with the current one? If so, in what way? Etc.
To reduce complexity, move code blocks to new helper functions. These helper functions both create useful abstractions, allowing the reader to neglect details, and help the reader understand how different levels interact (limited through the function signature). As an example, one thing I almost always do is implement the body of any sufficiently-complex try block in a separate method.
Notes:
Lambdas add some nuance to this discussion. On the one hand, a new lambda creates a new level of indentation. On the other hand, if this level is succinct, the cost compared to a usual level of indentation is not so much. So, there is some give and take here.
Try to Order Method Definitions Like a Newspaper
(From the book Clean Code) order method definitions by level of abstraction, with the caller preceding the callee.
In this example, methods are not organized by access level (i.e. public vs. private vs. ...). Instead, they're organized by abstraction level... almost like a depth-first search. method1 first calls method2 and then method4, so method2's definition comes next in the file. Method2 calls method3, so method 3's definition comes next. Finally, we go back to the second method that method1 called and define method4.
Explanation
The idea is to minimize the vertical separation between dependent methods. E.g. while reading method1 in the example above, if I want to know the details of method2, then I don't have very far to go.
This should be the general rule, anyway. There may be times where there are so many methods, that a strict following of the rule makes the code less readable. There also may be times where the cohesiveness of several functions is more important. These situations should be the outliers, though.
Use "Final" When Necessary (and Only When Necessary)
Use final on instance and static fields as much as possible.
Only use final on classes and methods if you're trying to prevent subclasses or abuse by subclasses.
Never user final on method parameters or local variables (unless java forces you to, like it used to do for anonymous classes, even if the variable was "effectively final"). This keyword clutters the code and adds little justifying value. Yes, having immutable object references is important... but if you can't tell whether a variable is reassigned within a method, that usually means the method is too long.
TODO Comments Must be Linked Somewhere Trackable
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
// TODO: initialize this in a constructor
int numWheels = 0;
// INSTEAD, DO THIS
// TODO (http://link.to/somewhere/trackable): initialize this in a constructor
int numWheels = 0;
Explanation
TODO comments that have no tracking link are often useless. The author may have had good intentions (e.g. wanting to improve the code). Unless someone is accountable, though, the comment often just sits in the code. Other developers often have no context and are left wondering why it exists, who plans on making this change, when it can be made, etc.
If you want to add a TODO comment, it must be accompanied by a trackable link. E.g. use a GitLab issue. The link must contain more backstory on why the change was not made in the first place and what needs to be done before it can be made. Management then needs to be made aware of this link if not already, so that they can track it and make sure it gets done eventually.
A good example of using a TODO comment would be trying to implement a project in phases. Assume you're assigned a GitLab issue for the project. Say you make changes for the first phase. As a part of this change, you add a TODO with a link to the GitLab issue for changes you want to make in the second phase. Once you get to the second phase, you make the changes and remove the TODO comment. In this case the TODO comment serves as a sort of "I know this code isn't complete, but I promise to come back and change it... and if I don't, here's how to hold me accountable."
Write Useful Comments
Add comments to code when what a piece of code does is not obvious or you want to provide more context on why it exists.
Example
class Domain {
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
/** Determines if this domain is OPS. */
public boolean isOps(){
}
// INSTEAD, DO THIS
public boolean isOps(){
}
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
/** Determines whether this domain is internal. */
public boolean isInternal() {
}
// INSTEAD, DO THIS
/** Answers whether this domain is visible only to internal clients, not to external customers. */
public boolean isInternal() {
}
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
/**
* Answers whether this domain is only visible to internal clients, not to external customers.
*
* @return whether this domain is only visible to internal clients, not to external customers.
*/
public boolean isInternal() {
}
}
Going through the examples above:
isOps -- Assuming that "ops" is a well-understood concept, there's no need to add the comment "Determines if this domain is OPS." This just repeats the class and method names.
isInternal
Repeating the name of the class and/or method name adds no value (i.e. "Determines whether this domain is internal.").
Strict adherence to javadoc norms can be wasteful. E.g. repeating the main javadoc comment in the @return section.
Explanation
Every comment should add value. Call-outs:
Simply repeating the name of the class/and or method name in the comment adds no value.
Don't add a comment simply to follow a standard. Adding value is more important. Valueless comments distract.
If you're commenting on what a method does, consider if renaming the method or reorganizing the code would make it such that no comment would be needed at all.
Don't be afraid to comment on why a code is doing what it's doing. Put yourself in the mindset of someone looking on this code who has no context. Would they be confused? Would their natural instinct be to change it? Is there good reason not to? Is there a good backstory? Add a comment explaining it.
Add a Justifying Comment Whenever You Suppress a Warning
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
@SafeVarargs
public static <T> List<T> doSomething(T... items) {
}
// INSTEAD, DO THIS.
// Suppress justification: items is not being manipulated or returned from the method.
@SafeVarargs
public static <T> List<T> doSomething(T... items) {
}
Explanation
Suppressing a warning is a big step. You're saying that you know better than the compiler. You could be wrong. By explaining why you think you know better, this will both help you get your thoughts straight and justify the decision to readers. If, in the future, a reader comes along and doesn't think the explanation is sound, they can feel safer modifying the code accordingly.
(Almost) Never use Parallel Streams
Don't use Java's in-built parallel stream. If you want to divide-up work, find another way to do it.
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
bicycles.parallelStream()...
// INSTEAD, DO THIS
bicycles.stream()...
Explanation
parallelStream uses the application's single, common fork join pool for associating a thread with each task to perform. Blocking tasks make threads in the pool useless. With enough blocking tasks, overall application performance suffers, to the point of being unable to progress.
Notes:
Yes, the documentation says not to use blocking tasks for parallel streams. However, it's too easy to do this without knowing it. Why accept the risk?
Yes, you can change the size of the common fork join pool. However, at that point, you should just use a custom pool for executing the task.
Yes, it's possible to have parallelStream use a custom thread pool instead of the common one. This would be the only acceptable way to use parallelStream. This is similar to using a "normal" thread pool for "normal" parallelization.
Organize Classes in Java Packages Logically
Organizing code into classes can be difficult enough. Organizing those classes into Java packages can seem more difficult, but with similar principles, it doesn't have to be. Here are some heuristics you can use:
The article describes four different ways to think about organizing classes into Java packages: by component (aka feature), layer, toolkit, and kind. You can even use those ways in combinations to get the best of both worlds. For example, some web apps divide code first by feature and then by layer. (Note: as stated in the article, be extremely wary about organizing code by kind.)
Strive for Logical Cohesion
The classes inside of a Java packages should be logically cohesive.
What is the purpose of the Java package? Try to describe it in a single sentence. Does a class contribute to that purpose? If so, consider including it.
Would the majority of Java package consumers want to use that class? If so, consider including it.
Don't make Java Package Dependencies Circular
Classes in one Java package (call it Consumer) will often depend on those in another Java package (call it Dependency). All of those dependencies should point one direction: don't have the Dependency package also depend on the Consumer package.
Why? Circular dependencies blur the line between the Java packages: Consumer can't do what it does without Dependency... but if the opposite were also true, there would be no distinct identity between the two. The single Java package would really just be masquerading as two. (As a thought exercise imagine that each Java package went in its own Maven package, which forbids circular dependencies.)
Strive for reuse
Imagine the Java package as providing additional functionality that any other package can utilize simply by depending on it.
Isolate Transitive Dependencies
The easiest way to think about this is to consider Maven package dependencies. The more transitive Maven packages that a Java package includes:
The larger the consumer will be
The more likely it is that consumers will run into incompatibility problems with other packages they consume (i.e. dependency hell).
Use Maven dependencies as a heuristic for deciding when to create another package. E.g. imagine that you're developing a Java package without any Maven dependencies so far. Suddenly, you have to add a dependency on an XML parsing library. Not all clients may want this XML functionality. So, ideally, you don't want to force this dependency on them: give consumers a choice. Create another Java package and consider that it's "adding an XML feature to the core Java package you've been creating".
Note: sometimes/usually this is just a thought exercise. To work in reality, this scheme would require a separate Maven package for the separate Java package. That's the only way clients could decide to exclude the transitive Maven dependency. For simplicity, our code doesn't actually do this.
Avoid Generic Constants Classes
Sometimes, it can feel easy to create a single Constants class for the entire application and put all constants inside of it. There are several problems with this organization:
This scheme is similar to the "by kind" organization of Java packages mentioned above. You end up with parts/constants that don't have any cohesion living side by side. The same class ends up having multiple identities, and it becomes difficult to find and group constants of a single purpose.
The number of constants you put in the file can become overwhelming, again making it difficult to find the constant you want.
You're limited in what kinds of constants you can include... unless you allow unnecessary and potentially circular dependencies. (In order for two consuming classes to share a constant, the Constant class and those consumers must all be able to reference the constant's type definition. This is easy if the constant is a primitive or core library type (e.g. String). User-defined types can cause complications, though. Should/can this type definition live alongside/with the Constants class? Does that make logical sense? Should the Constants class depend on the Java package that defines the type, a package that may itself contain one of the Constants consumers?)
To avoid this, think about the following heuristics for deciding where a Constant should go.
Identify a Logical Owner
Is there a single, logical class you can identify for containing this Constant, such that all consumers can depend on it?
For example, consider the maximum value that an int can take on in Java. This value is defined inside of the Integer.MAX_VALUE class itself. Integer makes sense as the logical owner of the class. The JDK, itself, doesn't contains a generic Constants class that defines this value.
Now, as a counter example, you may point to the Collections class, which seems to contain a lot of different types of Constants (e.g. empty lists, sets, maps, etc). You could make the case that those constants rightfully belong elsewhere (e.g. you could argue that the empty list belongs on the List class). You wouldn't be wrong there. However, that doesn't mean that Collections is wrong either. It's serving as a facade, exposing a group of constants that have some logical cohesion: they're all collections of the types defined by the JDK itself. There is some meaning to them, and users would know to go there to find them.
Look to Class Organization
In large methods, people will often group code into blocks and add a comment to describe what the block does. This (as mentioned elsewhere in this document) is a heuristic for deciding when to break-up a large method into smaller ones. The same is true of Constants classes.
If you find yourself vertically separating a group of constants and adding comments to explain what that group is, that's a good sign that this group belong in another class. Each Constant may have its own owner, or all of the Constants may actually belong together... but just in their own class.
Use enums
Using enums over primitive or library classes has the following benefits:
References to the enums persist inside of dependencies. E.g. with Strings, the Java compiler will replace uses of the String constant with its String value, rather than maintaining the reference. This is why a warning message pops up in Eclipse, when you try to find all references of a String constant.
By defining an enum class, this forces you to come up with a good concept name that describes all types of constants in the file. E.g. say there's one group of constants in the file that represents email message types; in that case EmailMessageType might be a good enum name. If there are other constants in the class that don't fit this classification, it could indicate that they belong in their own, separate enum.
Logging
The following guidelines apply to logging.
Don't Log Sensitive Information
When you log, make sure that you're not logging sensitive data, even if it would be helpful to debug. Examples include Personally Identifiable Information (PII), credit card information, etc. Besides obfuscating sensitive information directly in the log statement, for complex objects, make sure all toString implementations obfuscate as well.
Log When Useful, no Less and no More
If you log too little, you won't have enough information to debug issues in production. If you log too much, you'll add noise to the logs, making debugging more difficult. Hitting the sweet spot can be difficult, so here are some rules that can help:
Who, What, When, Where, Why, and How
(From this article) use these six questions as guidance. Can you answer them all with the current logging in place?
Run Thought Experiments
Imagine potential failure scenarios in production. What questions would you ask yourself (e.g. what was the value of variables X, Y, etc.)? Are there enough logs, and do the logs contain enough data to help you answer those questions?
Remember and Capture Context for Exceptions
Exceptions provide useful debugging information, but as you let them propagate, they can lose context. E.g. a NullPointerException from a constructor is useful, it only captures the variable that was null. What about the method that called the constructor? What were its arguments? As necessary, you can supplement exception information by catching it and wrapping it in a new exception, using the new message to capture that context. That way, when the exception is logged, it will provide what you need. (Note: Spring does a great job of this.)
Exception Handling
The following guidelines apply to exception handling.
Don't Catch the Exception Class Without Either Propagating It or Accounting for InterruptedException
Whenever you catch Exception (or InterruptedException directly), you have extra work to do.
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
try {
...
} catch(Exception e) {
// code that logs xor rethrows e as an unchecked exception
}
// INSTEAD, DO THIS
try {
...
} catch(IOException | RuntimeException e) {
// code that logs xor rethrows e as an unchecked exception
}
// OR (RELUCTANTLY) DO THIS
try {
...
} catch(Exception e) {
Thread.currentThread().interrupt();
// code that logs xor rethrows e as an unchecked exception
}
// OR (EVEN MORE RELUCTANTLY) DO THIS
try {
...
} catch(Exception e) {
// code that does something with e
throw e;
}
Explanation
This guideline is from this stackoverflow answer. InterruptedException is a signal that someone wants the current code execution to stop. If you suppress that information, either by swallowing the exception or even re-wrapping the exception in another exception and throwing that, then you have the responsibility to call Thread.currentThread().interrupt(). This call communicates the same information as the InterruptedException: someone wants the current code execution should stop.
The above examples cover the options that you have in this situation. Here they are, listed in order of simplicity/preference:
TODO: We should create library code that handles this for us and have users call that.
Only catch those checked exceptions and RuntimeExceptions that are possible.
Call Thread.currentThread().interrupt() after suppressing the Exception.
Propagate the checked exception (which according to other guidelines in this document, you should avoid doing)
Only Swallow Exceptions with Justifiable Reasons
Catching an exception and ignoring it (note: logging an exception is still ignoring it) is only permitted in the following cases:
The exception is expected or otherwise does not affect the application flow.
Given legitimate errors, there is a reasonable fallback and a mechanism for alarming based on number of occurrences.
Example
The following examples are adapted from actual code in the codebase.
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
public void resetPassword() {
int numResetAttempts;
try {
numResetAttempts = session.getAttribute(ATTEMPT_KEY);
catch (Exception e) {
numResetAttempts = 0;
}
if(numResetAttempts > MAX_ATTEMPTS_ALLOWED) {
// return error view to user that they've tried to reset their password
// too many times and need to contact support for further assistance.
}
}
In this example, the code tries to read, from the session object, the number of times the user has tried to reset their password. If it's greater than the maximum number of times allowed, then the user is shown an error page.
As written, though, if an exception is thrown, this check will always be true. This scenario allows essentially an infinite number of password reset attempts, which is insecure.
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
private static Policy policy = null;
static {
try {
policy = ...;
catch(PolicyException e) {
logger.warn(e.getMessage());
}
}
public static boolean isDangerous(String htmlIn) {
try {
return policy.scan(...);
catch(Exception e) {
logger.error(e.getMessage());
return true;
}
}
In this example, if the static policty-initializer code throws a PolicyException, policy will be set to null. Then, whenever isDangerous is called, the internals will throw a NullPointerException, log it, and then return true. As a result, all html will be considered dangerous.
This code is used for user registration and password resets. Both of those usecases will be broken. (Assuming no log scans are in place) no one will know why until a user reports the issue, potentially days after the application started. Admins will go to the logs, see potentially hundreds of NullPointerExceptions, and have to search backwards for the original cause in the static initializer.
An alternate strategy would have been to let the static intializer exception propagate exception, resulting in failures to initialize the class, which is a hard failure that people would have noticed and could resolve right away.
Explanation
As we saw in the above examples, swallowing an exception may seem like the right thing to do. After all, you're keeping the application running and not showing the user an error view. However, it can also mean that you're silently ignoring a legitimate error and, in the worst case, putting the application into an invalid state such that it still gives users a poor experience.
Your default approach should be to propagate exceptions: hard failure is a faster way to detect and deal with errors. (Ignoring situations where the exception is expected or otherwise does not affect the application, in which case swallowing is okay to do), for situations indicative of real errors, you may only swallow the exception if both of the following are true: there is a legitimate fallback behavior and there is a mechanism in place to detect the error that just happened. For the latter, this may involve logging with a specific pattern of text so that log scanners can pick it up and alarm.
As a code reader, when you see someone swallow an exception, your immediate reaction should be one of suspicion... until you prove to yourself otherwise.
Add a Justifying Comment Whenever you Swallow an Exception
Inline with the previous guideline, whenever you swallow an exception, even if you log a message, you should always add a comment explaining why swallowing is the right thing to do.
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
try {
...
} catch(SomeException e) {
}
// INSTEAD, DO THIS
try {
...
} catch(SomeException e) {
log.error("Message", e);
// Swallow justification: we don't want the entire user experience to fail and have reasonable fallback behavior X for this error
}
Explanation
Swallowing exceptions is a code smell. Readers will naturally wonder whether swallowing indicates a real error that's being unjustifiably ignored and whether they should change the code to throw it.
Log with the Exception as the Second Argument
When you do log an exception, always pass the exception as the second argument.
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
logger.log(exception);
logger.log(exception.getMessage());
logger.log("This is it: " + exception.getMessage());
// INSTEAD, DO THIS
logger.log("Some message, even if it's just a placeholder", exception);
Explanation
Only by passing the logger as the second argument will you get the (very useful) stacktrace. With the other two forms above, you only get the exception message itself, which is less helpful.
Log xor Rethrow an Exception but (Almost) Never Both
When you catch an exception you should (almost always) do exactly one of the following and not both: log it or rethrow.
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
try {
...
} catch(SomeException e) {
log.error("Message", e);
throw e;
}
Explanation
This guideline assumes that there is some generic exception handling code at the entrance point to the application (e.g. in a Controller, in a Filter, or in the Web Server itself). This code both logs and rethrows the exception. It's the only piece of code allowed to do both.
Meanwhile, no other piece of code should do both, because if you do, you'll end up with duplicate log statements for the same exception, adding noise.
(Almost) Never Throw or Propagate Checked Exceptions
Java has it's own official policy on when to use checked exceptions, but ours is more extreme:
You can throw a checked exception from a public method only if you expect all callers to be able to handle it immediately.
You can propagate a checked exception (declared in the signature for a 3rd-party library method) from a public method only for the same reason as above.
The only exception to this rule is in test code.
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
public void myMethod() throws IOException {
throw new IOException("");
}
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
public void myMethod() throws IOException {
Files.newInputStream(...) // this method throws an IOException
}
Explanation
Checked exceptions are annoying:
You have to add them to every caller's function signature, the caller's caller's function signature, etc.... all the way up until you stop propagating them
Counter to their intent, and just like with RuntimeExceptions, there is almost never anything that a caller can do... at least not in the kind of code that we write.
No other programming language since Java has implemented this idea.
As a result, we shouldn't throw checked exceptions ourselves, and we should limit their blast radius by dealing with them immediately from code that we call, rather than allowing them to propagate.
TODO: we should write a utility class that helps transform a checked exception-throwing function into an unchecked version. Add new guidance here once that's done.
Don't Create and Throw RuntimeExceptions
Example
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
throw new RuntimeException("A problem happened here");
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
try {
...
} catch(IOException e) {
throw new RuntimeException(e);
}
Explanation
RuntimeExceptions are generic. They provide no useful extra information about the root cause of what went wrong. Instead, you should create new exception classes or reuse existing ones as appropriate.
TODO: we should write a utility class that handles transforming a checked exception into an unchecked exception. This includes an UncheckedException class, which can be used for wrapping.
Don't use Exceptions for Flow Control
Example
In this example, processor is allowed to be null: that's an acceptable state for the program to be in.
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
int number;
try {
number = processor.getNumber();
} catch(NullPointeException e){
number = 0;
}
// INSTEAD, DO THIS
int number = processor == null ? 0 : processor.getNumber();
Explanation
Exception-handling code is ugly and unnecessarily expensive compared to alternatives. If you expect the exception and have the chance to avoid it instead, you should do that.
Lombok
The following guidelines are specific to usage of the Lombok library.
Use only Accepted Lombok Features
All of the following Lombok features fall under restricted use --
val and var -- our guidelines are similar to those for openjdk itself (for Java 10's var feature, to which these guidelines will also apply in the future) but are more extreme. Coders do not have to use this feature at all, but if they do, then all of the following must be true:
The variable initializer is clear about what the variable's type will be. For example:
// DON'T DO THIS. THIS IS AN ANTI-PATTERN.
var custList = dbconn.executeQuery(query); // The right-hand size gives no type hint, and why would I trust the (incomplete) variable name on the left?
// OKAY TO DO
var outputStream = new ByteArrayOutputStream();
var document = DocumentBuilder.builder().version(...).build();
The right-hand side of the initializer must not use the diamond operator or be a generic method.
The variable must not be a literal.
SneakyThrows -- this feature is not allowed at all. It uses tricks to get around Java's (admittedly annoying) checked type system. Coders unaware of this feature will not understand what's happening, and may not catch the appropriate level of exception for their usecase. There are also real-world consequences in some (admittedly rare) scenarios. Instead of using this feature, we advocate for using utility classes and lambda expressions to the same effect (see the guidance on checked exceptions elsewhere in this document).
experimental -- all experimental features are disallowed (unless listed below). This stance is mostly from a safety perspective; if you can make an argument for a feature, feel free to give it a shot, and we'll consider allowing it. The following experimental features are allowed:
Jacksonized -- this feature gets rid of the annoying boilerplate necessary to use Builders with Jackson (which nets immutable but deserializable types).
For New Packages, Exclude Lombok-generated Code from Checks
Follow the advice here and configuration here to exclude Lombok-generated code from test coverage and findbugs/spotbugs checks. This generated code is trusted, so it's okay that we don't write tests against it and that spotbugs thinks it has issues. In short, create a file named lombok.config in the project's root directory, and include the following contents:
motivation -- explains why you're making this change. It should include a link to something the user can reference for more information (e.g. if you're assigned a GitLab issue tracking this change, link to that). Put yourself in the mind of a coder 10 years down the line.
design -- explains how you're making this change. Put yourself in the mind of the reviewer. What information would help them get a solid footing for how to review this change? What questions might they have that you could answer right away?
testing -- what testing did you do to prove to yourself that the code does what you say it does? Listing it helps you get your thinking straight, and the reviewer will want to know, too.
As a Code Submitter, Review the Code Diff Before Submitting for Review
Before you submit your change for review, look at the code diff yourself, either in git or through GitLab. Yes, you've already checked the code functionally, but you'll be surprised at how often you find something that could be improved. E.g. it could be a small change like removing excess whitespace. It could also be a larger change like forgetting to remove test values, accidentally reformatting a file, etc. Think of this step as a final quality check.
As a Code Reviewer, Feel Free to Push Back on Large Formatting Changes
If a code review has large formatting changes, unrelated to the main change, that interfere with your ability to review it, feel free to ask the submitter to undo the formatting. This is true even if the file was originally in a bizarre format and the submitter merely applied a standard format to it.
Formatting the file in a standard way is definitely a good thing, but lots of formatting changes increases the chance that some unintentional change will be introduced and decreases the likelihood that it'll be caught before being deployed. The right sequence to perform this change is to format the file in one commit and then make the desired changes in another commit (or vice-versa). The reviewer should be able to review both commits.
In the end, this is a judgement call. If the formatting applies to just a few lines, maybe you, as the reviewer, can tell the difference and don't mind. That's up to you, though. We're all working together in good faith to try and release the best code to customers as possible with the minimal effort.
The following guidelines target best/agreed practices for working with source code.
[[TOC]]
General
The following guidelines apply to all code.
These Guildelines Take Precedence over Counterexamples in the Code
Counterexamples to these guidelines that exist in the codebase are not justification for ignoring the guidelines. The guidelines were created after most of the code in the application was written. In addition, it's also not practical to expect every code review to catch every violation.
Follow Effective Java (Unless Otherwise Indicated)
Follow the tenets of the 3rd edition of the book Effective Java... except in the following cases:
Avoid Double Negatives in Conditional Expressions
Example
Explanation
The key in the above example is the presence of the
else
branch, which automatically negates theif
condition. If theif
condition is stated in a negative way, then theelse
branch results in a double negative.Double negatives are confusing to understand. E.g. "I don't not live in the United States" or "I haven't not gone to work early", both of which can be made much simpler by expressing them with the equal positive statements "I live in the United States" and "I have gone to work early (or on time)".
Use CamelCase for Acronyms and Initialisms
Example
Explanation
Using camel case consistently helps readers figure out where one concept ends and another begins. Using all capital letter may feel okay for single and well-known acronyms/initialisms, but problems arise with multiple, less-well-known strung back-to-back.
Notes:
HTTPRequest
,HTTPXMLEntity
, etc. as class names. However, you should still use camel case for the variable names (httpRequest
,httpXmlEntity
, etc.)Use the xToYs Style When Naming Map Variables
Given a Map of type X to Y, name any such Map variable as
xToYs
.Example
Explanation
The naming pattern for other collection class is well-understood: pluralize the item name (e.g.
List<Bicycle> bicycles
). Different people tend to use different strategies for Maps, though, which causes unnecessary confusion. We believe thexToYs
pattern has the most advantages, which is why we want to standardize on it.Notes:
To
concept. E.g. as with other collection classesbike
for conciseness, thebikeToPart
would be okay.oldToParts
andnewToParts
would be okay.Don't Write Long Methods
Longer methods are more difficult for readers to reason about. Readers have to maintain the context of the entire function in their head to understand it fully. The longer the method, the more difficult this is to do, and as the user modifies one section of the code, another section may be affected in an unknown way.
To avoid this issue, break larger methods down into smaller ones. Shoot for a maximum length of 20-30 lines but never more than 50. Also, don't be afraid to create a function of even only a few lines, or sometimes even one line, if you think it makes the method more readable.
Here are some good heuristics for when to move code to a new method:
Minimize Indentation in Methods (to a Max of 2 Levels)
Example
Explanation
Each level of indentation adds complexity. All of the details of every level are exposed to the reader. The reader mentally has to maintain the context of every previous level while creating a new context when encountering the next level. Do the contexts for the outer levels mix with the current one? If so, in what way? Etc.
To reduce complexity, move code blocks to new helper functions. These helper functions both create useful abstractions, allowing the reader to neglect details, and help the reader understand how different levels interact (limited through the function signature). As an example, one thing I almost always do is implement the body of any sufficiently-complex try block in a separate method.
Notes:
Try to Order Method Definitions Like a Newspaper
(From the book Clean Code) order method definitions by level of abstraction, with the caller preceding the callee.
Example
In this example, methods are not organized by access level (i.e. public vs. private vs. ...). Instead, they're organized by abstraction level... almost like a depth-first search. method1 first calls method2 and then method4, so method2's definition comes next in the file. Method2 calls method3, so method 3's definition comes next. Finally, we go back to the second method that method1 called and define method4.
Explanation
The idea is to minimize the vertical separation between dependent methods. E.g. while reading method1 in the example above, if I want to know the details of method2, then I don't have very far to go.
This should be the general rule, anyway. There may be times where there are so many methods, that a strict following of the rule makes the code less readable. There also may be times where the cohesiveness of several functions is more important. These situations should be the outliers, though.
Use "Final" When Necessary (and Only When Necessary)
The guidance aligns with this stackoverflow entry but is more extreme:
final
on instance and static fields as much as possible.final
on classes and methods if you're trying to prevent subclasses or abuse by subclasses.final
on method parameters or local variables (unless java forces you to, like it used to do for anonymous classes, even if the variable was "effectively final"). This keyword clutters the code and adds little justifying value. Yes, having immutable object references is important... but if you can't tell whether a variable is reassigned within a method, that usually means the method is too long.TODO Comments Must be Linked Somewhere Trackable
Example
Explanation
TODO comments that have no tracking link are often useless. The author may have had good intentions (e.g. wanting to improve the code). Unless someone is accountable, though, the comment often just sits in the code. Other developers often have no context and are left wondering why it exists, who plans on making this change, when it can be made, etc.
If you want to add a TODO comment, it must be accompanied by a trackable link. E.g. use a GitLab issue. The link must contain more backstory on why the change was not made in the first place and what needs to be done before it can be made. Management then needs to be made aware of this link if not already, so that they can track it and make sure it gets done eventually.
A good example of using a TODO comment would be trying to implement a project in phases. Assume you're assigned a GitLab issue for the project. Say you make changes for the first phase. As a part of this change, you add a TODO with a link to the GitLab issue for changes you want to make in the second phase. Once you get to the second phase, you make the changes and remove the TODO comment. In this case the TODO comment serves as a sort of "I know this code isn't complete, but I promise to come back and change it... and if I don't, here's how to hold me accountable."
Write Useful Comments
Add comments to code when what a piece of code does is not obvious or you want to provide more context on why it exists.
Example
Going through the examples above:
isOps
-- Assuming that "ops" is a well-understood concept, there's no need to add the comment "Determines if this domain is OPS." This just repeats the class and method names.isInternal
@return
section.Explanation
Every comment should add value. Call-outs:
Add a Justifying Comment Whenever You Suppress a Warning
Example
Explanation
Suppressing a warning is a big step. You're saying that you know better than the compiler. You could be wrong. By explaining why you think you know better, this will both help you get your thoughts straight and justify the decision to readers. If, in the future, a reader comes along and doesn't think the explanation is sound, they can feel safer modifying the code accordingly.
(Almost) Never use Parallel Streams
Don't use Java's in-built parallel stream. If you want to divide-up work, find another way to do it.
Example
Explanation
parallelStream
uses the application's single, common fork join pool for associating a thread with each task to perform. Blocking tasks make threads in the pool useless. With enough blocking tasks, overall application performance suffers, to the point of being unable to progress.Notes:
parallelStream
use a custom thread pool instead of the common one. This would be the only acceptable way to useparallelStream
. This is similar to using a "normal" thread pool for "normal" parallelization.Organize Classes in Java Packages Logically
Organizing code into classes can be difficult enough. Organizing those classes into Java packages can seem more difficult, but with similar principles, it doesn't have to be. Here are some heuristics you can use:
Read this Article
The article describes four different ways to think about organizing classes into Java packages: by component (aka feature), layer, toolkit, and kind. You can even use those ways in combinations to get the best of both worlds. For example, some web apps divide code first by feature and then by layer. (Note: as stated in the article, be extremely wary about organizing code by kind.)
Strive for Logical Cohesion
The classes inside of a Java packages should be logically cohesive.
What is the purpose of the Java package? Try to describe it in a single sentence. Does a class contribute to that purpose? If so, consider including it.
Would the majority of Java package consumers want to use that class? If so, consider including it.
Don't make Java Package Dependencies Circular
Classes in one Java package (call it
Consumer
) will often depend on those in another Java package (call itDependency
). All of those dependencies should point one direction: don't have theDependency
package also depend on theConsumer
package.Why? Circular dependencies blur the line between the Java packages:
Consumer
can't do what it does withoutDependency
... but if the opposite were also true, there would be no distinct identity between the two. The single Java package would really just be masquerading as two. (As a thought exercise imagine that each Java package went in its own Maven package, which forbids circular dependencies.)Strive for reuse
Imagine the Java package as providing additional functionality that any other package can utilize simply by depending on it.
Isolate Transitive Dependencies
The easiest way to think about this is to consider Maven package dependencies. The more transitive Maven packages that a Java package includes:
Use Maven dependencies as a heuristic for deciding when to create another package. E.g. imagine that you're developing a Java package without any Maven dependencies so far. Suddenly, you have to add a dependency on an XML parsing library. Not all clients may want this XML functionality. So, ideally, you don't want to force this dependency on them: give consumers a choice. Create another Java package and consider that it's "adding an XML feature to the core Java package you've been creating".
Note: sometimes/usually this is just a thought exercise. To work in reality, this scheme would require a separate Maven package for the separate Java package. That's the only way clients could decide to exclude the transitive Maven dependency. For simplicity, our code doesn't actually do this.
Avoid Generic Constants Classes
Sometimes, it can feel easy to create a single Constants class for the entire application and put all constants inside of it. There are several problems with this organization:
To avoid this, think about the following heuristics for deciding where a Constant should go.
Identify a Logical Owner
Is there a single, logical class you can identify for containing this Constant, such that all consumers can depend on it?
For example, consider the maximum value that an int can take on in Java. This value is defined inside of the
Integer.MAX_VALUE
class itself.Integer
makes sense as the logical owner of the class. The JDK, itself, doesn't contains a generic Constants class that defines this value.Now, as a counter example, you may point to the
Collections
class, which seems to contain a lot of different types of Constants (e.g. empty lists, sets, maps, etc). You could make the case that those constants rightfully belong elsewhere (e.g. you could argue that the empty list belongs on the List class). You wouldn't be wrong there. However, that doesn't mean thatCollections
is wrong either. It's serving as a facade, exposing a group of constants that have some logical cohesion: they're all collections of the types defined by the JDK itself. There is some meaning to them, and users would know to go there to find them.Look to Class Organization
In large methods, people will often group code into blocks and add a comment to describe what the block does. This (as mentioned elsewhere in this document) is a heuristic for deciding when to break-up a large method into smaller ones. The same is true of Constants classes.
If you find yourself vertically separating a group of constants and adding comments to explain what that group is, that's a good sign that this group belong in another class. Each Constant may have its own owner, or all of the Constants may actually belong together... but just in their own class.
Use enums
Using enums over primitive or library classes has the following benefits:
Logging
The following guidelines apply to logging.
Don't Log Sensitive Information
When you log, make sure that you're not logging sensitive data, even if it would be helpful to debug. Examples include Personally Identifiable Information (PII), credit card information, etc. Besides obfuscating sensitive information directly in the log statement, for complex objects, make sure all toString implementations obfuscate as well.
Log When Useful, no Less and no More
If you log too little, you won't have enough information to debug issues in production. If you log too much, you'll add noise to the logs, making debugging more difficult. Hitting the sweet spot can be difficult, so here are some rules that can help:
Who, What, When, Where, Why, and How
(From this article) use these six questions as guidance. Can you answer them all with the current logging in place?
Run Thought Experiments
Imagine potential failure scenarios in production. What questions would you ask yourself (e.g. what was the value of variables X, Y, etc.)? Are there enough logs, and do the logs contain enough data to help you answer those questions?
Remember and Capture Context for Exceptions
Exceptions provide useful debugging information, but as you let them propagate, they can lose context. E.g. a NullPointerException from a constructor is useful, it only captures the variable that was null. What about the method that called the constructor? What were its arguments? As necessary, you can supplement exception information by catching it and wrapping it in a new exception, using the new message to capture that context. That way, when the exception is logged, it will provide what you need. (Note: Spring does a great job of this.)
Exception Handling
The following guidelines apply to exception handling.
Don't Catch the Exception Class Without Either Propagating It or Accounting for InterruptedException
Whenever you catch Exception (or InterruptedException directly), you have extra work to do.
Example
Explanation
This guideline is from this stackoverflow answer. InterruptedException is a signal that someone wants the current code execution to stop. If you suppress that information, either by swallowing the exception or even re-wrapping the exception in another exception and throwing that, then you have the responsibility to call
Thread.currentThread().interrupt()
. This call communicates the same information as the InterruptedException: someone wants the current code execution should stop.The above examples cover the options that you have in this situation. Here they are, listed in order of simplicity/preference:
Thread.currentThread().interrupt()
after suppressing the Exception.Only Swallow Exceptions with Justifiable Reasons
Catching an exception and ignoring it (note: logging an exception is still ignoring it) is only permitted in the following cases:
Example
The following examples are adapted from actual code in the codebase.
In this example, the code tries to read, from the session object, the number of times the user has tried to reset their password. If it's greater than the maximum number of times allowed, then the user is shown an error page.
As written, though, if an exception is thrown, this check will always be true. This scenario allows essentially an infinite number of password reset attempts, which is insecure.
In this example, if the static policty-initializer code throws a PolicyException, policy will be set to null. Then, whenever isDangerous is called, the internals will throw a NullPointerException, log it, and then return true. As a result, all html will be considered dangerous.
This code is used for user registration and password resets. Both of those usecases will be broken. (Assuming no log scans are in place) no one will know why until a user reports the issue, potentially days after the application started. Admins will go to the logs, see potentially hundreds of NullPointerExceptions, and have to search backwards for the original cause in the static initializer.
An alternate strategy would have been to let the static intializer exception propagate exception, resulting in failures to initialize the class, which is a hard failure that people would have noticed and could resolve right away.
Explanation
As we saw in the above examples, swallowing an exception may seem like the right thing to do. After all, you're keeping the application running and not showing the user an error view. However, it can also mean that you're silently ignoring a legitimate error and, in the worst case, putting the application into an invalid state such that it still gives users a poor experience.
Your default approach should be to propagate exceptions: hard failure is a faster way to detect and deal with errors. (Ignoring situations where the exception is expected or otherwise does not affect the application, in which case swallowing is okay to do), for situations indicative of real errors, you may only swallow the exception if both of the following are true: there is a legitimate fallback behavior and there is a mechanism in place to detect the error that just happened. For the latter, this may involve logging with a specific pattern of text so that log scanners can pick it up and alarm.
As a code reader, when you see someone swallow an exception, your immediate reaction should be one of suspicion... until you prove to yourself otherwise.
Add a Justifying Comment Whenever you Swallow an Exception
Inline with the previous guideline, whenever you swallow an exception, even if you log a message, you should always add a comment explaining why swallowing is the right thing to do.
Example
Explanation
Swallowing exceptions is a code smell. Readers will naturally wonder whether swallowing indicates a real error that's being unjustifiably ignored and whether they should change the code to throw it.
Log with the Exception as the Second Argument
When you do log an exception, always pass the exception as the second argument.
Example
Explanation
Only by passing the logger as the second argument will you get the (very useful) stacktrace. With the other two forms above, you only get the exception message itself, which is less helpful.
Log xor Rethrow an Exception but (Almost) Never Both
When you catch an exception you should (almost always) do exactly one of the following and not both: log it or rethrow.
Example
Explanation
This guideline assumes that there is some generic exception handling code at the entrance point to the application (e.g. in a Controller, in a Filter, or in the Web Server itself). This code both logs and rethrows the exception. It's the only piece of code allowed to do both.
Meanwhile, no other piece of code should do both, because if you do, you'll end up with duplicate log statements for the same exception, adding noise.
(Almost) Never Throw or Propagate Checked Exceptions
Java has it's own official policy on when to use checked exceptions, but ours is more extreme:
Example
Explanation
Checked exceptions are annoying:
As a result, we shouldn't throw checked exceptions ourselves, and we should limit their blast radius by dealing with them immediately from code that we call, rather than allowing them to propagate.
TODO: we should write a utility class that helps transform a checked exception-throwing function into an unchecked version. Add new guidance here once that's done.
Don't Create and Throw RuntimeExceptions
Example
Explanation
RuntimeExceptions are generic. They provide no useful extra information about the root cause of what went wrong. Instead, you should create new exception classes or reuse existing ones as appropriate.
TODO: we should write a utility class that handles transforming a checked exception into an unchecked exception. This includes an UncheckedException class, which can be used for wrapping.
Don't use Exceptions for Flow Control
Example
In this example, processor is allowed to be null: that's an acceptable state for the program to be in.
Explanation
Exception-handling code is ugly and unnecessarily expensive compared to alternatives. If you expect the exception and have the chance to avoid it instead, you should do that.
Lombok
The following guidelines are specific to usage of the Lombok library.
Use only Accepted Lombok Features
All of the following Lombok features fall under restricted use --
val
andvar
-- our guidelines are similar to those for openjdk itself (for Java 10'svar
feature, to which these guidelines will also apply in the future) but are more extreme. Coders do not have to use this feature at all, but if they do, then all of the following must be true:The variable initializer is clear about what the variable's type will be. For example:
The right-hand side of the initializer must not use the diamond operator or be a generic method.
The variable must not be a literal.
SneakyThrows
-- this feature is not allowed at all. It uses tricks to get around Java's (admittedly annoying) checked type system. Coders unaware of this feature will not understand what's happening, and may not catch the appropriate level of exception for their usecase. There are also real-world consequences in some (admittedly rare) scenarios. Instead of using this feature, we advocate for using utility classes and lambda expressions to the same effect (see the guidance on checked exceptions elsewhere in this document).Jacksonized
-- this feature gets rid of the annoying boilerplate necessary to use Builders with Jackson (which nets immutable but deserializable types).For New Packages, Exclude Lombok-generated Code from Checks
Follow the advice here and configuration here to exclude Lombok-generated code from test coverage and findbugs/spotbugs checks. This generated code is trusted, so it's okay that we don't write tests against it and that spotbugs thinks it has issues. In short, create a file named
lombok.config
in the project's root directory, and include the following contents:Code Submission/Review
This following guidelines apply to code submission/review.
As a Code Submitter, use a Standard Template for the Commit Message
Use the following template for your commit message:
Here is what each section means:
As a Code Submitter, Review the Code Diff Before Submitting for Review
Before you submit your change for review, look at the code diff yourself, either in git or through GitLab. Yes, you've already checked the code functionally, but you'll be surprised at how often you find something that could be improved. E.g. it could be a small change like removing excess whitespace. It could also be a larger change like forgetting to remove test values, accidentally reformatting a file, etc. Think of this step as a final quality check.
As a Code Reviewer, Feel Free to Push Back on Large Formatting Changes
If a code review has large formatting changes, unrelated to the main change, that interfere with your ability to review it, feel free to ask the submitter to undo the formatting. This is true even if the file was originally in a bizarre format and the submitter merely applied a standard format to it.
Formatting the file in a standard way is definitely a good thing, but lots of formatting changes increases the chance that some unintentional change will be introduced and decreases the likelihood that it'll be caught before being deployed. The right sequence to perform this change is to format the file in one commit and then make the desired changes in another commit (or vice-versa). The reviewer should be able to review both commits.
In the end, this is a judgement call. If the formatting applies to just a few lines, maybe you, as the reviewer, can tell the difference and don't mind. That's up to you, though. We're all working together in good faith to try and release the best code to customers as possible with the minimal effort.