johnb9682 / tpcc-soccer-manager-service

Spring Boot backend service

Code Quality/Security/Performance Improvement Proposal #6

Open MaxPoon opened 2 years ago

MaxPoon commented 2 years ago

I took a look at this backend codebase and want to share some ideas for potential improvements. I've never used Spring Boot or Hibernate before, so bear with me if any of these suggestions don't make sense :) I'm happy to discuss more in this thread.

Code Quality

Don't do the manual "time math"

Several places in the codebase do manual "time math" like this:

Timestamp createTime = new Timestamp((System.currentTimeMillis()/1000)*1000L);

There are a few reasons why we should avoid this:

If you really have to do time conversion, use the existing APIs in java.time.*:

Timestamp createTime = new Timestamp(Instant.now().toEpochMilli());
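
If the truncation to whole seconds in the original expression is intentional, java.time can express that too. Here is a minimal sketch; the class and method names (Timestamps, nowTruncatedToSeconds) are made up for illustration and are not from the codebase:

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper class, named for illustration only.
final class Timestamps {
    private Timestamps() {}

    // Current time with the sub-second part dropped, which is what the
    // manual (millis / 1000) * 1000 arithmetic produces for present-day times.
    static Timestamp nowTruncatedToSeconds() {
        return Timestamp.from(Instant.now().truncatedTo(ChronoUnit.SECONDS));
    }
}
```

The intent ("truncate to seconds") is then spelled out in the code instead of being implied by integer division.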

But in this codebase, most of the timestamp values are created to be stored in the database as create/update time:

@Column(name = "create_time")
@NotNull
@jdk.jfr.Timestamp
private Timestamp createTime;

You can let Hibernate set these values automatically with @CreationTimestamp/@UpdateTimestamp:

@Column(name = "create_time")
@NotNull
@jdk.jfr.Timestamp
@CreationTimestamp
private Timestamp createTime;

Use enum to represent status

Currently we use an integer to represent the different statuses in the codebase:

@Column(name = "status")
@NotNull
private Integer status; // 0 is pending; -1 is rejected; 1 is accepted;

This approach could work, but is not ideal:

It'd be better to use an enum to represent the status:

// When the enum is stored by its ordinal, each value maps to its position in this list.
// To keep that mapping stable, do not remove or reorder existing values; only append new ones at the bottom.
public enum Status {
   PENDING,
   REJECTED,
   ACCEPTED
}

@Column(name = "status")
@NotNull
private Status status;

This is supported by Hibernate. See more at: https://thorben-janssen.com/hibernate-enum-mappings/

The code then reads almost like plain English wherever the field is used: if (invitation.getStatus().equals(ACCEPTED)).
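
One thing to watch: with the default ordinal mapping, the enum above would be stored as PENDING = 0, REJECTED = 1, ACCEPTED = 2, which does not match the existing 0 / -1 / 1 values in the status column. A possible way around that, sketched below, is to give each enum value an explicit code. The name InvitationStatus and both helper methods are my own assumptions, not existing code:

```java
// Hypothetical enum that keeps the integer codes already used in the table.
enum InvitationStatus {
    PENDING(0),
    REJECTED(-1),
    ACCEPTED(1);

    private final int code;

    InvitationStatus(int code) {
        this.code = code;
    }

    // Integer to write to the status column.
    int getCode() {
        return code;
    }

    // Enum value for an integer read from the status column.
    static InvitationStatus fromCode(int code) {
        for (InvitationStatus status : values()) {
            if (status.code == code) {
                return status;
            }
        }
        throw new IllegalArgumentException("Unknown status code: " + code);
    }
}
```

A JPA AttributeConverter could call getCode() and fromCode() to translate between the column and the enum, so the stored values stay as they are today. I have not tried this against this codebase.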

Classes should be final by default

Designing and documenting for inheritance is hard and error-prone. In the following example, it's hard to figure out what type of helper instance we are getting if Helper is a non-final class that can be sub-classed:

Helper helper = foo.getHelper();

If we really want a class to be sub-classable, we can carefully design an abstract base class. But most of the time, we can simply mark the class as final to prohibit inheritance, e.g. change

public class TeamController {}

to

public final class TeamController {}

Public methods should return Optional instead of a nullable value

Sometimes we return null from a method:

if (id == -1) return null;

Without looking into the implementation details, callers won't know that they may get null, and this can easily cause a NullPointerException. Tony Hoare (the inventor of the null reference) called it his "billion-dollar mistake" :)

An approach to prevent this is to mark the method as @Nullable:

@Nullable
public EventParticipantResponse deleteParticipant(int userId, int eventId) {}

Many IDEs (e.g. IntelliJ) show a warning if the caller uses the returned value without a null check. However, most developers ignore warnings...

So an even better approach is to let the method return Optional:

public Optional<EventParticipantResponse> deleteParticipant(int userId, int eventId) {
   if (id == -1) return Optional.empty();
   // ...
   return Optional.of(eventParticipantResponse);
}

This forces the callers to explicitly handle the empty case:

Optional<EventParticipantResponse> response = deleteParticipant(userId, eventId);
if (response.isPresent()) {
   // do something with response.get()
} else {
   // do something else
   // calling response.get() here would throw NoSuchElementException
}
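
Optional also supports a more compact style with map() and orElse(), which avoids calling get() at all. A small self-contained sketch; ParticipantLookup, findName, greeting and the sample data are invented for illustration:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical example class; the map stands in for a database table.
final class ParticipantLookup {
    private static final Map<Integer, String> NAMES = Map.of(1, "Alice", 2, "Bob");

    private ParticipantLookup() {}

    // Returns an empty Optional instead of null when the user is unknown.
    static Optional<String> findName(int userId) {
        return Optional.ofNullable(NAMES.get(userId));
    }

    // The caller handles both cases in one expression, without get().
    static String greeting(int userId) {
        return findName(userId)
                .map(name -> "Hello, " + name)
                .orElse("Unknown user");
    }
}
```

Here greeting(1) gives "Hello, Alice" and greeting(42) gives "Unknown user".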

Use immutable collections by default

Most of the built-in Java collections are mutable (e.g. you can mutate an ArrayList by calling .add()/.remove()). But it's dangerous to expose mutable data through a public API. For example, the following code snippet looks fine at first glance. It caches data read from the database and returns the cached values to callers:

public final class Foo {
   private List<String> cache;

   public List<String> getValues() {
      if (cache == null) {
         cache = fetchValuesFromDb();
      }
      return cache;
   }
}

However, if one of the callers modifies the returned value: foo.getValues().add("a random string");, getValues() will start to return the wrong value to the other callers.

We can avoid this problem by returning an ImmutableList (from Guava) instead. It throws an exception if the caller tries to mutate the data:

public final class Foo {
   private ImmutableList<String> cache;

   public ImmutableList<String> getValues() {
      if (cache == null) {
         cache = ImmutableList.copyOf(fetchValuesFromDb());
      }
      return cache;
   }
}
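
ImmutableList in the snippet above comes from Guava. If adding that dependency is not an option, the JDK's List.copyOf (Java 10+) gives similar runtime protection. A sketch under that assumption; ValueCache and the stubbed fetchValuesFromDb are placeholders:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cache class, mirroring Foo above with JDK-only collections.
final class ValueCache {
    private List<String> cache;

    // Stand-in for the database read; returns a fresh mutable list.
    private List<String> fetchValuesFromDb() {
        return new ArrayList<>(List.of("a", "b"));
    }

    List<String> getValues() {
        if (cache == null) {
            // List.copyOf returns an unmodifiable copy; add()/remove() on it
            // throw UnsupportedOperationException.
            cache = List.copyOf(fetchValuesFromDb());
        }
        return cache;
    }
}
```

The trade-off is that the return type is still plain List, so the immutability is not visible in the method signature the way it is with ImmutableList.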

There are two more positive side effects of using immutable collections:

Methods should be private by default

Currently most of the methods in the codebase are public. When you implement a class, its public methods are the API you allow callers to use. We want to keep the public API as simple as possible so that the callers (your teammates) don't have to spend a long time figuring out which method they should use. In other words, methods should be private unless they are designed to be public.

Security

Don't store passwords in plaintext

Don't store passwords in plaintext in the database. Otherwise, attackers can log in as a real user if they manage to retrieve the passwords from your database.

The appropriate way is to store a salted hash of the password. Every time you need to verify a user, you hash the password they submit with the same salt and compare the result with the hash stored in your database. Even if an attacker manages to retrieve the data from your database, they can't log in with a hash :) See more at: https://www.vaadata.com/blog/how-to-securely-store-passwords-in-database/
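
To make the salt-and-hash idea concrete, here is a rough sketch using PBKDF2 from the JDK. The class name PasswordHasher, the iteration count and the key/salt sizes are assumptions I picked for the example, not tuned recommendations. Since the project already uses Spring, a ready-made encoder such as Spring Security's BCryptPasswordEncoder is probably the more practical choice.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper; store both the salt and the hash for each user.
final class PasswordHasher {
    private static final int ITERATIONS = 100_000; // assumed work factor
    private static final int KEY_BITS = 256;
    private static final int SALT_BYTES = 16;

    private PasswordHasher() {}

    // A fresh random salt per user, so equal passwords get different hashes.
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    static byte[] hash(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is unavailable", e);
        }
    }

    // Re-hash the attempt with the stored salt and compare the two hashes.
    static boolean verify(char[] attempt, byte[] salt, byte[] expectedHash) {
        return MessageDigest.isEqual(hash(attempt, salt), expectedHash);
    }
}
```

At sign-up you would call newSalt() and hash(), then persist both values; at login you would call verify() with the stored salt and hash.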

Verify user identification before performing the operations

Currently none of the controller endpoints has any user identification logic. An attacker can send a request directly to any of these endpoints without going through any authentication process. For example, they can call the TeamController.deleteTeam() endpoint to delete a team. We should verify that the client has a valid user token and is authorized to perform the operation. See more at: https://spring.io/projects/spring-security

Performance

Don't read the whole database table

It's a common pattern in this codebase to read the whole database table into a list and find one entity in that list:

List<TeamMember> members = (List<TeamMember>) teamMemberRepository.findAll();
int id = -1;
TeamMember member = new TeamMember();
for (TeamMember tm: members){
   if (tm.getUserId() == userId && tm.getTeamId() == teamId) {
      if (tm.getIsLeader() == 1){
         throw new LeaderException();
      }
      id = tm.getTeamMemberId();
      member = tm;
      break;
   }
}

This is fine when you only have a few entities in your table. But if you want to build a scalable system that can support millions of users, this is not going to work. You will quickly run out of CPU/memory/network resources if you scan through millions of rows for each user request, and that request will probably time out. For the example above, you can define a TeamMemberRepository.findByUserIdAndTeamId(int userId, int teamId) method to do the same thing. That way, the database can quickly find the entity if the table is properly indexed, and the server only needs to fetch one entity from the database. It becomes a much more affordable operation.
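
Here is roughly how the calling code could look once such a method exists. As far as I understand, Spring Data derives the query from the method name, so declaring the method on the repository interface should be enough. To keep the sketch runnable without Spring or a database, it uses an in-memory stand-in: TeamMemberLookup, InMemoryTeamMemberLookup, TeamMemberService and the simplified TeamMember are all hypothetical, and IllegalStateException stands in for LeaderException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Minimal stand-in for the entity; the real TeamMember has more fields.
final class TeamMember {
    final int teamMemberId;
    final int userId;
    final int teamId;
    final boolean leader;

    TeamMember(int teamMemberId, int userId, int teamId, boolean leader) {
        this.teamMemberId = teamMemberId;
        this.userId = userId;
        this.teamId = teamId;
        this.leader = leader;
    }
}

// Hypothetical lookup contract with the method proposed above.
interface TeamMemberLookup {
    Optional<TeamMember> findByUserIdAndTeamId(int userId, int teamId);
}

// In-memory implementation; the map plays the role of an index on (user_id, team_id).
final class InMemoryTeamMemberLookup implements TeamMemberLookup {
    private final Map<String, TeamMember> index = new HashMap<>();

    void save(TeamMember member) {
        index.put(key(member.userId, member.teamId), member);
    }

    @Override
    public Optional<TeamMember> findByUserIdAndTeamId(int userId, int teamId) {
        return Optional.ofNullable(index.get(key(userId, teamId)));
    }

    private static String key(int userId, int teamId) {
        return userId + ":" + teamId;
    }
}

final class TeamMemberService {
    private TeamMemberService() {}

    // Replaces the findAll() loop: one targeted lookup, then the leader check.
    static Optional<Integer> findNonLeaderMemberId(TeamMemberLookup lookup, int userId, int teamId) {
        Optional<TeamMember> found = lookup.findByUserIdAndTeamId(userId, teamId);
        if (found.isPresent() && found.get().leader) {
            throw new IllegalStateException("member is the team leader"); // stands in for LeaderException
        }
        return found.map(member -> member.teamMemberId);
    }
}
```

The loop, the -1 sentinel and the placeholder new TeamMember() all disappear, and the "not found" case is carried by the Optional.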

johnb9682 commented 2 years ago

@MaxPoon Thank you for the valuable suggestions. @JaykingLJK @Song0180 @WangZixin67 Let's try to change our code based on Max Da Ge's feedback.

JikunLiu101 commented 2 years ago

@MaxPoon Thank you Max Da Ge for the suggestions!