HumanLearning2021 / HumanLearningApp

GNU General Public License v3.0
3 stars 2 forks source link

Code review - Human Learning #151

Open codeofmochi opened 3 years ago

codeofmochi commented 3 years ago

# Code review

This is a code review for the code of the whole project on main branch at commit fdae25e1b1

Of course, we don't expect you to change all of your code (some parts of the review can be subjective or arguable anyway), what’s important is that you understand why better alternatives may exist.

Please ask us if you have any questions about the code review, whether it is by writing a comment below or by sending us an email or meeting with us.

Important : if you try to fix these, don't break your app. We prefer that the app works even if you don't manage to fix anything. However, we still expect you to refactor your code as much as possible.

General remarks

Packages structure

In general your packages are nicely structured and there is a clear hierarchy defined.

You may introduce subpackages for interfaces, so as to further elaborate the hierarchy, i.e. one subpackage for the interfaces and one for the implementing classes (often called interfaces and impl). It seems weird to have your database abstractions located in the model package, consider adding a providers or services top-level package and moving these files there, and in which you could then have implementation subpackages interfaces, firestore, and offline for instance.

Documentation

We noticed that a lot of classes and methods lack documentation. We do require you to properly document all methods and classes. This holds for every team member. Even UI classes, which you consistently do not comment on, will greatly benefit from comments, as they are difficult to read and understand.

We therefore had some trouble understanding your code and evaluating its implementation as without formal specification of what you’re trying to accomplish, it is very difficult to verify if it adheres to it.

All public classes, methods and fields must be documented. Use KDoc conventions: a top-level block for general description and metadata, and per-method header with annotations (@param, @return, @throws, ...). You can omit KDoc on overriden methods if they are themselves documented. Regarding the implementation itself, of course you shouldn’t comment on each line of code, but rather only explain sections of code that cannot be understood just by reading the code with the help of a relevant comment.

Code style

It seems you don’t use the integrated auto-formatter. You can automatically format all of your code by right-clicking on the com.github.HumanLearning20201.HumanLearningApp package and hitting “Reformat code” and “Optimize imports”, and with Ctrl+Alt+L and Ctrl+Alt+O on a per-file basis. Ideally, you should do it at least every time you finish a feature, before requiring a merge to the main branch.

In particular, you seem to very often exceed the default wrapping limit of 80 columns, which makes your code very “wide”. You also have inconsistent use of spaces and indentation, as well as many unused imports. Use the auto-formatter and import optimizer to ensure sane conventions across your team, as well as all Android developers (for instance if a new developer was to join the team in the future).

You often don’t use the functional features of Kotlin. This makes your code much more verbose (for instance you often rebuild mutable collections to filter them, instead of simply calling .filter with a predicate). Kotlin is heavily inspired by Scala, so you should remember from your functional programming classes that these features will help you reduce boilerplate and chain operations using function composition. The collections API is very similar to Scala’s.

Code quality & performance

Leverage your IDE to do the hard work for you: you should also use the integrated lint issues detection and code analysis by using Analyze > Inspect Code > Whole project. Some reported problems may not actually be important, however it seems you have many Kotlin warnings. Make sure to fix as many of those lints as possible.

For instance, you have some instances of redundant null checks and nullable return types that can be converted to strict types. It is a good thing that you practice defensive programming, however here the compiler will guarantee stricter rules about the soundness of your types to be enforced, such as non-nullable types that can never be null.

It seems you use exceptions in a lot of places in your code. In Kotlin, this is an anti-pattern (see https://kotlinlang.org/docs/exceptions.html) and you should avoid relying on them to express potential return outcomes from functions. Instead, you should use nullable types (if your error can be summarized with the Option={Some|None} pattern) or the Result monad if more complex cases are needed. Note also that try/catch is a valued expression in Kotlin and not a statement.

In particular, since you use coroutines which are dispatched as asynchronous operations, it is a bad idea to mix synchronous constructs (such as exceptions) with asynchronous execution. By using the return value instead as an indicator of potential failure, you ensure that you obtain any possible outcome at the expected return time (i.e. when the coroutine yields back to the main thread).

You also use a lot of non-null assertions (!!) that may crash the app. Consider using the smart features of null safety in Kotlin and prefer using the Elvis operator ?., ?.let checks and ?.mapNotNull and similar for collections. Also, you should adopt a convention on when to use nullable types, and when to ensure null safe types (you have inconsistent return types regarding nulls within each file).

Class hierarchy

You define your models using interfaces: in particular, they are all defined as interfaces containing vals to be overridden. This is very common in structural subtyping languages (such as Typescript or Python), however this does not scale well with nominal subtyping based languages, which Java and Kotlin belong to.

Indeed, you redefine the shape of the objects for each specific underlying use case: for instance, interface Category is overridden by concrete implementations FirestoreCategory, DummyCategory and OfflineCategory. First, this way of writing code generates a lot of files and boilerplate (we can see all your models are duplicated in each service package). It also means that your implementation of business logic is never independent: it makes it difficult to reason about with domain-specific code, and changing providers would be a pain: imagine your client in the future decides that you should not use Google products anymore, you would hence need to rewrite all your data classes.

In nominally-typed languages, it is better to Keep your models Simple Stupid. As such, entities are mostly modeled using Plain Old Java/Kotlin Object (POJO / POKO), and this makes sense as you want your data to have an instantiable representation which you can manipulate purely with your language of choice. This makes your business logic independent from the rest of your systems (UI, storage, …). If you need an extended representation for storage, consider using the Adapter or Decorator pattern instead. Also, using plain objects will remove the need for your DummyX classes, which will become the actual models themselves. We strongly recommend you implement your models using Kotlin data classes.

Note that this problem would not appear in a structurally-typed language: since the compiler would inspect the exact shape of the object and its properties, you wouldn’t need to explicitly redefine the relationship between your “core” object (i.e. the mathematical representation of your entity stripped of any implementation-specific property) and the extended objects that are actually stored on the underlying medium. This is why you may often see models defined as interfaces in e.g. Typescript, but rarely in Kotlin or Java. In those languages, interfaces usually model behaviour, while (abstract) classes model state.

Classes

DatabaseManagement and DatabaseService

These 2 interfaces are extremely similar, and each have concrete implementations. We don’t really understand their purpose: it seems they are huge interfaces to define access methods to the database.

Both interfaces are extremely similar: in fact, almost all signatures are identical, and DatabaseService defines a few more to manage users. This is a nasty code smell: it either means you duplicate functionalities, or that the boundaries of your modules are not clearly defined and overlap. We can see that in all your concrete implementations of DatabaseManagement, you mostly forward the calls to the corresponding instance of DatabaseService, and we do not understand why you add this overhead. To us it reads like an indirection that doesn’t provide abstraction.

It also feels like these interfaces are becoming “god classes” which will contain anything related to data access. Although this is fine for the scope of your project, it will not scale to dozens or hundreds of entity types. You may want to have a look at the Repository pattern, which groups accesses per entity or aggregate of entities, see https://docs.microsoft.com/en-us/dotnet/architecture/microservices/microservice-ddd-cqrs-patterns/infrastructure-persistence-layer-design, which in your case would group for instance picture accesses into one Repository, datasets into another, etc...

We argue that the interface/class name DatabaseManagement is not semantically precise, and seems to highlight the issue we explained above. If you name something a Manager, it probably means that either its purpose is too vague, or that a better name can be found. Have a look at this article: https://blog.codinghorror.com/i-shall-call-it-somethingmanager/

In your DatabaseService, your method updateUser takes a FirebaseUser as input: this should not happen in an interface, because it closely couples the behaviour to the proprietary types from Firebase. It should take a User instead. In general, interfaces should allow you to swap the underlying systems very easily as it is their goal to abstract the implementation details away.

Otherwise, the interfaces are well documented and you did a good job abstracting the access to Firebase. This is the kind of documentation that we would like you to provide throughout your code.

FirestoreDatabaseManagement

FirestoreDatabaseService

CachedFirestoreDatabaseManagement

FirestoreCategorizedPicture

FirestoreCategory

FirestoreDataset

FirestoreDocument

FirestoreUser

modules

CategorizedPicture

Category

Converters

Dataset

DummyCategorizedPicture

DummyCategory

DummyDatabaseManagement

DummyDatabaseService

DummyDataset

DummyUser

UniqueDatabaseManagement

User

CachePictureRepository

OfflineCategorizedPicture

OfflineCategory

OfflineConverters

OfflineDatabaseManagement

OfflineDatabaseService

OfflineDataset

OfflineUser

PictureRepository

AuthenticationPresenter

LearningPresenter

CrossRefs

Daos

Entities

Relations

RoomOfflineDatabase

RoomTypeConverters

AddPictureFragment

CategoriesEditingFragment

DatasetsOverviewFragment

DisplayDatasetFragment

DisplayImageFragment

DisplayImageSetFragment

SelectPictureFragment

TakePictureFragment

DatasetListRecyclerViewAdapter

DatasetListWidget

LearningAudioFeedback

LearningDatasetSelectionFragment

LearningFragment

LearningSettingsFragment

GoogleSignInWidget

HomeFragment

MainActivity

Testing

Conclusion

Overall, your code has several flaws that make it hard to read and to maintain. Although you had some great ideas for abstracting and modularizing your system, their implementations are clunky which in the end do not solve the complexity problem.

For instance, you should probably refactor your models so that their definitions are data classes and not interfaces, since Kotlin is not structurally sub-typed. Otherwise, this generates a lot of duplication and boilerplate. Similarly, although your database interfaces describe the behavior nicely, you have a confusion between your so-called “DatabaseService” and “DatabaseManagement”, which you could solve more elegantly with the Repository pattern. Many problems result from these design decisions, such as type leakages and contract violations and class roles duplication.

You should also attempt to simplify your dependency injection: it should lessen the burden on the programmer for binding dependencies. In general, abstractions are great but they should only be used if they can provide a new meaningful name to a concept and hide implementation details.

Another example is many code snippets can be simplified through the use of the functional API readily available on the Kotlin collections. Null safety and exceptions are also completely different from Java, so you may want to read a bit more about them in the Kotlin docs.

As a general guideline, try to read a bit more examples for instance from the Android documentation, as well as the practices for the environment you work in (each framework and each language has its own quirks and ways to do things). You may also find out that many of the problems that you face have already been solved through libraries, design patterns or recommended best practices.

Please make a habit to write documentation as you write code: this will also help your teammates in getting familiar with the whole codebase. Also, make sure to be thorough in your code reviews: when you write or review code, you must take responsibility for it, i.e. try to put yourself in the shoes of someone else that has never seen your code before: is it readable? Is it easy to reason about? Are modules self-contained? Which classes should handle what? Try to draw clear boundaries between your functionalities.

Let us know if you have any questions or remarks with the comments below.

bbjubjub2494 commented 3 years ago

In particular, since you use coroutines which are dispatched as asynchronous operations, it is a bad idea to mix synchronous constructs (such as exceptions) with asynchronous execution. By using the return value instead as an indicator of potential failure, you ensure that you obtain any possible outcome at the expected return time (i.e. when the coroutine yields back to the main thread).

I take issue with the statement that exceptions are synchronous constructs in the specific context of Kotlin coroutines: according to the official documentation, exceptions can be used in coroutines by design, and they will propagate to the semantic caller of the function (or the parent coroutine if applicable). This also reflected in our experience with the framework: we never had exceptions leak out of the coroutine and we were able to reliably catch them upstack.

Of course you presented other reasons to use a pseudomonad which we will consider.

Nitwix commented 3 years ago

Thanks for the feedback! It already seemed to me that DatabaseService & DatabaseManagement didn't really make lots of sense to have separately. We will probably refactor this if the team decides it is worth the time it will take. The suggestion to change interfaces to data classes where applicable also makes a lot of sense to me, and won't take too long to refactor I think. Regarding the use of functional APIs, I don't know if we'll take the time to change everything in the code that we have already written, but we will certainly try to use it more in the future.

Would you be ok if we do a refactoring sprint in the future, where we mostly focus on refactoring ? Or does this go against the Scrum principles ?

jonaskon commented 3 years ago

It seems your offline mechanism relies on downloading entire databases, and keeping track of these “meta-entities”. While this may be useful for complex systems with versioning, it seems over-engineered for the scale of your app. Wouldn’t it make more sense to download the entities themselves and simply implement some synchronization mechanism?

I do not really understand this comment, we wanted to download an entire database by design. The objective is to be able to give a user the possibility to learn entirely offline (i.e the same way you would download a Spotify playlist) and the best way I found to do so was using the Room library. Do you mean that it would have been better to do it differently?

jonaskon commented 3 years ago

Some of the (very valid) comments about the whole DatabaseService, DatabaseManagement... structure have been addressed by the refactoring done in last weeks sprint (pull request #138) I believe.

codeofmochi commented 3 years ago

I take issue with the statement that exceptions are synchronous constructs in the specific context of Kotlin coroutines: according to the official documentation, exceptions can be used in coroutines by design, and they will propagate to the semantic caller of the function (or the parent coroutine if applicable). This also reflected in our experience with the framework: we never had exceptions leak out of the coroutine and we were able to reliably catch them upstack.

This is true, however there are subtleties regarding the exact handling mechanism. See https://kotlinlang.org/docs/exception-handling.html. Of course it is very possible to rely on exceptions in Kotlin, at the end it's a matter of choice and consistency. As a side note, functional advocates sometimes consider exceptions to be the reincarnation of goto as they make it very easy to break control flow. The idea is that return types render error handling explicit, so as to avoid stacktrace hell while maintaining the project as the code grows. Kotlin doesn't support checked exceptions for that reason, and generally reserves this mechanism for runtime failures that cannot be caught at compile time.

Would you be ok if we do a refactoring sprint in the future, where we mostly focus on refactoring ? Or does this go against the Scrum principles ?

As you have guessed, with Scrum you should deliver new features every sprint (generally the customer does not see refactors, and he may not even understand the value behind them). One great way to explain them in that scenario is to show that time and money will be saved in the future by doing so. For this course, we recommend you pick a few new light features while refactoring.

I do not really understand this comment, we wanted to download an entire database by design. The objective is to be able to give a user the possibility to learn entirely offline (i.e the same way you would download a Spotify playlist) and the best way I found to do so was using the Room library. Do you mean that it would have been better to do it differently?

This is a valid approach and there's nothing inherently wrong with it. We simply wanted to suggest a simpler alternative as your code to handle that is relatively complex.

Let us know if you have any more questions and keep up the good work :slightly_smiling_face: