hanami / model

Ruby persistence framework with entities and repositories
http://hanamirb.org
MIT License

Make entities immutable #294

Closed: jodosha closed this issue 8 years ago

jodosha commented 8 years ago

Please note that this is a proposal. We want to discuss first and evaluate if this will be implemented and included in the next release.


Mutable state is known to be difficult to handle and dangerous for concurrency.

This is a proposal to remove the following features from Hanami::Entity:

Besides these changes, we should probably amend the Repository#update signature to accept the id and the data.

If we follow the principle "data in, entities out" (described by https://github.com/hanami/model/issues/291), entities are only the output of database operations.

Please check one by one the following CRUD examples and spot the points where we need to mutate an entity.

Create

def call(params)
  book = BookRepository.new.create(params[:book])
  redirect_to routes.books_url(id: book.id)
end

Read

expose :book

def call(params)
  @book = BookRepository.new.find(params[:id])
end

The associated view/template only needs getters to present data.

Update

def call(params)
  book = BookRepository.new.update(params[:id], params[:book])
  redirect_to routes.books_url(id: book.id)
end

Delete

def call(params)
  book = BookRepository.new.destroy(params[:id])
  redirect_to routes.books_url
end

For default CRUD operations we don't need to mutate the state of an entity. I'm looking for feedback to understand where mutation (and dirty tracking) can be still useful.


/cc @hanami/core-team @hanami/contributors

davydovanton commented 8 years ago

I like this :+1: The update method looks much better.

booch commented 8 years ago

:+1: You guys seem to be headed in the same direction I'm wanting to go.

If mutability is found to be needed in some corner case, it should be opt-in. Or even better, have a copy/dup method that takes a hash of which attributes to change in the new copy.

solnic commented 8 years ago

I'm so happy about this proposal! :+1: * 1000

pascalbetz commented 8 years ago

Useful for:

- audit trails
- user guidance (i.e. please change your password)

All of which can of course be done without dirty tracking in entities, but it might be more complex.

pascalbetz commented 8 years ago

But i like the idea:-)

jodosha commented 8 years ago

@pascalbetz

audit trails / user guidance (i.e. please change your password)

Can you please expand these examples? Maybe with some pseudocode. Thank you! :smile:

runlevel5 commented 8 years ago

:100: on this proposal

pascalbetz commented 8 years ago

@jodosha

Audit trails: with Rails' dirty tracking it is easy to find which attributes have changed and store them in a way such that one can reconstruct the changes later on (who did what, when, on which model).

If you ask a user to change their password, then you need to know the old and the new password. In Rails, thanks to dirty tracking, that is just password_changed?.
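For contrast, a similar check can be done without dirty tracking by comparing the submitted plaintext against the stored digest at the boundary. A rough sketch, where Digest::SHA256 is only a stand-in for a real password hasher (bcrypt digests are salted, so there you would compare via BCrypt::Password#== instead):

```ruby
require "digest"

# Illustrative only: password_changed? without dirty tracking.
# Digest::SHA256 stands in for a real password hasher; with bcrypt's
# salted digests you would use BCrypt::Password.new(stored) == submitted.
def password_changed?(stored_digest, submitted_plaintext)
  stored_digest != Digest::SHA256.hexdigest(submitted_plaintext)
end

stored = Digest::SHA256.hexdigest("old-secret")
password_changed?(stored, "old-secret") # => false
password_changed?(stored, "new-secret") # => true
```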

accuser commented 8 years ago

Assuming that the repository is already going to be filtering out unchanged attributes when performing updates, this is an ideal place to perform audits or provide mechanisms to support this easily (after :update, :log_update anyone?)

pascalbetz commented 8 years ago

@accuser

How would the repository do this without an additional select? (I have to admit that this is required in Rails as well, because entities are persisted, not hashes.)

accuser commented 8 years ago

@pascalbetz good point, I misread the examples. :weary:

I was assuming that the parameters passed to update would contain only the changes, but they might also include unchanged attributes.

accuser commented 8 years ago

Before making any decision, it would be pertinent to decide what an entity actually represents. If an entity is immutable, it is, in effect, a representation of the current state of the persisted record. In order to effect any change, we need to modify the underlying record first and then reload the entity.

If, on the other hand, an entity is mutable, it represents the intended state of the record, i.e. changes made to the entity are then persisted.

There are benefits and drawbacks to both approaches. An immutable entity is clean, but I could just as well do with an immutable hash and some getters.

A mutable entity is useful, but then I run the risk of business logic creeping into the entity. From a framework developer's point of view, I can understand why this is undesirable. From a developer's point of view, I should know better. :wink:

accuser commented 8 years ago

Here is a classic User entity with a secure password:

class User
  include Hanami::Entity

  attributes :password_digest

  def password=(password)
    self.password_digest = BCrypt::Password.create(password)
  end

  def password
    BCrypt::Password.new(password_digest)
  end
end

This differs from Rails' has_secure_password in that the process-specific logic for authentication is removed. I would argue that the transformation of a plaintext password into a digest is entity-specific logic.

With an immutable entity, this becomes impossible. Therefore I have to encrypt the password outside of the entity and ensure that everywhere I want to authenticate the user, I use the same encryption method. It is entirely possible that the model takes care of this in the mapping (i.e. some helper class to marshal a password attribute), but this seems awkward, and slightly disingenuous.

Feel free to tell me I'm wrong.

Bounga commented 8 years ago

I don't understand how we can be aware of what changed with this solution. At least, I can't see any simple solution. Tracking attribute changes is a must-have. I often need it in my apps to log changes so I can reconstruct the history of objects.

How would you do this with the new proposal?

solnic commented 8 years ago

You want a changeset concept, like in Elixir's Ecto. You can achieve all the functionality that you want without having mutable entities.

The example with password encryption is a classic example of mixing responsibilities. Your users don't have passwords; they have encrypted password digests. Making them responsible for encryption clutters their interface. You only want to do the encryption when you persist the data. Externalizing that behavior is actually more reusable than keeping it in the entity, i.e. you can have a password data type that receives a string and returns a digest, and re-use it easily. Authentication should also be external to the entity, with the logic required to match a password encapsulated there, for the same reason: better reusability and a simpler entity interface.
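The changeset concept mentioned above can be sketched in a few lines of plain Ruby. Changeset is an illustrative name here, not an actual Ecto or Hanami API: it diffs validated input against the attributes currently persisted, so only real changes reach the database, with no mutable entity or dirty tracking involved:

```ruby
# Illustrative changeset sketch (not an Ecto or Hanami API): diff the
# validated input against the attributes loaded from the database, so
# only genuine changes are sent in the UPDATE statement.
class Changeset
  attr_reader :original, :changes

  def initialize(original, input)
    @original = original                              # Hash as loaded from the db
    @changes  = input.reject { |k, v| original[k] == v }
  end

  def changed?(key)
    changes.key?(key)
  end
end

current = { title: "Hanami", version: "v1.0" }
cs = Changeset.new(current, { title: "Hanami", version: "v1.1" })
cs.changes          # => { version: "v1.1" }
cs.changed?(:title) # => false
```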

pascalbetz commented 8 years ago

@Bounga

You can compare two entities to check for the diff. That is what Rails does, except there the "previous" and "current" state live in the same entity.

@solnic I agree. That would be a nice "side effect" of read-only. If they are read-only then they are also dumb; all business logic would then live in services or whatever. Having a custom type for passwords is interesting, but I'm not sure it would be easy to debug/understand (it looks like a string but behaves differently).

solnic commented 8 years ago

@Solnic I agree. That would be a nice "side-effect" of read only. If they are read-only then they are also dumb. All business logic would then live in services or whatever.

Yes, having pure data objects in your system as the boundaries is awesome. That's how I've been using ROM and it's a great approach.

Having a custom type for passwords is interesting, but I'm not sure it would be easy to debug/understand (it looks like a string but behaves differently)

Uhm, sorry, I didn't explain this properly. I meant a custom data type constructor: the primitive is still a string, but the constructor coerces the input into a digest. This is what I'm doing with dry-data types, and it's pretty awesome; it's easy to encode app-specific types and reuse them.
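In plain Ruby, that constructor idea might look like the sketch below (a dry-types/dry-data constructor type would play the same role; Digest::SHA256 is only a stand-in for a real, salted password hasher such as bcrypt):

```ruby
require "digest"

# Hypothetical "password digest" type constructor: a function that
# coerces a plaintext string into a digest at the persistence boundary.
# Digest::SHA256 stands in for a real hasher such as bcrypt.
PasswordDigest = lambda do |plaintext|
  Digest::SHA256.hexdigest(plaintext)
end

# The entity only ever carries the digest: no password= setter, no mutation.
input       = { name: "jane", password: "secret" }
persistable = { name: input[:name], password_digest: PasswordDigest.(input[:password]) }
persistable.key?(:password) # => false
```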

accuser commented 8 years ago

@solnic that makes sense, thanks for taking the time to explain the pure data approach. I guess there are still some railsisms to shake out of me... ;-)

class PasswordCoercer < Hanami::Model::Coercer
  def self.load(value)
    BCrypt::Password.new(value)
  end

  def self.dump(value)
    BCrypt::Password.create(value)
  end
end

I'll have one immutable entity to go, please. :+1:

hlegius commented 8 years ago

If they are read-only then they are also dumb. All business logic would then live in services or whatever. (..)

I think that's the most challenging part of all these proposals.

When @jodosha came up with his ideas, my first thought was Use Case driven design (Hexagonal Architecture, whatever). There, biz-rules live inside Interactors and Use Cases (a.k.a. Application Services), which deal with the requests (from Controllers, for example).

It would be a great improvement, but we need to carefully introduce all those concepts in a way where everything works well connected (the Application Layer with the Domain-Object Layer mapping to the Infrastructure Layer, etc.).

solnic commented 8 years ago

@hlegius what kind of concepts should be introduced? here it's a simple question: do you want to model your domain using mutable objects, or not? :)

hlegius commented 8 years ago

@solnic

what kind of concepts should be introduced?

Concepts like CQRS, and how they collaborate with each other and with external (inner/outer) layers. Those patterns and architectures should be well developed and connected within the framework; otherwise, people won't have a reason to adopt it.

here it's a simple question: do you want to model your domain using mutable objects, or not? :)

IMO, both mutability and immutability have their uses, pros and cons. I am not yet convinced that having a domain model that is a hundred percent immutable (functional programming, anyone?) would bring more pros than cons, or that it could represent the domain model with the same richness in an object-oriented language.

I've been following your projects for a while and saw how you've evolved them. That said, I'm sure that you have great points to defend immutability within domain-model.

For this proposal, keeping Entities immutable has two good reasons: simplicity and isolation from the persistence layer. As I write/describe immutable Entities, they look like DTOs to me.

My only concern is how we'll communicate and describe how to use it properly within the Hanami ecosystem.

So, I vote for :+1:

hlegius commented 8 years ago

@jodosha If you don't mind, I'd like to exercise a bit:

The idea is to make Entities immutable and have Repositories work with Hashes. So, if I need to update a record after some biz-rules have been applied, will I need to extract the diff between what I've fetched from the database (as an Entity) and what I've changed, like this:

# Changing Book release version
book = BookRepository.find(123456)
book = book.update_version('v1.9 2016-02-02')

# somehow
BookRepository.update(book.id, book.diff)

# Book class just to explain in detail
class Book
  def update_version(new_version)
    Book.new(@name, @title, @author, ..., new_version)
  end
end

# IMO, this is totally impractical from a framework perspective, but I needed to list this option as well

Or will I only use Entities as types and nothing else, like this:

# Update book version
BookRepository.update(params[:book][:id], params[:book][:version])

updated_book = BookRepository.find(params[:book][:id])

The first idea is like the FauxO style Gary Bernhardt explained in his talk years ago (impractical, as he himself described). The second is something more functional, with data and behaviour living separately.

So, going down the road of those concepts, could we maybe lose some Object-Oriented key points, like 'passing/responding to a message'?

jodosha commented 8 years ago

@accuser What about this?

class User
  include Hanami::Entity
  attributes :password_digest

  def initialize(attributes = {})
    super attributes.merge(
      password_digest: BCrypt::Password.create(attributes[:password])
    )
  end

  def password
    BCrypt::Password.new(password_digest)
  end
end

The example above is NOT about SRP or which component should provide encryption/auth. It's just to show that it's possible to avoid mutations, even when we don't recognize this at first glance.

jodosha commented 8 years ago

@hlegius Why would you go for the first example when the second works better? What I would do is:

class Update
  include Hanami::Action

  # PATCH /books/1
  def call(params)
    book = BookRepository.new.update(params[:id], params[:book])
    redirect_to routes.book_url(id: book.permalink)
  end
end

It accepts the identifier and the data and returns a new Book instance with the updated data.


The example above is a "coarse grained" example where we're able to update all the data. Let's imagine how we can just update the version:

class BookRepository
  include Hanami::Repository

  def update_version(id, version)
    update(id, version: version)
  end
end

Similar to the one above: that method accepts an identifier and data, and returns a Book instance with the updated version.


@hlegius Is this interface practical for you?

accuser commented 8 years ago

@jodosha I actually prefer the PasswordCoercer above (I might have got the base class wrong though) as the concern is completely removed from the entity, and is reusable.

I am aware that BCrypt::Password.new has a very real cost associated with it, and the cumulative impact of coercing a password attribute in this way should be mitigated. I realise that your suggested approach does this, but then the entity contains code that it is not really concerned with.

This did bring up three other questions for me though:

1) When retrieving an entity, can I control the fields / attributes that are included? For example, I only need the password attribute for authentication; can I ignore it otherwise? Mongoid does this very well (or at least, in my opinion!)

2) Would an identity map be useful? If the record becomes the reference point as opposed to the model (insofar as the model is a readonly mapping of the underlying data) would this help to mitigate the double loading issue? i.e. to see what I am about to change I need to load the data, change it, and then load it again.

3) If the entity is immutable, all attributes, etc. are immutable. Could Hamster be useful here?

hlegius commented 8 years ago

@jodosha this approach drives us towards something more functional. There is no Object concept there, just function calls over some data. I am afraid of losing OOD 'power' with immutability.

Here is what I am thinking about for a while and would like to share these thoughts with you:

1 - Mutability depends on the Repository: we can't detach biz-logic from the repository because all entities are immutable. All mutable actions become repository methods, so data and behavior will live apart while still being coupled to each other (Service Objects, anyone?)

Let's consider this small example:

class Book
  attr_reader :releases

  def initialize(title)
    @title    = title
    @releases = [] # initialize the collection so #attach_new_version works
  end

  def release_new_version(version)
    can_release?(version)
    attach_new_version(BookRelease.new(self, version))
  end

  def out_of_catalog
    # Farewell book!
    @status = :out_of_catalog
  end

  def out_of_catalog?
    # ...
  end

  private
  def attach_new_version(release)
    @releases << release
  end

  def can_release?(version)
    fail ArgumentError, 'You cannot release a new version...' if current_version > version
    fail RuntimeError, 'This book is out of catalog' if out_of_catalog?
  end
end

class BookRelease
  def initialize(book, version)
    @book = book
    @version = version
  end
end

This silly example shows a bit of the problem I've seen. Here we have an Aggregate Root, a Value Object, and an Entity collaborating and encapsulating their own concepts, data, and behavior without even talking about other layers. I can't see the same with immutable stuff.

We can split them into Service Objects, Interactors and so on, and maybe it can work nicely, but it'll be data and functions separated from each other, and hard to keep the biz-rules apart from the rest of the world (web adapter, input validations, service integration, persistence, etc.).

Don't get me wrong: I totally agree with not coupling the database to Entities. All those class methods (has_many, has_one, belongs_to) with lazy loading are a real problem: they couple the domain and persistence layers and introduce lots of unnecessary complexity, IMO.

I believe that if we introduce an Identity Map for Entities, letting them be mutable within an Identity Map/Unit of Work, we can perform a 'diff' between what we have and what got modified in order to persist it. Associations we can keep with that .preload idea, all via eager loading: book_repository.preload(:author).find(1), or via Query/Command objects, helping make implicit concepts explicit.
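As a rough sketch of that identity-map idea (illustrative names only, not a Hanami API), the map could snapshot each record's attributes as loaded and diff them later:

```ruby
# Illustrative identity-map/snapshot sketch (not a Hanami API): remember
# each record's attributes as loaded, so a later diff can tell what
# changed without entities tracking their own state.
class IdentityMap
  def initialize
    @snapshots = {}
  end

  # Store a frozen snapshot of the attributes as loaded from the db.
  def record(id, attributes)
    @snapshots[id] = attributes.dup.freeze
    attributes
  end

  # Return only the attributes that differ from the loaded snapshot.
  def diff(id, attributes)
    original = @snapshots.fetch(id, {})
    attributes.reject { |k, v| original[k] == v }
  end
end

map = IdentityMap.new
map.record(1, { title: "Hanami", status: :in_catalog })
map.diff(1, { title: "Hanami", status: :out_of_catalog })
# => { status: :out_of_catalog }
```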

2 - We're trying to simplify our life (as Hanami devs), but even if only a bit, this simplicity could complicate the lives of Hanami users by leaving them to deal with complex architectural decisions, while at the same time blocking other architectural decisions, like Bounded Contexts, namespace segregation, etc.

This approach reminds me of CQRS.

Fowler's quote:

In particular CQRS should only be used on specific portions of a system (a BoundedContext in DDD lingo) and not the system as a whole (.. ) So far I see benefits in two directions. Firstly is that a few complex domains may be easier to tackle by using CQRS. I must stress, however, that such suitability for CQRS is very much the minority case.

The point I'd like to leave here is: should Hanami just wrap some of Sequel's abstractions, or solve some of the complexity between the domain/infra layers through a well-defined API?

I am afraid that maybe we're oversimplifying the client's software complexity, forcing them to follow a defined structure in order to simplify Hanami::Model's implementation.

solnic commented 8 years ago

@jodosha this approach drives us to something more functional. There is no Object concept there. Just function calls over some data. I am afraid in lost OOD 'power' with immutability.

"Object concept" can be anything, an object that doesn't mutate is still an object, an object that performs some work based on data it received is still an object. It is more functional but it's still OO. I can also guarantee you that with this type of objects encapsulation and reusability is actually better than with the "classic" approach of mixing data and behavior. I've been doing this for a couple of years already and it's much better than anything else I tried before.

I believe if we can introduce Identity Map to Entities, letting them be mutable and within an Identity Map/Unit of Work level, we can perform a 'diff' between that we have and what got modified to persist them

It's way easier to load data into memory from the db and diff it against input that has already been validated and converted to a persistable structure. Why would you load data into objects, allow them to mutate, and then do the extremely hard and brittle work of state-tracking, when it really boils down to loading a bunch of hashes and comparing them with what you received?

Let's consider this small example: (...)

A couple of questions:

1) Why would you allow a book to change its status? What if something changes its status unexpectedly?
2) Why would you allow a book to change its own releases? What if something adds the same release twice, unexpectedly?
3) Why does a book know about its own releases, anyway?
4) Why would you even create this abstraction when the only thing you care about is validating and then persisting the data?

By making this object mutable you add the possibility of bugs that you wouldn't have if it were a simple data structure without mutable state, with something else using it to make business decisions (including validation, of course). Also notice that this simple struct could expose additional interface to encapsulate certain logic (i.e. Book#valid_version?(version), if that's what you need in your system).
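A minimal sketch of that last point (illustrative only, not the Hanami::Entity API): a frozen value-style Book can still answer domain questions such as valid_version? without any mutable state:

```ruby
# Illustrative frozen value object (not the Hanami::Entity API): an
# immutable Book can still encapsulate domain queries.
Book = Struct.new(:title, :current_version, :status, keyword_init: true) do
  def out_of_catalog?
    status == :out_of_catalog
  end

  # A new version is acceptable only for in-catalog books and only if it
  # sorts after the current one (plain string comparison, for brevity).
  def valid_version?(version)
    !out_of_catalog? && version > current_version
  end
end

book = Book.new(title: "Hanami", current_version: "v1.0", status: :in_catalog).freeze
book.valid_version?("v1.1") # => true
book.valid_version?("v0.9") # => false
```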

hlegius commented 8 years ago

@solnic

I can also guarantee you that with this type of objects encapsulation and reusability is actually better than with the "classic" approach of mixing data and behaviour together. I've been doing this for a couple of years already and it's much better than anything else I tried before.

It's hard for me to see more pros than cons with this approach in a language like Ruby. I was wondering how I can describe a concept/idea while keeping data + behaviour together. This approach reminds me of old Service Objects: class methods that receive scalar structures, perform something, and return something else, instead of passing messages between objects and their collaborators.

When dealing with OO, I really miss the "here is the concept called Car": there you can find what it can do, who it collaborates with, and also its data.

Do you mind giving an example of how you usually achieve encapsulation/reusability doing this?

Why would you load data into objects, allow them to mutate and then do the extremely hard and brittle work of state-tracking when it really boils down to loading a bunch of hashes and comparing them with what you received?

To encapsulate all this state-tracking within the framework, keeping the Client (the framework's user) detached from and unaware of this complexity, freeing them to spend their time thinking about how to solve their biz-rules problems. I expect to find in the 'Book' class everything about how any Book behaves in the application, what messages it exchanges with the outside world, and what inner parts it has (like Release as a composed object).

1) Why would you allow a book to change its status? What if something changes its status unexpectedly?

There should be a contract between Book and its collaborators. If something changes its status, it does so through Book's instance methods, by sending the correct messages to it (duck typing).

2) Why would you allow a book to change its own releases? What if something adds the same release twice, unexpectedly?

In that example, I assumed that Book is the root of its own Aggregate. Release is like a composition: the entire Release object lifecycle depends on the Book object lifecycle. If I destroy a Book, all Release objects are destroyed as well. Release is a Value Object: immutable. All biz-rules around how a new Release gets created rely on the Book and/or Release object.

3) Why does a book know about its own releases, anyway?

In that example I was assuming that Books and Releases form an aggregate. It will vary, of course, based on what the application is required to do.

4) Why would you even create this abstraction when the only thing you care about is validating and then persisting the data?

Actually, I described that example to illustrate an idea where object-oriented principles (composition, aggregation, association) are applied to objects, and how objects can deal with their collaborators' lifecycles, etc. Business-rule validation is just a detail there, to give more context to what I was exemplifying. Data validation should be done in an outer layer (the Application Layer, maybe).

To summarize my concern: I wouldn't like to lose the abstraction and all the Object-Oriented meaning/designing/sketching. If there is a way to keep it even with immutability, then for me everything can be nice and smooth :+1:

By making this object mutable you add the possibility of having bugs that you wouldn't have if this was a simple data structure w/o mutable state and something else would use it to make business decisions (includes validation of course)

That's the root of all my concerns: "something else would use it to make business decisions". How would something else use an immutable object to perform some biz-decisions? Will I need Use Case Objects that perform my biz-rules, using my Entity as a Value Object? In this approach, I can't see any collaboration between objects, just scalar/hash values going in and out through stateless classes.

(..) encapsulate certain logic (ie. Book#valid_version?(version) if that's what you need to have in your system).

Where would I place non-boolean/complex logic? Will I need to create a CreateNewBookVersion Service Object that receives a Book object, does something with it, and calls the Book or Release Repository to persist the data? These are the ideas I am missing with immutability, and I consider them core to any object-oriented software design.

I see it as a very specific solution being applied in a general-purpose framework.

I do not want to be rude (actually, I am trying to understand your point and arguing where I think it's appropriate). It seems like Aspect-Oriented Programming: it has its specific uses, but it is not core framework architecture. I see it as: it's tough to deal with a Data Mapper, so let's be simplistic and save our time. As a downside, users will be forced to follow those ideas even if, for their needs, doing so is more harmful than not. Immutability/mutability should be a client design decision (in an OO language, at least).

Btw, there is a long and great discussion about it on SO.

jodosha commented 8 years ago

@hlegius

I am afraid that maybe we're simplifying too much the client's software complexity, forcing them to follow a defined structure to simplify Hanami Model's implementation.

This isn't our goal. I started this discussion because we can improve software quality and development speed with these changes. Of course, this does NOT mean we're going to implement it; we're still evaluating this decision.

pascalbetz commented 8 years ago

It should not be about making the framework simpler, but the code that uses the framework.

jodosha commented 8 years ago

Okay, to be clear, when I say the quality of software, I'm talking about projects using Hanami, not Hanami::Model internals :wink:

Nerian commented 8 years ago

I wonder, are these two options mutually exclusive? I mean, can we implement both strategies in Hanami and let the user decide which one to use on a case-by-case basis?

I am thinking that the user could simply include one module or another, and that sets the entity as mutable or immutable.

This way the developer makes the decision for their specific project, which they know much better than we do, and thereby can choose what level of abstraction they want to adopt.

Thoughts?

jodosha commented 8 years ago

@Nerian That could be another option. In that case, I'd introduce Hanami::Value and leave entity as it is.

pascalbetz commented 8 years ago

@Nerian I think that would be confusing, as these alternatives are really different. Every code sample, tutorial, test, and forum answer would need to specify which type of entity it uses. Besides, maintaining those two different approaches could be complicated.

accuser commented 8 years ago

@pascalbetz :+1:

Hanami::Model is optional: I can remove it as a dependency, and replace it with any number of alternatives. This is a Good Thing. As a developer using Rails, I had abandoned ActiveRecord in favour of Mongoid. I was able to do this because ActiveRecord was optional, and the Rails developers had gone out of their way to ensure this was the case. The Hanami developers have an opportunity to provide an alternative approach, and if the argument for its use is cogent, it will be used.

solnic commented 8 years ago

I believe it's a decision that will have serious consequences with regard to the development effort that will go into Hanami::Model. Managing the mutable state of entities is going to complicate Hanami::Model significantly, and it's going to take you literally years to get it right. The worst part is that you will never get it 100% right for all possible use cases, so simpler, lower-level interfaces will always have to be exposed to make it usable in less "standard" scenarios, which defeats the purpose (for me, at least) of such an effort. That's why I took a different approach with ROM; no need to reinvent a wheel that doesn't roll so well ;) I recommend looking into the source code of Data Mappers in other languages, like Hibernate, SQLAlchemy, or Doctrine. These are gigantic frameworks...