michaellperry / jinaga

Universal web back-end, offering an application-agnostic API, real-time collaboration, and conflict resolution.
http://jinaga.com
MIT License

What is wrong with the alternatives? devil's advocate #14

Closed leblancmeneses closed 9 years ago

leblancmeneses commented 9 years ago

I find this project interesting; however, what is wrong with the alternatives? I am finding it hard to adopt when at least 80% of the problem domain is already solved by our current system.

For "historical modeling", I can track all property changes on all my entities and store them during SaveChanges as a changeset. I have history, so in theory I could build j.watch to rewind time and replay. Granted, this approach relies on anyone writing to the SQL tables using my EF context with the RevisionHistory behavior attached. If I provide a REST endpoint, that should be a safe assumption. Another approach is to use journaling in queues.
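The audit approach described above can be reduced to a few lines. A minimal sketch, assuming a diff of old vs. new state recorded on save (all names here are hypothetical, not Entity Framework's actual API):

```javascript
// Minimal sketch of recording a changeset on save (hypothetical names):
// diff the entity's old and new state, and store who changed what, and when.
function changeset(before, after, user, at) {
  const changes = {};
  for (const key of Object.keys(after)) {
    if (before[key] !== after[key]) {
      changes[key] = { from: before[key], to: after[key] };
    }
  }
  return { user, at, changes };
}

// One property changed; the changeset captures only that transition.
const cs = changeset(
  { status: "open", owner: "ann" },
  { status: "closed", owner: "ann" },
  "leblanc",
  "2015-06-01T00:00:00Z"
);
// cs.changes → { status: { from: "open", to: "closed" } }
```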

For "conflict detection", I can implement optimistic concurrency with SQL Server rowversion. First one wins. If the user receives a 409 Conflict, then the client app retrieves the most current version and helps the user merge changes.
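First-writer-wins with a version check can be simulated in a few lines. A minimal in-memory sketch (in the real system the version check would be SQL Server's rowversion and the 409 would come from the REST layer; the names here are hypothetical):

```javascript
// In-memory sketch of first-writer-wins optimistic concurrency.
// An update succeeds only if the caller read the current version.
function makeStore(initial) {
  let row = { value: initial, version: 1 };
  return {
    read: () => ({ ...row }),
    update: (value, expectedVersion) => {
      if (expectedVersion !== row.version) {
        return { status: 409 };           // conflict: someone else wrote first
      }
      row = { value, version: row.version + 1 };
      return { status: 200, version: row.version };
    }
  };
}

// Two clients read the same version; only the first write wins.
const store = makeStore("draft");
const a = store.read();
const b = store.read();
const first = store.update("a's edit", a.version);   // status 200
const second = store.update("b's edit", b.version);  // status 409: b must re-read and merge
```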

Intention-capturing messages are accomplished with ModifiedBy and CreatedBy. The user or automated system has a subject claim. I have found that knowing the user and the resource is enough to know the intent.

On a day-to-day basis, REST services give me single-entry CRUD capability. With one action filter I can notify all connected clients via SignalR of my changes for any resource. (This provides the live query for the current active session.)
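The single-action-filter broadcast can be pictured as a plain publish/subscribe hub. This is a concept sketch only: in the real system the hub would be SignalR, and these names are hypothetical.

```javascript
// Concept sketch of "one action filter notifies all connected clients":
// every successful write funnels through notify(), which fans out the
// change to each connected client. In production this hub is SignalR.
function makeHub() {
  const clients = [];
  return {
    connect(onChange) { clients.push(onChange); },
    notify(resource, change) { clients.forEach(fn => fn(resource, change)); }
  };
}

const hub = makeHub();
const seen = [];
hub.connect((resource, change) => seen.push(`${resource} ${change}`));
hub.connect((resource, change) => seen.push(`${resource} ${change} (client 2)`));
hub.notify("orders/42", "updated");   // both clients observe the change
```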

The wins I see in the framework

I really like j.watch and can see myself using it to rebuild materialized views. I also like queues for disconnected clients and offline data storage. Today we use IndexedDB caching for specific endpoint GET requests.

I am wondering if I can retrofit my current system to finish the remaining 20%.

michaellperry commented 9 years ago

I knew I could count on you to raise these questions.

The short answer is that historical modeling is a set of primitives with which you can construct correct distributed systems. They are axioms from which you can create theorems.

The other things that you mention, on the other hand, are mechanisms with which you can build distributed systems. Building a machine is very different from constructing a theorem. You never know when you've covered all of the cases. And when you find a new hole, you have to add a new mechanism.

You've already seen all the primitives that are in this system: facts and queries. From these you can construct tables, queues, properties, conflict detection and resolution, services, etc. You don't need a separate mechanism for each.
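To make that claim concrete, here is a hedged sketch of the two primitives in plain JavaScript. The shapes are illustrative only, not Jinaga's actual API or wire format; it shows a queue built from nothing but facts and a successor query.

```javascript
// Illustrative sketch (not Jinaga's actual API): a fact is an immutable
// record that names its predecessors; a query walks those relationships.
const facts = [];
function fact(type, fields, predecessors = {}) {
  const f = Object.freeze({ type, ...fields, ...predecessors });
  facts.push(f);
  return f;
}
// Find successors of a fact: facts of a type whose role points back at it.
function successors(predecessor, type, role) {
  return facts.filter(f => f.type === type && f[role] === predecessor);
}

// A "queue" built from these primitives alone: tasks succeed a list,
// completions succeed tasks; pending work is a query, not a table.
const list = fact("TaskList", { name: "chores" });
const mow  = fact("Task", { text: "mow lawn" }, { list });
const shop = fact("Task", { text: "buy groceries" }, { list });
fact("Completion", {}, { task: mow });

const pending = successors(list, "Task", "list")
  .filter(t => successors(t, "Completion", "task").length === 0);
// pending → [shop], the only task with no Completion successor
```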

I'll spend some time going through some of the holes that I've found in the mechanisms you list.

michaellperry commented 9 years ago

And now for the first of the long answers.

"For 'historical modeling', I can track all property changes on all my entities and store them as part of SaveChanges as a changeset."

Historical Modeling is much more than auditing. It is the practice of modeling a system as a history of partially ordered historical facts. Auditing, on the other hand, is the practice of augmenting a state-based model with a sequential history of state changes. Historical Modeling is a fundamental shift in the way that you construct a model.

The advantages of historical modeling over simple auditing are:

Central authority

With an audited data store, you have to have one centralized authority. All changes to state are directed to this authority. To find out what happened, you query this authority. If you've ever used the term "system of record", you are implicitly talking about a central authority.

With a historical model, however, authority is distributed throughout the system. A fact is part of history from the moment that it is recorded, even if the node is disconnected at the time. You can find out what happened from any node in the system. They are all eventually consistent with one another.

Transparency

With an audited data store, you don't have full transparency of the state changes. If you are careful, you will capture who made the change, when, and what the previous state was. But even then you might miss important information about context. You might overwrite or delete contextual information in the state-based model that you don't capture in the audit.

With a historical model, every fact records user intent. Facts are the only way to change the state of the system. Therefore, if the system depends upon a piece of information, it is in the facts. There is no way to miss important information. Facts cannot be overwritten or deleted, so context is preserved.

Compare this with Event Sourcing, which has the same benefit. Then read on...

Partial order

With an audited data store, the audit log is fully ordered. You can tell precisely which change happened before which other change. While this sounds like a great benefit, it is actually a promise that is difficult to keep. In a distributed system without a centralized authority, not all changes can be ordered. Some will occur concurrently (that is, without knowledge of one another). If the audit log imposes a total order on changes, then it fails to document when two changes are concurrent.

With a historical model, the relationships between facts are explicitly captured. A fact is known to follow its predecessors. But not all facts are related to each other. When the relationship does not matter or is not expressed, then facts are allowed to be interpreted in either order. This makes it possible to see when two facts are concurrent. This is important for conflict detection. But more importantly, it reduces the constraints on the distributed system to a set that can be easily implemented.

Again, compare with Event Sourcing. The commonly accepted definitions of Event Sourcing imply a total order. Historical Modeling has the advantage of partial order, making it work better in distributed, occasionally-connected systems.
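A partial order makes concurrency mechanically detectable. A small sketch under assumed shapes (each fact lists its predecessors; two facts are concurrent when neither appears in the other's ancestry):

```javascript
// Sketch of concurrency detection in a partial order (hypothetical shapes).
// A fact lists its predecessors in `prior`; two facts conflict when
// neither follows the other.
function ancestors(fact, acc = new Set()) {
  for (const p of fact.prior || []) {
    if (!acc.has(p)) { acc.add(p); ancestors(p, acc); }
  }
  return acc;
}
function concurrent(a, b) {
  return !ancestors(a).has(b) && !ancestors(b).has(a);
}

// Two edits of a name, each made without knowledge of the other:
const created = { value: "original", prior: [] };
const editA = { value: "Alice's edit", prior: [created] };
const editB = { value: "Bob's edit", prior: [created] };
// editA and editB are concurrent: a conflict the UI can surface.
// A later merge fact that names both as predecessors resolves it:
const merged = { value: "merged", prior: [editA, editB] };
```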

michaellperry commented 9 years ago

Continuing on with the long-winded answers to the question, what's wrong with the alternative of optimistic concurrency?

"First one wins. If the user receives 409 Conflict then the client app retrieves the most current version and helps the user merge changes."

This strategy assumes that you can get to the server while the user is still present and in the mindset to resolve the conflict. If they are disconnected for a short time, the change they made will be queued and retried. By then the user might have moved on. You're dragging them back to a change they made earlier, one that they believed had already taken effect. Do this often enough, and users lose confidence that anything they've done actually took.

Better to honor the user's decision. Capture the information that they had available when making that decision. Those are the predecessors of the fact. Then when you go back and examine history, you can clearly see what happened. Armed with complete information, you can interpret the history of decisions correctly. You never throw user input on the floor.
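"Never throw user input on the floor" can itself be expressed as a query: the current value of a property is every edit that no later edit names as a predecessor. A hedged sketch with hypothetical shapes:

```javascript
// Sketch of "never throw user input on the floor": the current value of a
// property is every edit that no later edit supersedes. Concurrent edits
// yield two candidates; recording both lets the user merge later.
function candidates(edits) {
  const superseded = new Set(edits.flatMap(e => e.prior || []));
  return edits.filter(e => !superseded.has(e));
}

const created = { value: "original", prior: [] };
const editA = { value: "Alice's edit", prior: [created] };
const editB = { value: "Bob's edit", prior: [created] };
// Both inputs survive: candidates are [editA, editB], a visible conflict.
// A resolution fact then supersedes both:
const merged = { value: "merged", prior: [editA, editB] };
```

Nothing is overwritten; the conflict is visible as two candidate values until a merge fact, recorded with both edits as predecessors, supersedes them.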

I guess that one wasn't so long-winded.

michaellperry commented 9 years ago

I've captured this thread in the documentation.