Closed michielbdejong closed 5 years ago
we need a way to ensure that depending code has a way to do a sequence of ResourceStore#getRepresentation + ResourceStore#setRepresentation
To be precise, we need a way to ensure that we can support conditional HTTP requests.
Doing such a sequence is one possible solution (but not necessarily the most efficient one, as discussed in the store atomicity document).
It seems that the first paragraph talks about thread-safety of each of the 5 methods, and the second paragraph talks about locks for sequences of calls to these methods. Aren't those two different things?
They are both about event-loop and thread safety actually. Even single-threaded Node.js apps do not guarantee order of execution in presence of asynchronous operations. Locks are one mechanism to provide event loop / thread safety.
The first paragraph is about the internal implementation of the methods: are they atomic/thread-safe themselves? The second paragraph talks about the atomicity of callers of these methods.
An implementation where the methods are not thread-safe sounds useless to me.
They are useful if you have an independent mechanism to take any non-atomic implementation and make it atomic, which is what LockingResourceStore
does.
So if a class cannot easily provide atomicity itself, it shouldn't bother. Another component (that has already been tested for this) will do it for them. But if classes can provide atomicity themselves, they can indicate that they do not need another atomicity mechanism. It's a win-win.
For instance, if you do getRepresentation while in another thread someone is changing the resource body from "Hola mundo" to "Hallo wereld", you could end up reading a "torn" body like "Hola mwereld".
So that will never happen if Operation
(which calls the store) requires AtomicResourceStore
instead of just ResourceStore
. Which is why it's a good idea to have this tagging interface.
That is talking about making AtomicResourceStore instances (that are already thread-safe for individual CRUD operations) into transaction-aware resource stores
No, it's taking a non-thread-safe store and making it thread-safe.
Note that the difference between calling ResourceStore#getResourceTransaction
and calling ResourceStore#locks.acquire
is just syntactic sugar. The difference between the two proposals is in releasing the lock.
The downside of using locks is that the depending code will probably expect them to be exclusive, and to ensure that until it releases the lock, it has control over the resource, i.e. its operations will not fail due to conflict and other threads will be unable to write to the resource.
A lot has been written about locking and optimistic locking, and I think the best choice here would be an optimistic lock. Imagine thousands of operations per second doing conditional PUTs on the same resource. They would all queue up to briefly obtain the lock, check their condition, and release the lock again. Even if all conditions return false, the requests would queue up. Optimistic locking feels like an odd fit here - in fact I Googled it and found a blog post that says exactly that:
Typically distributed systems are stateless, thus the pessimistic solution does not really fit. One reason for this is that most distributed systems use REST or at least HTTP for communication – which itself is already stateless. Therefore the optimistic concurrency control is the preferred approach. (https://www.novatec-gmbh.de/en/blog/managing-concurrency-in-a-distributed-restful-environment-with-spring-boot-and-angular2/)
Every ResourceStore implementation must be transaction-aware (or at least lock-aware), which is not the case for back-ends such as files.
On page 2, I think that's a nice example of where the 'everything is a ResourceStore' philosophy breaks down a little bit. If many of the components in our code get the label 'ResourceStore' on them, and some components have transaction awareness as an essential feature of their function within the code, while others do not, then that interface becomes a straitjacket. We could say that an Operation requires not just any ResourceStore, but specifically on a TransactionAwareResourceStore, to do its job.
Note that the difference between calling
ResourceStore#getResourceTransaction
and callingResourceStore#locks.acquire
is just syntactic sugar.
It's not: they serve entirely different purposes.
getResourceTransaction
is exposed externally, is designed to work across methods, and moves the locus of control for atomicity to the caller.
Note that there is strictly no ResourceStore#locks.acquire
; ResourceStore
does not expose a lock in that scenario. Rather, the exposed interface stays the same. A ResourceStore
implementation might be wrapped in a LockingResourceStore
, which is a generic component responsible for guaranteeing the atomicity within methods, for those stores that do not already have method atomicity themselves.
The downside of using locks is that the depending code will probably expect them to be exclusive
They need to be exclusive, yes.
I think the best choice here would be an optimistic lock
Agreed, but it also depends on how cheap or expensive locking is. (LockingResourceStore
allows for different ResourceLocker
implementations.)
Even if all conditions return false, the requests would queue up.
True; worst case, they can time out.
If many of the components in our code get the label 'ResourceStore' on them, and some components have transaction awareness as an essential feature of their function within the code, while others do not, then that interface becomes a straitjacket.
Hence AtomicResourceStore
.
We could say that an Operation requires not just any ResourceStore, but specifically on a TransactionAwareResourceStore, to do its job.
Indeed, as I wrote in https://github.com/RubenVerborgh/solid-server-architecture/issues/21#issuecomment-520730153.
I would vote for "Approach 1 to conditional requests: transactions", but with optimistic instead of pessimistic locking (so no need for a release
method). See also https://github.com/inrupt/wac-ldp/issues/46
Note that with optimistic locking, there are two places where a conflict might be detected: at "3. check the conditions" and at "4. if the conditions are satisfied, call the modification method", if the resource was written to since the optimistic lock was acquired. See also the 'check and set' section in https://redis.io/topics/transactions for an example of how Redis supports this nicely.
no need for arelease
method
I guess that even with optimistic watches instead of pessimistic locks, for for instance in-memory watches in a Data Access Object in front of a filesystem, it would be useful to have release
so that the list of watches doesn't grow as the server keeps running.
Having said that, if you want pessimistic locking, then we'll build it! :)
it would be useful to have
release
You need it. At some point, you have to indicate that you are done with a lock, optimistic or not.
I would vote for "Approach 1 to conditional requests: transactions", but with optimistic instead of pessimistic locking
Could you argue your choice, and specifically counter the severity of the requirements and caveats listed on https://rubenverborgh.github.io/solid-server-architecture/store-atomicity.pdf#page=2 ?
Some of those seem prohibitive.
if you want pessimistic locking
I don't. But unfortunately, Approach 1 is the only one that does not support optimistic locking, because it only lets you call createTransaction
.
And sure, we could work around that. But that makes it a different (and likely more complex) approach. And the complexity stems from the fact that another component (namely the caller) takes control over the locking mechanism and atomicity.
Note that with optimistic locking, there are two places where a conflict might be detected: at "3. check the conditions" and at "4.
Exactly, so this means that, with Approach 1, getRepresentation
would need to be called twice, so it becomes even less efficient.
Sorry, hadn't read your Slack remark yet when I wrote the above.
Preliminary conclusion: there seem to be fewest caveats with the third approach.
That works for me. Especially since it allows us to keep the whole ETags thing close together and have one instead of two places where conflict can be detected.
Doing it this way is also a nice fit with how updateResource(ResourceIdentifier, Patch)
is already inside the ResourceStore.
Thanks for all feedback @michielbdejong, updated main diagram following.
Added in 0e1cc120c8ea0da754a7a649cef7aaa4f2f27273 to https://rubenverborgh.github.io/solid-server-architecture/solid-architecture-v1-2-0.pdf
As discussed, we need a way to ensure that depending code has a way to do a sequence of ResourceStore#getRepresentation + ResourceStore#setRepresentation without the underlying resource having changed in the meantime. There are two ways to do this:
getResourceTransaction
The way this would work is:
ResourceLocker
As described in https://rubenverborgh.github.io/solid-server-architecture/store-atomicity.pdf Questions about this (from Michiel to Ruben):
It seems that the first paragraph talks about thread-safety of each of the 5 methods, and the second paragraph talks about locks for sequences of calls to these methods. Aren't those two different things?
An implementation where the methods are not thread-safe sounds useless to me. For instance, if you do getRepresentation while in another thread someone is changing the resource body from "Hola mundo" to "Hallo wereld", you could end up reading a "torn" body like "Hola mwereld". So that's different from the problem we're trying to solve with transactions for sequences, and I think that thread safety is such a basic feature that we don't need to include NonAtomicResourceStore in our model. We can just call that BuggyResourceStore. :)
That is talking about making AtomicResourceStore instances (that are already thread-safe for individual CRUD operations) into transaction-aware resource stores by adding locks that can act across sequences of such CRUD operations.