dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License
10.07k stars 2.03k forks source link

Storage providers have to throw an InconsistentStateException upon a write conflict #1609

Closed Eldar1205 closed 6 years ago

Eldar1205 commented 8 years ago

A grain's WriteStateAsyncleads to GrainStateStorageBridge.WriteStateAsync which reports any failure as OrleansException, when state conflicts should be reported as InconsistentStateException, according to documentation. Additionally, the AzureTableStorage doesn't throw InconsistentStateException on state conflicts, since it delegates to GrainStateTableDataManager.Write which doesn't throw that exception. This looks like a bug in the code, not in the documentation. Do you think it should be fixed, or keep it like that since existing applications may already rely on the exceptions thrown today?

lmagyar commented 8 years ago

As I see, the other problem is that InconsistentStateException is not handled by the framework at all. See also #1420

Eldar1205 commented 8 years ago

IMO the application should be able to response first before Orleans, but Orleans should respond if the application doesn't. Think of that as convention over configuration. The fact the documented exception isn't what's thrown from the (probably) most used built-in storage provider is a bug IMO and should be handled regardless of what to do with the situation.

sergeybykov commented 8 years ago

I think the desired behavior here is for storage providers that support eTags to throw an InconsistentStateException as per the documentation. Once such an exception is thrown from WriteStateAsync, grain code has a chance to catch and handle it. If grain code allows the exception to escape, silo should deactivate the activation and deliver the exception to the caller just like any other exception thrown from a grain method call.

shayhatsor commented 8 years ago

@sergeybykov, regarding your previous comment:

If grain code allows the exception to escape, silo should deactivate the activation and deliver the exception to the caller just like any other exception thrown from a grain method call.

i think that deactivation is a good idea only in special exceptions like InconsistentStateException. In other cases, this incurs a performance penalty of deactivation,activation and read of state from backing store.

gabikliot commented 8 years ago

@shayhatsor , yes, we were talking here about behavior in the case of InconsistentStateException.

sergeybykov commented 8 years ago

@shayhatsor @gabikliot Yes, this is only for InconsistentStateException.

shayhatsor commented 8 years ago

@sergeybykov, if I understand correctly, this issue can be closed and a new issue, labeled enhancement, should be opened for deactivating stateful grains when an InconsistentStateException is thrown.

Eldar1205 commented 8 years ago

Please don't close the issue, there's still the initial argument made when I opened the issue - the Azure table storage provider doesn't thrown InconsistentStateException when it should, and instead lets the StorageException propagate. Due to this I had to write exception handling code that doesn't correspond Orleans documentation, which is what this issue was originally about.

2016-06-15 23:49 GMT+03:00 Shay Hazor notifications@github.com:

@sergeybykov https://github.com/sergeybykov, if I understand correctly, this issue can be closed and a new issue, labeled enhancement, should be opened for deactivating stateful grains when an InconsistentStateException is thrown.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/dotnet/orleans/issues/1609#issuecomment-226315585, or mute the thread https://github.com/notifications/unsubscribe/AFBYt-KfRibBBTtx4uPM1uEHS8Q3udJaks5qMGVqgaJpZM4H3yB- .

shayhatsor commented 8 years ago

@Eldar1205, no problem, thanks for the explanation.

sergeybykov commented 8 years ago

I thought the same - this is a separate issue to make sure storage included providers throw InconsistentStateException. It is complementary to https://github.com/dotnet/orleans/issues/1420 which is about the runtime handling InconsistentStateException properly.

jason-bragg commented 8 years ago

I agree that the documentation and behaviors should match. I agree that Orleans specific exceptions should be thrown from storage providers to allow for consistent error handling.

However... The InconsistentStateException from the storage provider is useful, but is insufficient to protect grain developers from state corruption. They need to write error handling logic to prevent grain state corruption. We can't save them from this.

Consider a case where a grain call is made that modifies grain state then encounters an exception prior to the WriteStateAsync call? If the grain call just throws some exception (null reference?) the grain state will still be corrupted but Orleans won't kill the grain.

By documenting that the storage layer will throw a InconsistentStateException and the system will clean up the mess, we're giving the faults impression that grain developers don't have to think about grain state corruption, and that's just wrong. Grain developers must develop error handling logic to prevent grain state corruption. This fauls impression is illustrated in @Eldar1205's comment :

Due to this I had to write exception handling code that doesn't correspond Orleans documentation, which is what this issue was originally about.

Sure the documentation was wrong, but just as importantly, it gave @Eldar1205 a fauls impression of the complexities involved in developing stateful grains.

I am of the opinion that we should document a set of exceptions that storage providers should throw, but it should be the grain developers responsibility to determine what to do with these exceptions. Exceptions should be used to communicate errors, not to trigger framework behaviors.

On a related note... In general, InconsistentStateException is one error that maps roughly to etag violation, but for consistent error handling we'll need a more complete set of 'classes' of storage errors defined and documented as part of the api. Timeout? Item exists? Service not available? serialization/encoding errors? ..? We don't need to be exhaustive, and we should keep this as minimal as is practical, but some thought needs to be given to communicating errors to callers.

Eldar1205 commented 8 years ago

With the proper exceptions inheritence tree the more specalizied exceptions can be, e.g. the deeper in inheritnce tree, the more the grain developer can express error handling. With C# 6 an error code can be simpler due to catch-when clause, but polymorphysmic catch clauses and custom exception properties are still the best a library can offer. בתאריך 1 ביול' 2016 1:29,‏ "Jason Bragg" notifications@github.com כתב:

I agree that the documentation and behaviors should match. I agree that Orleans specific exceptions should be thrown from storage providers to allow for consistent error handling.

However... The InconsistentStateException from the storage provider is useful, but is insufficient to protect grain developers from state corruption. They need to write error handling logic to prevent grain state corruption. We can't save them from this.

Consider a case where a grain call is made that modifies grain state then encounters an exception prior to the WriteStateAsync call? If the grain call just throws some exception (null reference?) the grain state will still be corrupted but Orleans won't kill the grain.

By documenting that the storage layer will throw a InconsistentStateException and the system will clean up the mess, we're giving the faults impression that grain developers don't have to think about grain state corruption, and that's just wrong. Grain developers must develop error handling logic to prevent grain state corruption. This fauls impression is illustrated in @Eldar1205 https://github.com/Eldar1205's comment :

Due to this I had to write exception handling code that doesn't correspond Orleans documentation, which is what this issue was originally about.

Sure the documentation was wrong, but just as importantly, it gave @Eldar1205 https://github.com/Eldar1205 a fauls impression of the complexities involved in developing stateful grains.

I am of the opinion that we should document a set of exceptions that storage providers should throw, but it should be the grain developers responsibility to determine what to do with these exceptions. Exceptions should be used to communicate errors, not to trigger framework behaviors.

On a related note... In general, InconsistentStateException is one error that maps roughly to etag violation, but for consistent error handling we'll need a more complete set of 'classes' of storage errors defined and documented as part of the api. Timeout? Item exists? Service not available? serialization/encoding errors? ..? We don't need to be exhaustive, and we should keep this as minimal as is practical, but some thought needs to be given to communicating errors to callers.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dotnet/orleans/issues/1609#issuecomment-229807012, or mute the thread https://github.com/notifications/unsubscribe/AFBYtyWtNK8H4JohmvUnI3vicXseYMFcks5qRENegaJpZM4H3yB- .

jason-bragg commented 8 years ago

@Eldar1205, Agreed. When I suggested 'classes' of exceptions I was referring to classifications of errors, which specific storage providers could extend to provide more details.

For instance, a ServiceNotAvailable exception might be specialized by the azure storage provider by a StorageThrottledException, communicating that the service is not available because the callers are overloading it.

Code written to run against general storage providers would know only that the service is not available, but the information coming from the exception would provide more details, and code could be written to take advantage of the more specific error messages, if developers were fine with the dependency.

Eldar1205 commented 8 years ago

I assume most developers, like me, will use a single storage provider so having the option to depend on specialized storage exceptions extend the Orleans storage exceptions sounds great for extensability combined with simplicity and loose coupling.

gabikliot commented 8 years ago

@jason-bragg , I think you are confusing different things here.

The InconsistentStateException is indeed insufficient to protect grain developers from state corruption. And no one ever said it IS sufficient. State can be corrupted in multiple ways and we are NOT trying to solve here in this issue all those cases. However, there is one important inconsistency that is introduced by our runtime: multiple activations. Our strategy to deal with that is "eventually" to guarantee one activation. The faster the eventually is, the better. So throwing InconsistentStateException from the provider upon etag violation is merely our way to make the eventual be faster: the moment we understand we got 2 activation, we will deactivate one.

That does not have anything to do with all the other good reasons for which the developer has too reason about his state corruption. Its separation of concerns.

shayhatsor commented 8 years ago

I agree with @gabikliot's perspective. It could be defined by: InconsistentStateException ➡️ the grain is in an inconsistent state.

sergeybykov commented 8 years ago

I had a long discussion with @jason-bragg about it yesterday. I think I convinced him that the approach with auto-deactivation of grain that let InconsistentStateException escape from method calls has value regardless of his other legitimate concerns about managing grain state.

sergeybykov commented 7 years ago

@jason-bragg Made Azure Table storage provider adhere to the prescribed behavior. We need to update other storage providers similarly. In conjunction with #1420, we should achieve the behavior that should have been there all alone.

ReubenBond commented 6 years ago

The included storage providers (AdoNet, AzureTable/AzureBlob, Memory, and DynamoDB) all throw InconsistentStateException under at least some circumstance now and it's handled by the framework.

I'll close this and we can open it again if I've missed something.