dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License

Documentation issues ("ADO.NET Persistence Rationale") #4771

Closed IliyaNovikov closed 3 years ago

IliyaNovikov commented 6 years ago

Please revise this piece as it is full of typos and hard to understand.

JillHeaden commented 6 years ago

@veikkoeeva I would like to take you up on your offer (#4625) to improve the ADO.NET documentation.

https://github.com/dotnet/orleans/blob/gh-pages/src/Documentation/Core-Features/Grain-Persistence.md#adonet-persistence-rationale-

If you would please update the technical content, I will handle the typos.

Thank you!!

veikkoeeva commented 6 years ago

@JillHeaden I've been literally in the woods for the past week and more, back now. I should have time to check this soon.

veikkoeeva commented 6 years ago

Apologies for being sluggish on this. I've been on holiday, and for much of it off the grid, so to speak.

@IliyaNovikov Is there something specific you would like to see? One thing I could imagine is just creating code for the common tasks (e.g. membership, state storage, event sourcing) and then showing the configuration for each of the providers, ADO.NET being one among them.

The extra complication with ADO.NET is that it's not just "one provider": there are tens of ADO.NET vendors, each identified by a well-known invariant (these names are well known across ADO.NET), hence more configuration is needed. This extra configuration is:

  1. Download an ADO.NET connector library (as tabled in the current instructions).
  2. Configure Orleans to load this particular library by using the ADO.NET invariant supplied as a configuration parameter (as described in the instructions). (1)

While I see awkward sentences, repetition and some typos, these extra steps have usually been the source of most confusion. Maybe @JillHeaden has an opinion on whether explaining the reason for these extra steps could be useful for learning Orleans ADO.NET configuration (otherwise it's practically the same as for other providers). EF Core and ASP.NET seem to solve this by providing a named package with a named function. What these basically do is include the ADO.NET connector library and perhaps add some extra knobs for working with the underlying connector libraries. (2)

I'm not sure if it's helpful to copy the Orleans storage ADO.NET configuration options to the documentation instead of stating the information in some other way. What that and the rest of the documentation refers to is that ADO.NET has provider specific functionality which could be useful in some scenarios that require changing (de)serialization format of the data. It might make sense to remove these parts and add later when the features are well implemented. (3)

The documentation doesn't discuss deploying the database. It might be useful to discuss how one might deploy the initial database structures. There are multiple ways (f. ex.) and Orleans doesn't depend on any specific one. One way is just to open one's editor and deploy the scripts by hand.

It might be useful to mention database filegroups and database schemas (here or in some blog post).

It might be useful to add a deployment option straight to the configuration. The tests basically do this already, i.e. they instruct the database server to create a database of a certain size, deploy the database structures, run the tests and drop the database.

Anyhow, does this help you to work on the documentation, @JillHeaden?

Asides: (1): In .NET Full this loading is done by DbProviderFactories, but it was removed in .NET Core, so there's a small custom class that does practically the same thing. This has been reintroduced for .NET Core and may be included again in .NET Core 2.2.

(2): One thought that occurs is that Orleans too could introduce named packages per vendor. A better improvement could be to introduce a lambda function to the Orleans ADO.NET configuration that would be used to explicitly load a given package. The reason for this is that the ADO.NET invariant string is currently tied to the official packages, but there exist alternative ones, some considered to be better than the official packages (some examples for Oracle).

The refactoring would preserve the current interface and map the calls to a dictionary of lambdas. This touches upon https://github.com/dotnet/orleans/issues/4691 too.
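To make the "dictionary of lambdas" idea concrete, here is a minimal, language-neutral sketch written in Python. It is not Orleans code: `ProviderFactory`, `register_loader`, `get_factory` and the package name `Alternative.Oracle.Driver` are all hypothetical names chosen for illustration. Only the invariant strings are real, well-known ADO.NET invariants.

```python
from typing import Callable, Dict


class ProviderFactory:
    """Hypothetical stand-in for a loaded ADO.NET connector library."""

    def __init__(self, package: str) -> None:
        self.package = package


# Registry keyed by ADO.NET invariant. Each value is a zero-argument callable
# that "loads" the connector (here it simply constructs a stand-in object).
_loaders: Dict[str, Callable[[], ProviderFactory]] = {
    "System.Data.SqlClient": lambda: ProviderFactory("System.Data.SqlClient"),
    "Npgsql": lambda: ProviderFactory("Npgsql"),
}


def register_loader(invariant: str, loader: Callable[[], ProviderFactory]) -> None:
    """Add or override the loader used for a given invariant."""
    _loaders[invariant] = loader


def get_factory(invariant: str) -> ProviderFactory:
    """Keep the existing 'look up by invariant' interface, backed by the registry."""
    try:
        loader = _loaders[invariant]
    except KeyError:
        raise KeyError(f"No loader registered for invariant '{invariant}'") from None
    return loader()


# Usage: point an official invariant at an alternative (hypothetical) package.
register_loader(
    "Oracle.DataAccess.Client",
    lambda: ProviderFactory("Alternative.Oracle.Driver"),
)
```

Callers still ask for a connector by invariant, so the current interface is preserved, while an application can swap in a different package for any invariant before the first lookup.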

(3): For instance, for a state provider one might want to change from JSON to Parquet based on some information (e.g. {cluster ID, grain type, grain ID}). This was implemented as a "rough sketch" in Orleans 1.x, which didn't have the IoC there is today, and I'm not entirely sure all the facilities are there yet.

Orleans is quite tolerant of these issues, and one can alter the schema and queries. In principle this could even be done on the fly (though that is not yet implemented), say, to partition data across several tables or even to link to various storage systems such as Hadoop.

JillHeaden commented 6 years ago

@veikkoeeva First of all, here is a link to the document in the new structure: http://dotnet.github.io/orleans/Documentation/grains/grain_persistence/index.html#adonet-persistence-rationale-a-nameadonetpersistencerationalea

And here is the current wording after I made some grammatical changes...


ADO.NET Persistence Rationale

The principles for ADO.NET backed persistence storage are:

  1. Keep business critical data safe and accessible while the data, the format of data, and the code evolve.
  2. Take advantage of vendor-specific and storage-specific functionality.

In practice, this means adhering to ADO.NET implementation goals and adding implementation logic in the ADO.NET-specific storage provider that allows the shape of the data to evolve in the storage.

In addition to the usual storage provider capabilities, the ADO.NET provider has built-in capability to:

  1. Change storage data format (such as from JSON to binary) when roundtripping the state.
  2. Shape the type to be saved or read from the storage in arbitrary ways. This helps to evolve the version state.
  3. Stream data out of the database.

Both 1. and 2. can be applied on arbitrary decision parameters, such as grain ID, grain type, and payload data.

To do this, the developer chooses a format, such as Simple Binary Encoding (SBE), and implements IStorageDeserializer and IStorageSerializer.

The built-in (de)serializers have been built using this method. The OrleansStorageDefault(De)Serializer can be used as an example of how to implement other formats.

When the (de)serializers have been implemented, they need to be added to the StorageSerializationPicker property in AdoNetGrainStorage.

This is an implementation of IStorageSerializationPicker.

By default, StorageSerializationPicker will be used.

An example of changing data storage format or using (de)serializers can be seen here: RelationalStorageTests.

Currently there is no way to expose this to Orleans applications, as there is no way to access the framework-created AdoNetGrainStorage.
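The selection idea described above (choose a storage format per grain type or grain ID, and still be able to read older rows) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Orleans implementation: `Codec`, `SerializationPicker`, `write_state` and `read_state` are hypothetical names, and compressed JSON merely stands in for a binary format.

```python
import json
import zlib
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Codec:
    """A named serializer/deserializer pair (hypothetical type)."""
    tag: str
    serialize: Callable[[Any], bytes]
    deserialize: Callable[[bytes], Any]


JSON_CODEC = Codec(
    "json",
    lambda state: json.dumps(state).encode("utf-8"),
    lambda payload: json.loads(payload.decode("utf-8")),
)

# Stand-in for a binary format: zlib-compressed JSON.
BINARY_CODEC = Codec(
    "binary",
    lambda state: zlib.compress(json.dumps(state).encode("utf-8")),
    lambda payload: json.loads(zlib.decompress(payload).decode("utf-8")),
)

CODECS: Dict[str, Codec] = {c.tag: c for c in (JSON_CODEC, BINARY_CODEC)}

Rule = Tuple[Callable[[str, str], bool], Codec]


class SerializationPicker:
    """Picks a codec from decision parameters; the first matching rule wins."""

    def __init__(self, default: Codec) -> None:
        self._default = default
        self._rules: List[Rule] = []

    def add_rule(self, predicate: Callable[[str, str], bool], codec: Codec) -> None:
        self._rules.append((predicate, codec))

    def pick(self, grain_type: str, grain_id: str) -> Codec:
        for predicate, codec in self._rules:
            if predicate(grain_type, grain_id):
                return codec
        return self._default


def write_state(picker: SerializationPicker, grain_type: str, grain_id: str,
                state: Any) -> Tuple[str, bytes]:
    """Serialize with the picked codec and return (format tag, payload)."""
    codec = picker.pick(grain_type, grain_id)
    return codec.tag, codec.serialize(state)


def read_state(tag: str, payload: bytes) -> Any:
    """Deserialize using the format recorded with the row, not the current rules."""
    return CODECS[tag].deserialize(payload)
```

Because the format tag is stored alongside the payload, changing the picker's rules later does not break reads of rows written in the earlier format; a row is simply rewritten in the new format the next time its state is saved.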


At this time, given my very limited knowledge of the subject, I don't have an opinion about steps or how to proceed from here. I can be most helpful with grammar and formatting - the technical content is beyond my current skill level in this area.

JillHeaden commented 6 years ago

Also, is this page relevant to this issue? http://dotnet.github.io/orleans/Documentation/clusters_and_clients/configuration_guide/configuring_ADO.NET_providers.html

veikkoeeva commented 6 years ago

@JillHeaden Just back from a trip to Asia and WDBE2018, will check this week. :)

veikkoeeva commented 6 years ago

I understand the text and it has all that is needed to set up a database. In that sense the instructions are functional. The other link you refer to shows an example of how to configure not only an ADO.NET storage provider but also membership and reminders. As far as documentation goes, the problem is having the same information in two different places. What's better in the other link is that it shows an actual connection string. I could imagine it's better to show an actual connection string instead of a placeholder, but this varies by database. If this is a good approach, maybe use a SQL Server connection string as the example (and a LocalDb variant).

We could perhaps move the ADO.NET persistence rationale to another page, together with http://dotnet.github.io/orleans/Documentation/grains/grain_persistence/relational_storage.html. I'm not sure, but it's sort of advanced knowledge. It's also what @amccool asks at https://gitter.im/dotnet/orleans?at=5b9f708954587954f9b4b06e . Modifications are OK, even needed in some cases, and can even be made dynamically while the system is otherwise running. As noted, there's even been a plan to surface some more common configuration.

As a general note on storage providers, here are a few notes worth considering: 1) https://github.com/dotnet/orleans/issues/1998 2) The ADO.NET fix for this isn't perfect; @ReubenBond improved it. I can't find the issue now, but certain constructs won't work and it's a problem.

Summa summarum: The documentation is better now; the rest is just some random suggestions and notes piled on top of it. :)

kxxreemm commented 3 years ago

Thank you

ReubenBond commented 3 years ago

I don't think this is relevant anymore. Please open a new issue referencing this one if I am mistaken.