Refactor documentation and add MySql config examples

shayhatsor commented 8 years ago

It seems like a good time to update the documentation to include MySql. While doing that, I want to introduce a new top-level section named Configuration. The current configuration documentation is scattered across sections and mixes theory, implementation details and configuration options. The new section should be the only one that contains sample XML and should provide an easy-to-follow enumeration of configuration options. @jthelin, @gabikliot, @sergeybykov, @veikkoeeva what do you think? PS - do you know of a good editor for md files?

jthelin commented 8 years ago

Sounds like a good plan to me.

I like MarkDownPad 2 for editing .md files, although it does not recognize all the GitHub Flavored Markdown. http://www.markdownpad.com/

gabikliot commented 8 years ago

Great idea. We indeed need to update the documentation. Having all Configuration in one place would help a lot.

veikkoeeva commented 8 years ago

@shayhatsor Makes perfect sense. I use VS Code to edit Markdown files these days as I have it installed anyway. MarkdownPad just crashes on my computer.

As a tangential note, I know there is also at least one link that used to point to a file in the repo, but currently doesn't point anywhere.

sergeybykov commented 8 years ago

:+1:

veikkoeeva commented 8 years ago

@shayhatsor There are inaccuracies at http://dotnet.github.io/orleans/Runtime-Implementation-Details/Relational-Storage. Only SQL Server is mentioned, some links do not point anywhere, and there are more tests than described (albeit using a SQL Server specific mdf file, as I understood).

With regard to https://github.com/dotnet/orleans/pull/1360#issuecomment-177473230, I wonder if a mention of those should be added or an issue recorded. I also wonder if I should add a few lines on design rationale, especially why load scripts from the DB and what it would allow one to achieve.

shayhatsor commented 8 years ago

@veikkoeeva, I know. That's why I opened this issue. It started just for adding MySql, but after seeing the state of the documentation I've decided to refactor the whole thing. When I consider the documentation readable, I'll add the MySQL documentation. It's progressing pretty nicely: I've cleaned up many redundancies and duplications, added important info and restructured the menu. My goal is to allow a person (manager, programmer, configurator, etc.) who's interested in some aspect of Orleans to find the info quickly. I remember how hard it was for me to get the whole picture.

shayhatsor commented 8 years ago

@veikkoeeva, Obviously, I'll need your help with the relational storage documentation.

veikkoeeva commented 8 years ago

@shayhatsor I think I can get some notes prepared by Friday (maybe Thursday or Wednesday evening). I'll add them here, and you can then decide what, if anything, to include.

shayhatsor commented 8 years ago

@veikkoeeva, thanks, but no need to hurry. If you look at the menu, I'm going top to bottom, so now I'm just before Grain Persistence, which probably needs work after #1060. I'm also waiting for #1359 to be reviewed. On that matter, if you can have a look, it'd be great.

richorama commented 8 years ago

@jthelin BTW, VS Code does a good job of markdown editing (and supports GH markdown) :¬)

And it's free & xplat!

https://code.visualstudio.com/

gabikliot commented 8 years ago

I have a question regarding documentation structure: We now have the side menu. Should that side menu be the full list of all docs, basically serving as the one and only, full table of contents? It looks like @shayhatsor is moving in this direction, as for example he removed the http://dotnet.github.io/orleans/Runtime-Implementation-Details/ page and listed all docs from that page in the menu under "Advanced Documentation". But there are still some inconsistencies: http://dotnet.github.io/orleans/Advanced-Concepts/ has some docs that are not listed in the menu.

I am actually NOT sure we want the side menu to be the full and only table of content list. It will get VERY long, as we add more and more topics/pages. An alternative is to still have intermediate pages with links to sub-pages (like Advanced Documentation and Runtime-Implementation-Details), with the side menu linking only to those, plus of course all the "prio-one" topics at the head of the hierarchy, like everything we have now under Programming Guide and Step by Step. Among the other topics, if one becomes more important, we will move it out of the intermediate page, link to it in the menu directly, and also put it in the folder structure under main rather than under the intermediate page.

Let's agree on the strategy.

shayhatsor commented 8 years ago

@gabikliot, I know what you're saying. It may seem like I'm going in the direction of a side menu containing everything, but I'm definitely not. You commented:

But there are still some inconsistencies: http://dotnet.github.io/orleans/Advanced-Concepts/ has some docs that are not listed in the menu.

My first goal was to move the important things from the intermediate pages to the main menu. I intentionally kept the Advanced-Concepts "submenu", for the same reason you gave in your comment:

I am actually NOT sure we want the side menu to be the full and only table of content list. It will get VERY long, as we add more and more topics/pages.

I think we need a tree, a real tree, rather than using pages as submenus. So you'll have a plus sign on the left of Advanced-Concepts. I thought about raising this issue when I get to the configuration section. I think it'd be best if we do something like this

gabikliot commented 8 years ago

Yes, I think if we can mimic the whole http://dotnet.github.io/, it would be good! With hierarchical sub-menus on the side in the http://dotnet.github.io/docs and with http://dotnet.github.io/getting-started/.

sergeybykov commented 8 years ago

@gabikliot I agree, it looks like a good template to follow. I'm not sure how much work it entails. But since their docs are OSS too, it should be easy to figure that out.

veikkoeeva commented 8 years ago

@shayhatsor Here are some notes I jotted down on the principles. Thus far they have been pretty much in my head only or buried in various issues. Maybe this can be included in the documentation refactoring somehow, but at this time it's your call, I think.

The primary design principles of the relational backend for Orleans have been:

Allow use of any backend that has an ADO.NET provider (1).
Allow tuning of the database structure and queries as appropriate, even while the silos are running (2).
Allow one to make use of vendor- and version-specific abilities (3).
Make no assumptions about what tools, libraries or deployment processes are used in organizations (4).
Taking into account the previous points, make both porting scripts to new backends and modifying already deployed backend scripts as transparent as possible.
Use the minimum set of interface functionality needed to load the ADO.NET libraries and functionality.
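
To make the idea of tunable, database-resident queries concrete, here is a minimal sketch. It is written in Python with the standard sqlite3 module standing in for an ADO.NET provider, and every name in it (`orleans_query`, `membership`, `load_queries`, the `ActiveSilos` key) is a hypothetical chosen for illustration, not the actual Orleans schema or API. The point is only that the application looks queries up by key, so an operator can change the query text in the database without rebuilding any binaries.

```python
import sqlite3


def load_queries(conn):
    """Read all query texts from the (hypothetical) query table, keyed by name."""
    rows = conn.execute(
        "SELECT query_key, query_text FROM orleans_query"
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orleans_query (
        query_key  TEXT PRIMARY KEY,
        query_text TEXT NOT NULL
    );
    CREATE TABLE membership (deployment_id TEXT, address TEXT, status INTEGER);

    -- Default query shipped with the setup script.
    INSERT INTO orleans_query VALUES (
        'ActiveSilos',
        'SELECT address FROM membership WHERE deployment_id = ? AND status = 1'
    );
    INSERT INTO membership VALUES ('dev', '10.0.0.1', 1), ('dev', '10.0.0.2', 0);
""")

# The application only knows the key, never the SQL text itself.
queries = load_queries(conn)
active = conn.execute(queries["ActiveSilos"], ("dev",)).fetchall()

# An operator tunes the query in place; the application code is untouched.
conn.execute(
    "UPDATE orleans_query SET query_text = ? WHERE query_key = 'ActiveSilos'",
    ("SELECT address FROM membership WHERE deployment_id = ? ORDER BY address",),
)

# The second load stands in for a reload, however that is triggered.
tuned = conn.execute(load_queries(conn)["ActiveSilos"], ("dev",)).fetchall()
```

The same parameter list is passed before and after the change, which is the contract the application relies on: the key and the parameters stay fixed while the query text behind them can vary per vendor, version or deployment.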

In a larger context, the design principles were influenced by the idea that organizations have existing software assets and data in relational storage, and that Orleans likely shares hardware and deployments with other applications. Even with specific compliance requirements and specific performance needs, it should be possible to integrate Orleans into the mix without doing a custom build of the binaries or excessive evaluation (i.e. leafing through code) of framework-specific parts of relational storage. It should be noted that these principles are framework-specific: they pertain to Orleans and are not visible in code. Application-specific functionality should use whatever methods, libraries and processes are appropriate (they can also use the same interfaces).

As for known issues, the relational backend does not yet provide a backend for streams, but there is an issue recorded about it: "Add Streams on top of relational database". Another issue is recorded to refactor the current relational storage provider for easier setup and more general-purpose use. A notable idea in both of these cases is that one should be able to rely on Orleans to just work by default but, should the need arise, be able to tune specific queries to a great extent. Besides the state provider, one anticipated need could be event sourcing, where the default implementation could INSERT into a table (without an index seek or scan) but allow deployment-specific tuning by diverting some INSERTs and the corresponding SELECT operations to specific tables. The projections and aggregates could come from views kept up to date on every insert or update, or refreshed by periodic, database-specific jobs at set intervals (say, to update projections). Also, some parts of the database data layout should be refactored, such as statistics and maybe how some operational data is organized (suspecting data etc.). Currently there is no method to reload scripts from the database without restarting a silo.

The result is what is described at http://dotnet.github.io/orleans/Runtime-Implementation-Details/Relational-Storage.

(1) Some providers are listed at https://msdn.microsoft.com/en-us/library/dd363565.aspx. With some overlap with the first one, another list can be found at The Connection Strings Reference, which includes Teradata as one ADO.NET provider not listed in the first one. There are yet more providers.

(2) One example of what tuning could mean in one context: https://blogs.msdn.microsoft.com/bharry/2016/02/06/a-bit-more-on-the-feb-3-and-4-incidents/

(3) Such as UPSERT in PostgreSQL 9.5 and newer or modifications such as PipelineDB. Or sequence operations on newer databases or row-level security in SQL Server 2016 and newer.

(4) For instance, Red Gate or Dacpac.

Edit: Some more explicit observations as to why the queries are in the DB and why not EF or NHibernate, for instance. As noted in the bharry post, one likely needs to tune queries differently during the lifetime of the system, depending on how much data there is in the database, and take the deployment environment into account. It looks like it is not possible to write a general enough query using LINQ-to-EF, for instance, and it would be cumbersome to compile a new set of private binaries. There are also other options available, depending on one's vendor, such as partitioning and dividing data across disks, using column storage (statistics?), or using native compilation on newer versions of SQL Server. As Orleans is a framework that may change and cannot decide on an optimal layout for all users, it makes sense to provide defaults and let users override them.

Then, in a large-scale deployment, updating ORM libraries might have dire consequences, as alluded to in the bharry post, because the update might produce different kinds of queries that bring down the whole system. The usual way (at least that I'm aware of) is to insert data into the DB using stored procedures and query it out through views or some other method, so that there is an explicit way to alter queries on the fly and even (unit) test them. The issue is that not all databases have stored procedures, but the queries could be written to use them where available (and the MySQL provider uses them).
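
As a rough illustration of reading through a view so that the query can be altered on the fly, here is a small sketch in Python using the standard sqlite3 module. The table and view names (`events`, `events_archive`, `grain_events`) and the `read_events` helper are hypothetical and not part of Orleans. SQLite has no stored procedures, so only the view half of the pattern is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, grain_id TEXT, payload TEXT);
    CREATE TABLE events_archive (id INTEGER PRIMARY KEY, grain_id TEXT, payload TEXT);

    -- Default layout: the view simply exposes the main table.
    CREATE VIEW grain_events AS SELECT grain_id, payload FROM events;

    INSERT INTO events (grain_id, payload) VALUES ('g1', 'created');
    INSERT INTO events_archive (grain_id, payload) VALUES ('g1', 'imported');
""")


def read_events(conn, grain_id):
    """Application-side read: it depends only on the view name and columns."""
    return conn.execute(
        "SELECT payload FROM grain_events WHERE grain_id = ? ORDER BY payload",
        (grain_id,),
    ).fetchall()


before = read_events(conn, "g1")

# Deployment-specific tuning: the operator redefines the view so that reads
# also cover a second table. No application code changes.
conn.executescript("""
    DROP VIEW grain_events;
    CREATE VIEW grain_events AS
        SELECT grain_id, payload FROM events
        UNION ALL
        SELECT grain_id, payload FROM events_archive;
""")

after = read_events(conn, "g1")
```

Because `read_events` is identical before and after the change, the same function can be exercised in a test against either view definition, which is the kind of explicit, testable seam described above.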

sergeybykov commented 7 years ago

I think this has been resolved.

dotnet / orleans

Refactor documentation and add MySql config examples #1264