dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License

Relational data store improvements, opinions here #255

Closed veikkoeeva closed 8 years ago

veikkoeeva commented 9 years ago

Continuation of the discussion at #251 about on-premises installations and a note on using relational databases. This is more like gauging issues and ideas after taking a peek at CreateTables.sql and SqlStatisticsPublisher.

I would be pleased if anyone with an opinion or knowledge would dump it here. I'm currently running a test cluster on Azure, but running Orleans on-premises in a basic enterprise setting will likely mean using a relational data store. It's not that bad, it can be rather fast even for queuing -- and even a must if there are auditing policies. :)

jthelin commented 9 years ago

The status of the SQL Server support should really be considered "experimental" at best. I don't think we have any specific customers using this yet [as far as I know], so that's always the best time to make breaking changes ;)

The SQL Server feature was added by one of our colleagues in DevDiv, but it is something the core Orleans team is not particularly knowledgeable about, and the work done at the time was only really intended to reach 'minimum viable product' status.

For background: The primary reason for adding the SQL Server support was that in some markets, information systems data [even just systems control data] had to stay in-country. This meant that some large customers in Canada (for example) could not try Orleans because they were not permitted to use Azure Storage even just for stats & membership control info.

Also, one area that was firmly left as a "TODO" item from the initial work was making it easier to plug in different stats / metrics / membership providers, although in reality there will probably be a small number of these, so it was/is not a high priority -- a 'C' on the MoSCoW scale.

Given all the above comments, I think this is a great area for community members like yourself interested in using this feature to review and propose suggestions -- and there will probably be many! :)

jthelin commented 9 years ago

When looking at this area, there is an important conceptual distinction between Metrics+Membership and Stats that should be borne in mind.

  1. The Membership & Metrics tables are "point in time" snapshots and are used for system control purposes. The main functional usage of the Membership table is cluster join status of silo nodes. The main functional usage of the Metrics table is the "load shedding" functionality in the gateway nodes, but that table also provides a small subset of system health / status information that has proved very useful for systems monitoring and diagnostic purposes. See Richard's OrleansMonitor tool for an example of this https://github.com/OrleansContrib/OrleansMonitor
  2. The Stats "table" is an [open ended] time-series data set containing periodic counter dumps from a cluster. It is primarily intended for ongoing monitoring of system health "trends" and/or offline post mortem analysis.

The subtle engineering principle in play here is that Membership + Metrics data writing need to largely use out-of-band write mechanisms to ensure that the data always gets out [for diagnostic purposes] even if the rest of the system is currently in a "performance overload" state.

For Stats, there are fewer constraints on how to implement that, and the main requirement was/is to not create feedback loops where system overload causes more stats to be written, which then creates even more overload. Maybe that should be called something like the "Heisen-Perf" phenomenon because it is really the corollary of a HeisenBug :)

A Microsoft Research Intern project created a tool for analyzing the Orleans Stats data for system performance analysis and troubleshooting purposes. http://research.microsoft.com/apps/pubs/default.aspx?id=217109

veikkoeeva commented 9 years ago

@jthelin Let me follow up with more ideas later in the week. I don't have concrete use cases (customers), but I can imagine similar constraints. Of course I encourage anyone interested or having ideas to make noise.

What is probably required here, at a high level:

Hmm, did I already mention all that I was supposed to think about next week? Oh well... Have an opinion, people.

veikkoeeva commented 9 years ago

In the following, some reasoning on changes and further, more specific questions. This may sound lazy, but it would be great if someone could write a short description of each of the tables, their usage patterns and some table columns (comments in the SQL file would probably be perfect). For instance, one can probably figure out the exact purpose of, say, Address and HostName in OrleansMembershipTable, and I could reverse engineer the usage patterns by running this for a while, but I'd rather spend the time otherwise and make just a light initial run and then more inspection after refactoring. I think the purpose of the tables and columns would need to be documented somewhere in any event.

<edit 2015-03-29T10:04:48 I see there are a few notes in the Wiki under Runtime monitoring.

<edit 2015-03-10T11:24:48 In the Cluster Management section there's information on what the columns [SuspectingSilos] and [SuspectingTimes] (maybe these should be refactored into tables of their own?) in CreateTables.sql and others correspond to.

  1. I think that plain SQL views should be provided for reading Orleans system information. There are two aspects here: taking advantage of vendor specific database technologies and shielding database customers from them. Database customers are Orleans and then other systems consuming Orleans data, whether ad-hoc queries or other systems. I would consider other consumers an important part here; for instance, if there is a stable interface, further new queries could be defined and used, perhaps eventually contributed to the Orleans project.

    Views could be seen as interfaces or facades in C# terms, or header files in C++ terms. This would allow refactoring the tables underneath the views and, more importantly, taking advantage of special database features transparently. Database engines have strong and weak points, and this allows the implementation to capitalize on the strong points, most bang for the licensing bucks, so to speak. This is especially true with regard to physical storage handling.

    With regard to code and data consumption, it is important to readers to fix the name of the view, the column order, the column names and the column types. Orleans would codify the contract in the view definitions. This should allow further views written against these views, say analytics ones (I may write one or a few in the process). For code it would be important that the types map cleanly from relational databases to .NET types. Database vendors provide improvements to vanilla views too, such as indexed views in SQL Server or materialized views in Oracle. Should there be a desire to leverage these constructs, it is possible.

    The first step would be to provide the interfaces and convert SqlStatisticsPublisher and SqlMemberShipTable to use them. In practice the SELECT clauses would target view names, and joins shouldn't be necessary anymore when calling from the code (see the sketch after this list). The engine specific table creation clauses could be defined like _OrleansSqlServerv11.sql (or there could be vendor specific scripts and version detection logic in them).

  2. Changing some existing column types.
    • DATETIME is mostly a legacy type and the preference should be DATETIME2, which supports up to 100 ns accuracy (by default) and maps to the .NET DateTime type. This data type is also widely supported across database vendors. See a chart here.
    • NTEXT is deprecated and will be removed and hence should be replaced with NVARCHAR(MAX). What data do the columns SuspectingSilos and SuspectingTimes hold? It may be smart to refactor this data to other tables.
    • Etags are opaque, so it would make sense to use DATETIME2 here too. Another plausible option is ROWVERSION, which is SQL Server specific and would necessitate a conversion in the view.
    • As a note, maybe some of the DATETIME2 columns could be filled by the database engine upon insert, such as StartTime in OrleansMembershipTable, assuming its purpose is to be an immutable record of when a silo has started. Notable here is that thinking of it like this implies data is only created, never updated, and thus no history destroyed, and this could necessitate table refactoring (which may be necessary in any event). I'd think saving history is important for analysis, and perhaps one principal decision to make is whether data is ever destroyed (e.g. should there ever be UPDATE or DELETE)?
    • Primary keys. Taking SQL Server as an example, currently clustered indexes are created by default on the combined keys, which are very fat. I don't see a purpose here and they do look rather counterproductive. Clustered indexes should be static, narrow, unique and strictly monotonically increasing. For further reading, see Introduction to Clustered Indexes and Heaps. This would imply numeric surrogate keys, which in the SQL Server case could be either IDENTITY ones or, taking SQL Server 2012 or newer, of SEQUENCE type. There's an interesting side-issue in how the two behave with regard to gaps in the generated values, which may, or may not, matter for certain kinds of queries (see here for further notes).
  3. Normalisation. It may be good to go down to 5NF or 6NF. Would Orleans like to support other types of statistics besides CPU and memory load? It would feel smart, and perhaps later even provide support for it. I tried to do some searching on what I have in mind and sure enough I found this gem of a post by PerformanceDBA. I couldn't put it in words even half as well, so I link it here. The other extreme I see is to denormalize everything and add columns going forward. I'm not sure which one would work better. This would necessitate some more refactoring and could be done in separate passes as long as the views are there and people do not mind that their production database tables won't have a migration path. If that is a problem, it would feel better to do all the changes in one PR.
    • Normalising that strongly would have implications for loading data into the database. Perhaps it'd be separate inserts or perhaps there should be a staging table and loading from there. Probably not a problem.
  4. Perhaps the command parameters could/should be cached. The commands remain the same, only data changes.
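
To make point 1 above a bit more concrete, here is a minimal sketch of what reading through such a view could look like. The view name, column list and the surrounding method are illustrative assumptions, not the current schema or the Orleans API:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Threading.Tasks;

public static class MembershipViewExample
{
    // Hypothetical view: the tables underneath can be refactored or vendor-tuned freely
    // as long as the view keeps its name, column names, column order and column types.
    public const string CreateView = @"
        CREATE VIEW OrleansMembershipView AS
        SELECT DeploymentId, Address, Port, Generation, HostName, Status,
               StartTime,     -- DATETIME2, maps cleanly to System.DateTime
               IAmAliveTime   -- DATETIME2
        FROM OrleansMembershipTable;";

    public static async Task<List<Tuple<string, DateTime>>> ReadSilosAsync(string connectionString, string deploymentId)
    {
        var silos = new List<Tuple<string, DateTime>>();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT HostName, StartTime FROM OrleansMembershipView WHERE DeploymentId = @deploymentId;",
            connection))
        {
            command.Parameters.AddWithValue("@deploymentId", deploymentId);
            await connection.OpenAsync();
            using (var reader = await command.ExecuteReaderAsync())
            {
                while (await reader.ReadAsync())
                {
                    // No joins in the calling code; the view fixes names, order and types.
                    silos.Add(Tuple.Create(reader.GetString(0), reader.GetDateTime(1)));
                }
            }
        }
        return silos;
    }
}

Other consumers (ad-hoc queries, analytics views) would target the same view, so the contract only changes deliberately and can be versioned.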

Has anyone insight, opinions or further improvements? I didn't go into table design as it feels other things should be sorted first. I apologise that the questions and clues are interspersed in the text, but perhaps they stand out enough.

<edit 2015-03-29T09:04:48 Some additional pointers to query support which might be of interest:

sergeybykov commented 9 years ago

@veikkoeeva Improvements to the SQL story are more than welcome. Here's something to keep in mind. There are two distinct cases here: 1) critical system tables for cluster membership and reminders; and 2) non-critical tables for statistics and metrics.

Cluster membership in particular has more subtle transactional update requirements, and in the Azure Table case we use a version row for that. As @jthelin mentioned above, we are not aware of any customer using SQL for the system store in production. So we need to be extra careful with testing any changes to it, as we may or may not have sufficient test coverage for it. What is your plan with regard to testing the SQL option? Do you have a customer that you plan to use SQL for? Maybe @davidpodhola can chime in here about the customers using SQL that he mentioned in https://github.com/dotnet/orleans/issues/275. Some of them use SQL 2000.

We will write a page detailing the tables shortly.

I also have questions about some of your questions above. :-)

Etags are opaque, so it would make sense to use DATETIME2 here too. Other plausible option is ROWVERSION , which is SQL Server specific and would necessitate a conversion on view.

Are you suggesting to write current time as DATETIME2 and use it as an etag?

As a note, maybe some of the DATETIME2 columns could be filled by the database engine upon insert, such as StartTime in OrleansMembershipTable.

This is an interesting question considering that clocks on silo nodes can drift significantly. Maybe writing both the caller's time and SQL's time would be even better, so that substantial deltas in clocks could be detected right away?

Primary keys. Taking SQL Server as an example, currently clustered indexes are created by default to the combined keys which are very fat.

This is the artefact of using Azure Table as the primary implementation. So long as we stay compatible with Azure Table at the interface level, I think it's okay to make indexes in SQL more robust.

Would Orleans like to support other types of statistics besides CPU and memory load?

Another interesting question to discuss. On the one hand, these metrics were meant to be used for things like load-aware placement algorithms. On the other hand, they duplicate system performance counters and statistics being reported in parallel. I personally see no problem with duplicating a small subset of the perf counters in the table if that makes implementation or operations easier.

veikkoeeva commented 9 years ago

@sergeybykov

What is your plan with regards to testing the SQL option?

I'm not aware of any good cross-database stand-alone tooling. One option is SSDT, and it would just work on Windows machines with Visual Studio 2013 installed, including database unit tests or integration tests. You could see it as the Azure Storage Emulator, except using LocalDb (SQL Server Express) that's started automatically (the Storage Emulator could be started automatically too; maybe I or someone else could supply a PR for that).

If the system is used through views, and especially if the main consumer is taken to be Orleans, testing with this should give a fairly good indication that everything works logically on other databases too. However, it wouldn't directly test the performance profile, other than making it possible to see the number of logical operations, which is a fairly good indication of performance once one has established a baseline (e.g. fewer logical operations means better performance). I don't see anything here complicated enough that it wouldn't work in a similar fashion on, say, MySQL as on SQL Server, if the tables have been defined reasonably well. Naturally SQL Server and Oracle have a better query optimizer etc. than, say, MySQL, but that's to be expected (the difference may not matter if the queries are simple, as they would be in the beginning). Performance tuning with trends and (regression) checks and so forth would require a heavier setup with the real databases installed, and I think it'd be up to you guys to arrange that if deemed necessary.

Setups other than SQL Server, e.g. MySQL, can be tested too, and when using views, the tables can even be drastically altered and things would still work. In the case of "drastic alteration", one part of the story is to define the input interface, which could be a wide staging table or a stored procedure.

I feel that if someone is currently reading the tables directly, the path forwards would be to make views having the same names as the tables, change the table names, and things would work as they did. These views could, in theory at least, do data conversions too if those are deemed necessary. Unless there is an indication someone does this, that seems rather fruitless work. Orleans should continue working as it did, but storage usage would be improved and new opportunities enabled.

Are you suggesting to write current time as DATETIME2 and use it as an etag?

Yes.

This is an interesting question considering that clocks on silos nodes can drift significantly. Maybe writing both caller's time and SQL's time would be even better, so that substantial deltas in clocks could be detected right away?

This is a splendid idea if delta detection provides added benefits – and it would work assuming the servers and databases have clocks synced from the same source (or sources that aren't drifting). There is a common pattern in relational design of adding a column such as CreatedOn or similar that's stamped by the database engine upon row creation. It doesn't cost much storage and in many cases it's useful to know when a given row was created. Sometimes there is, in addition or just by itself, a column ModifiedOn or similar. This ModifiedOn is used basically as an ETag to do optimistic concurrency (SQL Server has ROWVERSION too for this purpose).
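
A minimal sketch of what that could look like, assuming hypothetical column names (CallerTime, ModifiedOn) and SQL Server syntax; ModifiedOn doubles as the opaque etag and the database stamps its own clock, so drift against the silo's clock stays visible:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Threading.Tasks;

public static class ETagUpdateExample
{
    public static async Task<DateTime?> TryUpdateStatusAsync(
        string connectionString, string deploymentId, string address, int newStatus, DateTime currentETag)
    {
        const string update = @"
            UPDATE OrleansMembershipTable
            SET Status = @status,
                CallerTime = @callerTime,        -- the silo's clock
                ModifiedOn = SYSUTCDATETIME()    -- the database's clock, doubles as the new etag
            OUTPUT INSERTED.ModifiedOn
            WHERE DeploymentId = @deploymentId AND Address = @address
              AND ModifiedOn = @etag;            -- optimistic concurrency check";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(update, connection))
        {
            command.Parameters.AddWithValue("@status", newStatus);
            command.Parameters.AddWithValue("@deploymentId", deploymentId);
            command.Parameters.AddWithValue("@address", address);
            // DATETIME2 parameters get an explicit type so precision isn't lost in the comparison.
            command.Parameters.Add("@callerTime", SqlDbType.DateTime2).Value = DateTime.UtcNow;
            command.Parameters.Add("@etag", SqlDbType.DateTime2).Value = currentETag;
            await connection.OpenAsync();

            // Null means no row matched the etag, i.e. someone else updated it first.
            var newETag = await command.ExecuteScalarAsync();
            return newETag == null || newETag == DBNull.Value ? (DateTime?)null : (DateTime)newETag;
        }
    }
}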

This is the artefact of using Azure Table as the primary implementation. So long as we stay compatible with Azure Table at the interface level, I think it's okay to make indexes in SQL more robust.

I would probably go with creating surrogate keys. There's accepted wisdom and subtleties here, and I'm a mere programmer, so I'm hand-waving a bit currently without a concrete implementation. The upside of views is that one can change and tune and the consumer would benefit. This would affect joins only if there are other tables. Silo names and so forth should continue to work (through views) as they did. Perhaps an obvious note: it may be that using the silo name as a primary key works better.

I'd prefer not to destroy information, just add to it, but this is more of an architectural call to make. It looks to me the trend is to store everything until explicitly removed by a scrubbing operation, and if necessary, Orleans could mark the rows that can be safely removed by such an operation. As an example case: adding to the generation counter. Maybe it would be nicer to create a new row altogether and then count the rows with the same silo name to provide the generation count at the interface level. This way one could retain information about when a silo was created and what other information is linked to this particular silo generation. The Azure Table version may not have all of this, but the relational version could provide compatibility nevertheless.

Likewise for gossiping: a candidate for refactoring into new tables and then assembling the currently logged string either in a view or in Orleans. I'd put this one in Orleans and let the view serve plain data rows, as other consumers of this view may want to handle this particular piece of information relationally.

As a tangent, there are immutable databases, such as Datomic.

Another interesting question to discuss. On the one hand, these metrics were meant to be used for things like load-aware placement algorithms. On the other hand, they duplicate system performance counters and statistics being reported in parallel. I personally see no problem with duplicating a small subset of the perf counters in the table if that makes implementation or operations easier.

One benefit I see is in testing. I have this lofty vision of having the ability to log a lot of data, amongst other things performance counters (custom markers?) and then, say, how a function call propagates through the pile of grains. The idea is to leverage the database engine to do performance and correctness analysis (e.g. that the right grains were called in the right order in the grain mesh). If there were views, and if I'm not mistaken, the same views could be pointed at other storage too, like Table Storage, and the data analyzed from there (I refer to PolyBase or similar open source variants).

In any event, the first phase would be to define some new views for Orleans and alter the queries a bit to use them. Maybe in the same go alter some data types, e.g. make ETags use DATETIME2.

I'm currently very, very occupied with other affairs, so I wouldn't be holding my breath on this one. Then again, I have a day or two to do this in the next couple of weeks, and if things go well, I would think I'll get some tables designed, views defined and code refactored. If someone gets there first, I don't mind. :)

<edit: Relational databases have a lot of reporting tooling available, including lifting data into tools such as Excel or other auditing tools and procedures. With reference to #290.

veikkoeeva commented 9 years ago

A link to a CodePlex discussion touching upon this: Orleans SQL stats table. Indeed, I haven't looked that much into data migration for if/when I refactor this design. One option is feature switches, I suppose.

sergeybykov commented 9 years ago

I'm not aware of any good cross-database stand-alone tooling.

I'm not concerned about breadth of testing at this point but rather about depth. The ultimate depth of testing is running a serious load in production. That's why I was asking if you have a customer for the SQL option. Looks like FrankieOrleans from Orleans SQL stats table has such plans.

I like ALL of your ideas here. Just wanted to understand how we'd test the functionality.

veikkoeeva commented 9 years ago

@sergeybykov Indeed, I don't have the possibility to do heavier, "production grade" testing currently. It looks like FrankieOrleans would like to deploy into production in two months. I'd be willing to push a bit harder on my part to get this solved by then.

I don't see that there will be that many changes, but I feel it'd be important to lay down the basic structure, so that we don't complicate Orleans code with vendor specific issues, and then decide whether data should be treated as immutable (e.g. only inserts) and deleted only by special scrubbing operations. Views do that for reading, but inserting presents more choices, more about that later.

It appears the relational deployments are getting more plentiful by the day, so some concrete progress ought to be made soon. It would be cool if FrankieOrleans or someone else had spare time to put effort in here so that the changes wouldn't become too painful for existing deployments or hamper future prospects.

Concrete plans:

  1. Concrete views for Orleans to use. That means altering the existing queries. This could mean there will be one view per query where there is a join currently. To save on I/O, maybe some queries wouldn't need to read all of the columns in the views, but only some (these could be separate views too). A straightforward hands-on tactic feels like taking the queries from the code and creating corresponding views, i.e. moving the joins into views and calling SELECT <columns> FROM <view> <where> from the code.
  2. Creating views alone would work. Simultaneously, or as a next wave, an enhanced table structure would be done. This would necessitate changes to the insertion logic too. The more architectural decision here is whether data would be treated as immutable and removed only by special scrubbing operations. I'm in favor of this. Concrete points (collecting ad-hoc ideas):
    • Silo generations aren't a counter but a whole new row. Then when Orleans wants to know the generation of a silo, the rows would be counted (by a view). The silo tables would be normalized as needed and data would be joined back to silos and corresponding generations by a surrogate primary key.
    • Re-structure gossiping to relational format (i.e. not free-text).
    • Perhaps design tables so that counters other than memory and CPU can be added. I don't see this as that important now if there'll be a view anyway. Naturally one could ponder how, if needed, Orleans would dynamically discover what counter data has been collected, but I don't see it as a problem.
    • Choose other data types, e.g. DATETIME2 for ETags. Also paying attention to the other notes in this thread.
  3. Insert values in batches, e.g. multi-row INSERT INTO ... VALUES (see the sketch right after this list). There's also bulk copy and bulk insert, but bulk copy would be slower for the amounts there are to insert, I gather, and BULK INSERT is meant for other purposes.
    • Cache SqlCommand instances (I assume this can be done).
  4. Testing story. SSDT looks good to me here at least as an initial testing enabler.
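
Referring to point 3 and the command caching note, a minimal sketch of a parameterized multi-row INSERT ... VALUES batch; the statistics table and column names are made up for illustration:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Text;
using System.Threading.Tasks;

public sealed class StatisticsBatchWriter
{
    private readonly string connectionString;

    public StatisticsBatchWriter(string connectionString)
    {
        this.connectionString = connectionString;
    }

    // counters: (silo name, counter name, counter value).
    public async Task WriteAsync(IReadOnlyList<Tuple<string, string, double>> counters)
    {
        if (counters.Count == 0) { return; }

        var sql = new StringBuilder("INSERT INTO OrleansStatisticsTable (SiloName, CounterName, CounterValue, SampledOn) VALUES ");
        using (var connection = new SqlConnection(connectionString))
        using (var command = connection.CreateCommand())
        {
            for (int i = 0; i < counters.Count; ++i)
            {
                // One parameter set per row; SQL Server caps commands at 2100 parameters, so very
                // large batches would need splitting. For a fixed batch size the command text and
                // the parameter objects could be cached and only the values refreshed between calls.
                sql.Append(i == 0 ? "" : ", ").AppendFormat("(@silo{0}, @name{0}, @value{0}, SYSUTCDATETIME())", i);
                command.Parameters.AddWithValue("@silo" + i, counters[i].Item1);
                command.Parameters.AddWithValue("@name" + i, counters[i].Item2);
                command.Parameters.AddWithValue("@value" + i, counters[i].Item3);
            }
            command.CommandText = sql.Append(";").ToString();
            await connection.OpenAsync();
            await command.ExecuteNonQueryAsync();
        }
    }
}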

About inserting:

In the CodePlex discussion @gabikliot noted the insertion rates and amounts are very modest, so I would assume just inserting straight into even multiple tables in one transaction would probably work well. This is what I'd like to go with first and modify later if needed, as this is something that affects only Orleans internally. I'll provide the following as a point of discussion.

Normally I'd like to shield insertion logic, especially more complex logic, behind stored procedures, but this isn't a complex application when it comes to the data model and storing it (as architecture background, see here). The problem I see here is that if there are multiple tables, the insertion logic gets a bit dispersed too, and it'd be nice to wrap it close to the DB to ensure maximum performance. Though, as I see it, it could be handled by Orleans too.

With stored procedures one could use table type variables in the case of SQL Server and Oracle and have staging tables, so to speak, automatically managed. However, this presents challenges with some other databases, for example the very popular MySQL. One option would be to have permanent staging table(s), insert the data there and then have a fire-and-forget call to a procedure from Orleans to transfer the data to the actual tables. This would mean there's a slight delay before the updated data can be read from the views, but it would amortize the insertion costs and reduce locking times in the main tables. Yet other options are to use staging tables but not transfer in a fire-and-forget fashion, to use triggers etc. As noted, with the insertion rates and amount of data, I don't see these as necessary. A solid model for reading is most important.
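
A rough sketch of that staging-table option, where the staging table OrleansMetricsStaging and the procedure TransferStagedMetrics are hypothetical names and the transfer call is deliberately not awaited:

using System.Data.SqlClient;
using System.Threading.Tasks;

public static class StagingTransferExample
{
    public static async Task PublishAsync(string connectionString, string siloName, double cpuUsage)
    {
        // The hot path only writes to a narrow staging table.
        using (var connection = new SqlConnection(connectionString))
        using (var insert = new SqlCommand(
            "INSERT INTO OrleansMetricsStaging (SiloName, CpuUsage, ReceivedOn) VALUES (@silo, @cpu, SYSUTCDATETIME());",
            connection))
        {
            insert.Parameters.AddWithValue("@silo", siloName);
            insert.Parameters.AddWithValue("@cpu", cpuUsage);
            await connection.OpenAsync();
            await insert.ExecuteNonQueryAsync();
        }

        // Fire-and-forget: the database moves staged rows to the actual (normalized) tables in the
        // background, which amortizes insertion cost and keeps locks on the main tables short.
        // Reads through the views may lag slightly behind as a consequence.
        var ignored = Task.Run(async () =>
        {
            using (var connection = new SqlConnection(connectionString))
            using (var transfer = new SqlCommand("EXEC TransferStagedMetrics;", connection))
            {
                await connection.OpenAsync();
                await transfer.ExecuteNonQueryAsync();
            }
        });
    }
}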

Phew! A long post. As it stands currently, I'm not sure when I'll get to actual, concrete matters. I would hope soon, but anything can happen. Like kids getting sick. :)

sergeybykov commented 9 years ago

@veikkoeeva I have some questions about your plan. I think I'll hold them for now because they are mostly about clarifying the details, and some of them will be answered by your PR. If not, they'll be more concrete pointing to your code.

veikkoeeva commented 9 years ago

@sergeybykov Sure. I'm a bit late with this, but I'll try to have a PR today or tomorrow at the latest to get the ball rolling. It may not have much, but something to start with, a spike. Let's see, if you are in Seattle, it's around one o'clock at night there, so there's a bit of a lag.

I took a better look at the testing and I think there's no need for SSDT. Without writing code, it looks like it would be possible to get the same tests running as SiloInstanceTableManagerTests currently has for Table Storage. I would also need to refactor StorageTestConstants to include the database connection strings used. It is probably OK that I have only one currently, something like Data Source=(LocalDb)\ProjectsV12;Initial Catalog=OrleansTest;Integrated Security=True, and assume the newest version of LocalDb is installed, for instance with VS2013 (or should it be 2012?).

The larger context here is, I suppose, to have some kind of collection of connection strings and a factory method to create the concrete IMembershipTable instantiations and then test through those interfaces (a rough sketch below). With regard to testing, some additional infrastructure work would be needed, such as setting up the storage (e.g. the database), but it could be done in a likewise fashion. One thing I see here is that for one reason or another, someone setting up Orleans might like to run everything in the same DB that the actual application logic uses. Like RavenDB, for instance. I remember there's a provider for RavenDB already, and I wonder if the tests could be refactored to run through the same infrastructure for all the implemented providers. In that way, adding new storage would be a matter of implementing the interfaces, and the relevant test suite could then be run.
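
A rough sketch of what I mean; IMembershipTable exists, but the dictionary, the factory and the constructors shown here are assumptions for illustration only:

using System.Collections.Generic;

public static class MembershipTableTestFactory
{
    // One entry per storage backend the test suite can be pointed at.
    public static readonly Dictionary<string, string> ConnectionStrings = new Dictionary<string, string>
    {
        { "AzureTable", "UseDevelopmentStorage=true" },
        { "SqlServer", @"Data Source=(LocalDb)\ProjectsV12;Initial Catalog=OrleansTest;Integrated Security=True" }
    };

    // The same tests run through the IMembershipTable interface regardless of the backend;
    // the concrete types and constructors below are hypothetical.
    public static IMembershipTable Create(string backend)
    {
        switch (backend)
        {
            case "AzureTable": return new AzureBasedMembershipTable(ConnectionStrings[backend]);
            case "SqlServer": return new SqlMembershipTable(ConnectionStrings[backend]);
            default: throw new KeyNotFoundException("Unknown storage backend: " + backend);
        }
    }
}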

I might add a table that has the Orleans schema version to facilitate further migration efforts. In fact, perhaps it would be OK to add information on the other tables to this extra table too, so it would always be there where one would look at it anyway.

As a tangent, it would feel better, less like "Azure as the main target and everything else as an afterthought", if we refactored things like StorageTestConstants.DataConnectionString (which may change with these storage refactorings), and perhaps it would show also in TestCategory("Azure"). Even though it's part of the project's history, some might get entangled in it.

veikkoeeva commented 9 years ago

@sergeybykov, @jthelin, @gabikliot and others, what would you suggest as the testing story with the relational provider – and all others that might follow?

I dove a bit deeper and noticed that SiloInstanceTableManagerTests could in practice be used to test all built-in storage providers, but currently on line 87 OrleansSiloInstanceManager is instantiated, and it provides only Table Storage related functionality. This would imply refactoring these class names or generalizing their functionality, but at the same time I wonder if I have missed an interface or a factory function that does this already, and in that case, should the tests be improved to pick storage backends more easily?

I hesitate a bit to write this, as I might have just missed something obvious, but it looks like Orleans could use a few factory functions and some reorganizing of the code so that it would be clear where to add new backend implementations (interfaces exist already), and then the test suite would pick them up automatically. As an example, it would be easier to add a queue adapter on top of a relational database and test it by just implementing the interfaces and sprinkling some SQL into a relevant file.

If there isn't a better story currently, I could add some relational tests (SQL Server), but the whole Table Storage suite wouldn't run on them until refactored a bit, or until I get a gentle nudge in the right direction to figure out something I've missed. :) Mind you, if indeed there's a need for a couple of factories, it perhaps wouldn't be that big of a chore for someone who knows the code base better than I do and has a few more hours than I do to put into it (or it takes somewhat longer if I do it).

sergeybykov commented 9 years ago

@veikkoeeva We have 20 SQL tests that we haven't moved to GitHub yet. I'll try to move them ASAP. They may still need to be refactored the way you are suggesting. @jthelin knows the SQL stuff much better, but he's out this week.

You are 100% right about the need for a connection string factory. We need it even for the Azure Table case. We thought the default factory would look for a file with a storage key, so that it doesn't need to be hardcoded and can be dropped in as part of test deployment. We just haven't gotten to implementing it.

Schema versioning is total goodness. We were going to add a version column to all tables. Again, simply haven't gotten to it yet.

veikkoeeva commented 9 years ago

@sergeybykov OK. I'll wait a moment for you to get the SQL tests ported and see where to go from there. Maybe a short term option would be to get the new views and refactored tables in (even if only a few of the data types changed) and the tests running, and then do the rest. Do you have an idea already of how the tests would be placed? With the current code, maybe TesterInternal/StorageTests could have one more subfolder level, so that there would be TesterInternal/StorageTests, TesterInternal/StorageTests/Relational, TesterInternal/StorageTests/DocumentDb etc.

Having this quick glimpse makes me wonder if OrleansSiloInstanceManager would be a suitable starting point for a new interface that the core Orleans would also use, and whether it would be useful to be able to split statistics data into a different storage backend from the operational data? But I'm rather a rookie with this codebase. :) Should the core Orleans storage structures be publicly extensible and testable? In that case, would it be a suitable longer term goal? So that one wouldn't need to add new backends to Orleans core, but could implement interfaces and perhaps run a test suite? (And Orleans core would implement storage also through these very same mechanisms.)

Then about reorganization, for instance, I'm not sure how feasible it would be to move AzureUtils out of Orleans, and whether a more general scheme would then involve more subfolders in OrleansProviders, say, one per provider. Or put another way, should other things, such as relational databases, be moved to Orleans as well? Some pondering...

I was thinking of a table or two that would give Orleans the database name and version loaded upon installation, and then the general schema version (e.g. the current one would be 1.00, the next one 2.00). Then a table or two to document the rest of the tables. If this were SQL Server only, I'd use MS_Description, but I think this method would work well too.
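
Something like the following is what I have in mind, kept as SQL in a C# constant for the sake of the example; the table and column names are made up:

public static class SchemaInfoExample
{
    public const string CreateSchemaInfoTables = @"
        -- Which script was loaded and which general schema version it represents.
        CREATE TABLE OrleansDatabaseInfo
        (
            DatabaseName    NVARCHAR(128) NOT NULL,   -- e.g. 'SQL Server'
            DatabaseVersion NVARCHAR(32)  NOT NULL,   -- the engine version the script targets
            SchemaVersion   NVARCHAR(16)  NOT NULL,   -- e.g. '1.00', the next one '2.00'
            CreatedOn       DATETIME2(3)  NOT NULL DEFAULT SYSUTCDATETIME()
        );

        -- Documentation of the other tables, so the descriptions travel with the database itself.
        CREATE TABLE OrleansTableDescriptions
        (
            TableName   NVARCHAR(128) NOT NULL PRIMARY KEY,
            Description NVARCHAR(MAX) NOT NULL
        );";
}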

But yes, I'll wait for the SQL Server test project.

sergeybykov commented 9 years ago

@veikkoeeva I'll try first to move the tests as is (they are part of other test classes and are not consolidated). We can move them around after that, or you may find that the original structure makes sense complementary to TesterInternal/StorageTests.

Should the core Orleans storage structures be publicly extensible and testable? In that case, would it be a suitable longer term goal? So that one wouldn't need to add new backends to Orleans core, but could implement interfaces and perhaps run a test suite? (And Orleans core would implement storage also through these very same mechanisms.)

Absolutely. That was the general intent.

Then about reorganization, for instance, I'm not sure how feasible it would be move AzureUtils off from Orleans and would a more general scheme then involve more subfolders in OrleansProviders , say, one per provider. Or put another way, should other things, such as relational databases, moved to Orleans as well.

The eventual goal I think is to encapsulate technology-specific code in separate dynamically loaded assemblies. I don't think we want to move more such code into Orleans.dll. Subfolders in OrleansProviders is a good start. We can decide later if we want to split it into smaller assemblies.

Having this quick glimpse makes me wonder if OrleansSiloInstanceManager would be a suitable starting point for a new interface that also the core Orleans would use and would it be useful to be able to split statistics data to a different storage backend from the operational data?

No, OrleansSiloInstanceManager is not a suitable starting point. It should be renamed, as its current name is very misleading. There are interfaces: IMembershipTable, IGatewayListProvider, IStatisticsPublisher, ISiloMetricsDataPublisher, and IClientMetricsDataPublisher that already have multiple implementations - for Azure Table, SQL, and test. I'd suggest starting with the simpler ones - statistics and metrics.

sergeybykov commented 9 years ago

Porting of the SQL tests appears to be more involved than I hoped because of their dependencies on our in-house infrastructure. Some of them share code with the Azure Table versions of the same tests, and it makes sense to move both sets of tests together. The Azure Table tests require a solution for running them with a set of private storage keys that we simply had checked in to our TFS. Necessary work to do regardless, but it's slowing me down.

I'll try to move at least a few SQL tests initially.

veikkoeeva commented 9 years ago

@sergeybykov Sounds good. If porting the SQL tests is involved, one option is also that I make a PR tomorrow (within 10–22 hours) that has some SQL tests. I just need a connection string that works; in the earlier note I assumed a LocalDb one, as it comes with VS or can be installed separately on the CI machine – along with other DBs if necessary. I can programmatically create the database, load in the tables from the provided script, optionally load in test data and then test the DB just like one would call SqlCommand in an MSTest and check the results. Or then use an interface with a concrete implementation. Here's the catch: it'd be nice to run the same set against relational storage as is run against Table Storage, and it may not be that much work to do that (see earlier comments, though I haven't tried).
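
For what it's worth, a minimal sketch of the "programmatically create the database and load the script" part; the connection strings, the database name handling and the GO splitting are simplified assumptions:

using System;
using System.Data.SqlClient;
using System.IO;
using System.Threading.Tasks;

public static class TestDatabaseSetup
{
    public static async Task CreateAndLoadAsync(string databaseName, string scriptPath)
    {
        // Connect to LocalDb's master database and create a scratch database for the test run.
        const string master = @"Data Source=(LocalDb)\ProjectsV12;Initial Catalog=master;Integrated Security=True";
        using (var connection = new SqlConnection(master))
        using (var create = new SqlCommand("CREATE DATABASE [" + databaseName + "];", connection))
        {
            await connection.OpenAsync();
            await create.ExecuteNonQueryAsync();
        }

        // Run the table creation script against the new database. If the script uses GO separators
        // (a client-tool convention ADO.NET doesn't understand), it has to be split into batches;
        // this is a naive split for the sketch.
        var testDb = @"Data Source=(LocalDb)\ProjectsV12;Initial Catalog=" + databaseName + ";Integrated Security=True";
        var batches = File.ReadAllText(scriptPath)
            .Split(new[] { "\r\nGO", "\nGO" }, StringSplitOptions.RemoveEmptyEntries);
        using (var connection = new SqlConnection(testDb))
        {
            await connection.OpenAsync();
            foreach (var batch in batches)
            {
                using (var command = new SqlCommand(batch, connection))
                {
                    await command.ExecuteNonQueryAsync();
                }
            }
        }
    }
}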

veikkoeeva commented 9 years ago

@sergeybykov I did some work to simplify the SQL testing story and realized I'm already implementing a bit of what Dapper does. Would it be all right to add a dependency on it with regard to the relational story? It's all right if not. I wrote some queries to create the database on the fly and then some code to cache SqlParameters etc. Just to make the flow more fluid.

<edit: I'll take that Dapper comment back. I took a glance through some of its issues and I think there are a few notable ones with regard to CoreCLR, vNext and DATETIME2 support.

I had a disk crash on Saturday, so this will take a little longer. I'll try to submit something more barebones some day next week. I have a question, though, now that I've gotten a few hours of thinking done...

sergeybykov commented 9 years ago

@veikkoeeva Did you see the 3 SQL Membership table tests I ported? They use TestInternal\Data\TestDb.mdf. Do you think this approach will work for your tests as well?

veikkoeeva commented 9 years ago

@sergeybykov

Did you see the 3 SQL Membership table tests I ported? They use TestInternal\Data\TestDb.mdf. Do you think this approach will work for your tests as well?

I did see it. For testing against SQL Server (LocalDb) they work, with some definition of work. This is a fair question. Since I don't have a PR yet (I hope to provide one soon, the usual caveats of doing this in the evenings), I can elaborate on this. Please bear with me here as I take a longer route explaining my approach.

I have perhaps gone too far down my own path and assumed that Orleans would like to be as database vendor agnostic as possible, and that it's not all right to take a dependency on DB vendor DLLs unless they are actually used. Also, there is not much needed here; this is very limited functionality that can be hand-tuned (should be?), and taking a dependency on a heavyweight ORM would be rather upsetting.

There are a few technical problems here (solved by ORM abstractions):

  1. The connection strings are vendor specific (one may desire to use vendor specific functionality)
  2. The connector libraries are vendor specific.
  3. The lowest common denominator interface, ADO.NET, still involves a fair amount of vendor specific code (e.g. classes derived from the framework base classes).
  4. How to load on-demand the vendor specific code.
  5. Some syntax is DB specific or not supported by all, such as multi-row INSERT INTO VALUES, but this multi-row insert, for instance, would in my mind be quite essential.

Usually I just make use of an ORM and call it a day (I'm too young to have done otherwise :)), so there has been a bit of filling in knowledge on some of the pieces here. Dynamically loading the database vendor libraries and using the connection string can be achieved with DbProviderFactory. The current list of vendors is at ADO.NET Data Providers; I suspect it's not exhaustive.

If you follow the factory link to MSDN, of interest to you and the other core Orleans maintainers would be the configuration sections and where they live, e.g. whether they would be picked up from any file or whether it needs to be something specific. A minor point, I think.

Then there's a varied assortment of interfaces and abstract classes that are generic, but the usual code samples use vendor specific implementations, e.g. MySqlCommand (which would necessitate having the DLLs included upfront), and one pain point is parameters, which can have @, ?, : and whatnot as indicators. This took a bit of time to research, but now I have something to write code with (likely tonight) that hopefully is database agnostic and uses the core .NET libraries (it should work with Mono too). The usual method of dealing with this seems to be string replacement, but I hope there's a better way (I'll see when I write the code).
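
As a minimal sketch of that vendor agnostic path via DbProviderFactories (the method and its parameters are illustrative, not code from my branch):

using System.Data.Common;
using System.Threading.Tasks;

public static class VendorAgnosticExample
{
    // Only the provider invariant name (e.g. "System.Data.SqlClient", "MySql.Data.MySqlClient")
    // and the connection string are configuration; no vendor specific classes appear in the code.
    public static async Task<object> ReadScalarAsync(string providerInvariantName, string connectionString, string query)
    {
        DbProviderFactory factory = DbProviderFactories.GetFactory(providerInvariantName);
        using (DbConnection connection = factory.CreateConnection())
        {
            connection.ConnectionString = connectionString;
            await connection.OpenAsync();
            using (DbCommand command = connection.CreateCommand())
            {
                // The query text still carries the vendor-flavoured parameter placeholders
                // (@, :, ? and so on), which is the pain point mentioned above.
                command.CommandText = query;
                return await command.ExecuteScalarAsync();
            }
        }
    }
}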

As you can see, there is a bit of hassle here before getting into the database. Nevertheless, I believe this is important. Much of this defines how SqlMemberShipTable accesses the database, for instance. By extension, this also defines how the relational functionality can be tested.

While .mdf files work here (by some definition of work), they also need to be stored in Git and copied around, even though all database engines have functionality to define a new database, which is about as fast as reserving disk space for a file of a certain size. Doing it on the fly gives, I believe, degrees of freedom. Importantly also, using an .mdf file it's not possible to test other databases, or even test that well across SQL Server compatibility levels (including Azure SQL). Let alone other databases, which may or may not get added to the test matrix, but which someone else might nevertheless want to test privately.

As written, I'm currently in the process of writing some code I hope to share soon, so that we have a concrete talking point. I'd share what I have currently, but I'm about to refactor it to be more database agnostic. Loading assemblies and defining queries in a vendor agnostic way is good, I believe, but there is still a need for vendor specific code. I currently have an interface named IRelationalDataStorage, which would be implemented by each concrete implementation, say SQL Server. If one isn't found, it might be possible to fall back to a generic implementation that's not the most efficient, e.g. does inserts in a loop instead of a batch.

As mentioned, I believe this is important to get the code on the right track, but the real meat is the actual DB, which I hope to get to soon. Though it's all right if someone else comes up with a good plan before that. It can be deduced from this text that whatever the way of inserting is (stored procedures, multiple tables etc.), it can be abstracted neatly per vendor.

If for some reason now, or when I get the actual code here, there's something that you'd prefer doing otherwise, or a speedier alternative to me churning out code sluggishly, don't hesitate to take action. It's all right. :)

P.S. It might be my new hard drive (not an SSD) or its AHCI drivers (I had some problems), but MSTest test discovery may take even 20 minutes after a rebuild. Or it might not find any tests and start the (lengthy) search process only after I restart VS. Visual Studio 2015 seems to be faster, I'm using it – not that it matters. I traced with Xperf but didn't see anything special. I wonder if it has something to do with copying files around... And in any case MSTest might be a lot faster with Deployment Enabled = false in .runsettings.

sergeybykov commented 9 years ago

I am totally supportive of your desire to get it on the right track from the beginning. At the same time, I'm fine with taking shortcuts here if it makes your progress easier. If you make it vendor-agnostic, that's great. If you decide that a set of vendor specific implementations is easier and more expedient, that would also get us into a much better state than we are at today with regards to supporting SQL-based stores.

veikkoeeva commented 9 years ago

@sergeybykov I think I got the vendor agnostic ideas written down. There may be minor issues, such as the parameter indicators (e.g. @, :), but those can be handled later too if they become a problem. My hdd is acting up for some reason, but I put the bits at https://github.com/veikkoeeva/orleans/tree/relationalimprovements/src/TesterInternal/Utilities if you want to take a look. If you want to take it for a spin, a short program would be:

public struct Test
{
    // Property names match the column names in sys.configurations so rows map by convention.
    public int configuration_id { get; set; }

    public string description { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        // Pick the SQL Server provider constants (invariant name and a default connection string)
        // and create a storage instance against them.
        var sqlServer = RelationalConstants.Constants["System.Data.SqlClient"];
        var db = GenericRelationalStorage.CreateInstance(sqlServer.ProviderInvariantName, sqlServer.DefaultConnectionString);

        // A parameterized read mapped onto the Test struct above.
        var rows = db.Read<Test>("SELECT * FROM sys.configurations WHERE configuration_id < @Id", new { Id = 200 }).Result;

        // Create a scratch database if it doesn't exist yet.
        if(!db.ExistsAsync("Testtest").Result)
        {
            db.CreateAsync("Testtest").Wait();
        }
    }
}

I had a test where I loaded the current script into a database, but lost it whilst refactoring the interfaces. There aren't comments currently, it isn't used in the test project, and it's lacking some functionality, transactions for one (which may not even be that important before refactoring the tables a bit), but I'll make a PR when I get there. It shouldn't take long, but I need to do something with my hdd. :) The code should be fast, and it could be made faster too. If you or anyone has comments, I'd be happy to read them.

As an additional note, I wrote IRelationalStorage to model CRUD and thought that Table Storage does that too, with the addition of MERGE/PATCH (upsert in relational terms), and at some point in the future it might make sense to create Orleans.Storage and see if all the data access could use an interface defined in "HTTP terms" (just a thought, it's well past midnight here).
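
Purely to illustrate that late-night thought, a sketch of what such an interface might look like; nothing like it exists and the names are made up:

using System.Threading.Tasks;

// Hypothetical: one set of verbs over both Table Storage and relational backends,
// with Merge covering the MERGE/PATCH (upsert) semantics.
public interface IStorage<TKey, TRow>
{
    Task<TRow> GetAsync(TKey key);            // read
    Task PostAsync(TRow row);                 // create
    Task PutAsync(TKey key, TRow row);        // replace
    Task MergeAsync(TKey key, TRow partial);  // upsert / partial update
    Task DeleteAsync(TKey key);               // delete
}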

veikkoeeva commented 9 years ago

@sergeybykov and others, if this initial spike looks all right, I'll add the following:

  1. Comments to the code
  2. Parameter checks
  3. Logging (probably via constructor parameters, provide a setter too)
  4. Diagnostics and discovery functionality (which providers are found, what SQL is generated and whether the provider intended to be used has matching connector libraries loaded, all good for troubleshooting on the forums)
  5. Move the code to a more appropriate place (the database script used currently should probably be in some other place)

It may not be evident from that test code, but the connection string used would most likely come from a settings file. ADO.NET has discovery functionality for that, but if the connection strings aren't in the standard path (e.g. exe.config), the file needs to be specified explicitly, or the string just picked up otherwise and fed in explicitly (like I do in that code).

With reference to #332, I'd hope this works with mono too.

Ah, in addition I may write utility classes like TypeCache, which would include the reflected types, a timestamp of when they were added, and their source. I didn't spot one for general use, but I would think it'd make the reflection code sprinkled here and there more consistent, including printing diagnostics data.
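
A rough sketch of what I mean by that (the class doesn't exist yet; the names are placeholders):

using System;
using System.Collections.Concurrent;

public sealed class TypeCache
{
    // Resolved type together with when it was added and from where (e.g. config section, assembly scan).
    private readonly ConcurrentDictionary<string, Tuple<Type, DateTime, string>> cache =
        new ConcurrentDictionary<string, Tuple<Type, DateTime, string>>();

    public Type GetOrAdd(string assemblyQualifiedName, string source)
    {
        return cache.GetOrAdd(
            assemblyQualifiedName,
            name => Tuple.Create(Type.GetType(name, throwOnError: true), DateTime.UtcNow, source)).Item1;
    }

    public override string ToString()
    {
        // Diagnostics: which types have been resolved, when and from where.
        return string.Join(Environment.NewLine, cache);
    }
}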

sergeybykov commented 9 years ago

@veikkoeeva It's not easy for me to evaluate the pieces. They look fine to my superficial glance as a non-DB-expert, but @jthelin may have more insight. Once you get to a PR, it should be easier to navigate through the code and try it as part of the codebase.

BTW, should we simply buy you a new hard drive? :-)

veikkoeeva commented 9 years ago

@sergeybykov All right. I just like to confirm this, as what I've built is like a nano-ORM, which allows some helpers but also hand-tuned queries. One option, maybe, could be to use something existing, like Dapper, but perhaps this is the route of least long-term commitment.

I'll coordinate with @jthelin on where to put this stuff. The reason I didn't open a real PR is that the code needs the things I wrote, but it also needs a bit of refactoring of the reading and execution logic so as not to repeat the same code (I already misused it a bit in the extension methods). It'd be too easy to concentrate on the wrong issues. I think it's about 15 minutes of work when I get to it during the weekend, which is when I think I'll have a PR ready, as of this time. So, it'll be there on Monday when @jthelin is back at work.

About the hdd. It turned out I've had multiple problems with hardware, the OS and their interactions. I got things working, but yesterday evening my computer refused to let me into BIOS or boot the OS. Time to reset CMOS and see if it helps (otherwise my mobo is ailing). If not, I'll utilize my wife's laptop. :) Otherwise, I think I'll wait for the new Skylakes, USB 3.1, DDR4 and NVMe SSDs, of which the last three are the reasons I've postponed a new computer for so long, as I need to rebuild almost the whole rig. Thanks for the offer, though, no need this time. :+1:

veikkoeeva commented 9 years ago

@jthelin The code is in. Let me know what you think. I might be a bit sluggish during the week (hence asking for a bit of a helping hand on whether this approach in general is all right), but I'll try to be available.