dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.75k stars 3.18k forks source link

Allow specifying whether computed columns are stored or virtual #6682

Closed ctonger closed 4 years ago

ctonger commented 8 years ago

The issue

I am using the HasComputedColumnSql extension which basically works for me as expected.

However, I am missing an option to instruct EF to persist the computed values.

Currently the code that is generated for my computed column is:

[UrlString] AS [Schema] + ':/' + CASE WHEN [DomainPrefix] IS NULL THEN '' ELSE '/' + [DomainPrefix] + '.' END + CASE WHEN [DomainPrefix] IS NULL THEN '/' ELSE '' END + [ApplicationDomain] + CASE WHEN [InternalPath] IS NULL THEN '' ELSE '/' + [InternalPath] END 

However, using an option like .Persisted() or a boolean flag like HasComputedColumnSql(string sql, bool persist) I would like to see code generated like this:

[UrlString] AS [Schema] + ':/' + CASE WHEN [DomainPrefix] IS NULL THEN '' ELSE '/' + [DomainPrefix] + '.' END + CASE WHEN [DomainPrefix] IS NULL THEN '/' ELSE '' END + [ApplicationDomain] + CASE WHEN [InternalPath] IS NULL THEN '' ELSE '/' + [InternalPath] END PERSISTED

It would be an extremely useful feature as it would save time in scripting it manually.

Further technical details

EF Core version: 1.0.0 Operating system: Windows 10 Visual Studio version: VS 2015 Update 3

DanielSSilva commented 5 years ago

After searching for this very same subject (persisted computed column), I've stumbled across this issue. After doing some research and some tests, I've came up to the conclusion that this is possible, just not documented (unless I'm being fooled by something). The part that I consider that's missing on the documentation is that the SQL expression that computes the value for the column must contain the PERSISTED keyword.

entity.Property(e => e.Duration_ms)
      .HasComputedColumnSql("DATEDIFF(MILLISECOND, 0, duration) PERSISTED");

This generated the following on the migration:

migrationBuilder.AddColumn<long>(
            name: "duration_ms",
            table: "MyTable",
            nullable: true,
            computedColumnSql: "DATEDIFF(MILLISECOND, 0, duration) PERSISTED");

To check on the database whether it is actually persisted I ran the following:

select is_persisted, name from sys.computed_columns where is_persisted = 1 and the column that I've created is there.

If I had to guess I would say that this expression is replaced on the generated SQL, such as:

ALTER TABLE MyTable
ADD Duration_ms AS (DATEDIFF(MILLISECOND, 0, duration)) PERSISTED
ajcvickers commented 5 years ago

@DanielSSilva Thanks for this info. Clearing milestone to re-visit in triage.

ajcvickers commented 5 years ago

Triage: being able to add it to the end of the string like this means that we don't really need to do anything here to support this. Leaving this open to document and add a regression test for.

DanielSSilva commented 5 years ago

I don't know the effort/overload required, but I think it would be more user friendly to be able to pass a boolean as parameter, stating if it's persisted

roji commented 5 years ago

@DanielSSilva since this feature is SQL Server-specific, it may make more sense to leave it in the column SQL, which is database-specific anyway...

DanielSSilva commented 5 years ago

Which feature is specific? The whole computed column, or the persisted part? If you are talking about the persisted part, I can agree (although it has to be documented). If the whole .HasComputedColumnSql() thing is SQL Server-specific, in my opinion it could have a bool, since it's already specific, it cannot be re-used.

CZEMacLeod commented 5 years ago

@DanielSSilva @roji @ajcvickers I recon that the persisted setting should be available to all providers, as a hint. If the provider does not support persisting, then ignore the flag. It would mean that a simple computed column would work on different providers.

I was just tripping over a similar issue in that I was trying to set an SQLite computed column and a SQL Server computer column in the model. It looks like under the bonnet these both map back to the same annotation so either overwrite or ignore the second setting.

I think it would be very useful to have the ability to include the provider name with HasComputedColumnSql, and indeed HasDefaultValueSql. It could 'fallback' to the value set without a provider if a specific value is not provided. This could also mean that the SQL Server specific overload for persisted could be added.

entity.Property(e => e.Duration_ms)
      .HasComputedColumnSql("CAST(duration AS int)/1000")
      .HasComputedColumnSql("Microsoft.EntityFrameworkCore.SqlServer","DATEDIFF(MILLISECOND, 0, duration)",true);
ajcvickers commented 5 years ago

@CZEMacLeod Different configuration for different providers is handled by conditionally building different models. For example:

modelBuilder
    .Entity<User>()
    .Property(e => e.Duration_ms)
    .HasComputedColumnSql(
        Database.IsSqlServer()
            ? "DATEDIFF(MILLISECOND, 0, duration) PERSISTED"
            : "CAST(duration AS int)/1000");
CZEMacLeod commented 5 years ago

@ajcvickers Thanks for the heads up on using the Database extension method. I'm still learning EFCore while porting and maintaining EF6 code so finding the new ways of doing things.

The only problem is that I am actually using IEntityTypeConfiguration or an extension method (in lieu of conventions) to do configurations for Types/Interfaces and have the configuration in a separate assembly from the dbcontexts,, as the models are composed and shared between a number of projects (mobile/xamarin and web/net framework).

This means that the Database property is not available, and the type is not exposed on modelBuilder. This also means that the migrations have to be customized and cannot be shared for each database platform.

At the moment, I customize the migration with code similar to

string defaultValueSql = null;
            switch (migrationBuilder.ActiveProvider)
            {
                case "Microsoft.EntityFrameworkCore.Sqlite":
                    defaultValueSql = "CURRENT_TIMESTAMP";
                    break;
                default:
                    defaultValueSql = "GETUTCDATE()";
                    break;
            }

And replace the defaultValueSql and computedColumnSql as required.

columns: table => new 
    {
        DateCreated = table.Column<DateTime>(nullable: false, defaultValueSql: defaultValueSql),

If the provider specific variants were stored in the model/snapshot then you could use something like

builder
    .HasDefaultValueSql("timestamp")
    .HasDefaultValueSql("Microsoft.EntityFrameworkCore.SqlServer","GETUTCDATE()");
    .HasDefaultValueSql("Microsoft.EntityFrameworkCore.Sqlite","CURRENT_TIMESTAMP");

and the migration would become

    DateCreated = table.Column<DateTime>(nullable: false, defaultValueSql:
        (migrationBuilder.ActiveProvider == "Microsoft.EntityFrameworkCore.SqlServer") ?
        "GETUTCDATE()" :
        (migrationBuilder.ActiveProvider == "Microsoft.EntityFrameworkCore.Sqlite") ?
        "CURRENT_TIMESTAMP" :
        "timestamp"),

But obviously there are work-arounds for the current design.

CZEMacLeod commented 5 years ago

@ajcvickers Following up on the last message - it would be useful if the entity types discovered using ApplyConfigurationsFromAssembly could be constructed using DI such that the DatabaseFacade, IOptions<>, ILogging and other services were available during configuration. The existing code limits it stating

Only accept types that contain a parameterless constructor, are not abstract and satisfy a predicate if it was used.

Maybe an overload of the method taking an IServiceProvider to ensure backwards compatibility?

For now I have added a custom interface with a readonly Provider property which I use as the predicate - and have multiple classes, one for each provider. I didn't want to go to using ApplyConfiguration with manually creating each I*TypeConfiguration object passing in Database.ProviderName. Even with that, we still have to generate the migrations multiple times, and keep them in separate assemblies for each provider...

DanielSSilva commented 5 years ago

If I want a column to stop being computed (with or without persited), I remove the .HasComputedColumnSql() . But this will generate a migration with an AlterColumn, which will later throw an exception because a computed column cannot be altered... Shouldn't this generate a "Drop column" & "create column" ?

DanielSSilva commented 5 years ago

@ajcvickers regarding my last comment, should I open a new issue with this question/info?

ajcvickers commented 5 years ago

@DanielSSilva Yes, please. I try to keep up with comments on existing issues, but it's very easy to miss some.

roji commented 5 years ago

FYI PostgreSQL is introducing generated columns in the upcoming version 12. At this point only stored (persisted) generated columns are supported, but on-the-fly computed column support is also expected to land at some point.

Unfortunately the SQL Server workaround above - of tacking on PERSISTED inside the computed SQL - won't work in the PostgreSQL case, since the SQL has to be enclosed in parentheses:

ALTER TABLE x ADD COLUMN y INT GENERATED ALWAYS AS (expression) STORED;

Since the provider adds the parentheses enclosing the expression, anything tacked at the end will end up inside the parentheses.

As a result, from the PostgreSQL side there's definitely motivation for adding a bool flag.

ajcvickers commented 4 years ago

🤦‍♂

Failed executing DbCommand (4ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
CREATE TABLE [Blogs] (
    [Id] int NOT NULL IDENTITY,
    [UpdatedOn] AS getdate() PERSISTED,
    CONSTRAINT [PK_Blogs] PRIMARY KEY ([Id])
);
Microsoft.Data.SqlClient.SqlException (0x80131904): Computed column 'UpdatedOn' in table 'Blogs' cannot be persisted because the column is non-deterministic.
   at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boo
lean& dataReady)
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean isAsync, Int32 timeout, Boolean asyncWrite)
   at Microsoft.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, Stri
ng methodName)
   at Microsoft.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteNonQuery(RelationalCommandParameterObject parameterObject)
   at Microsoft.EntityFrameworkCore.Migrations.MigrationCommand.ExecuteNonQuery(IRelationalConnection connection, IReadOnlyDictionary`2 parameterValues)
   at Microsoft.EntityFrameworkCore.Migrations.Internal.MigrationCommandExecutor.ExecuteNonQuery(IEnumerable`1 migrationCommands, IRelationalConnection connection)
   at Microsoft.EntityFrameworkCore.Migrations.Internal.Migrator.Migrate(String targetMigration)
   at Microsoft.EntityFrameworkCore.Design.Internal.MigrationsOperations.UpdateDatabase(String targetMigration, String contextType)
   at Microsoft.EntityFrameworkCore.Design.OperationExecutor.UpdateDatabaseImpl(String targetMigration, String contextType)
   at Microsoft.EntityFrameworkCore.Design.OperationExecutor.UpdateDatabase.<>c__DisplayClass0_0.<.ctor>b__0()
   at Microsoft.EntityFrameworkCore.Design.OperationExecutor.OperationBase.Execute(Action action)
ClientConnectionId:69552b46-74aa-436b-abd9-8df6c05ac65f
Error Number:4936,State:1,Class:16
Computed column 'UpdatedOn' in table 'Blogs' cannot be persisted because the column is non-deterministic.
roji commented 4 years ago

Same in PG unfortunately:

test.public> CREATE TABLE data (test TIMESTAMP GENERATED ALWAYS AS (CURRENT_TIMESTAMP) STORED)
[2020-01-28 23:32:25] [42P17] ERROR: generation expression is not immutable

(immutable means the function "is guaranteed to return the same results given the same arguments forever", docs).

FWest98 commented 4 years ago

The situation @roji refers to is also the case for MySQL, the expression should be in between brackets (https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html). If there is consensus on this being a useful addition, I might be able to spend some time on it in the near future.

roji commented 4 years ago

@FWest98 thanks for MySQL info. The main thing here is not so much the implementation (which would probably be trivial), as a design proposal for what we should do.

One thing to keep in mind is that the SQL Server "default" is a computed column (unless PERSISTED is manually specified), whereas in PostgreSQL only stored (or persisted) columns are supported (and so that's the default). We may want to keep the main API to mean "whatever type of generated column that makes sense by default on your database" (like today), and add an overload that allows explicitly specifying stored or computed (via a bool flag, or possibly an enum). Specifying a mode that isn't supported by your database would fail validation.

FWest98 commented 4 years ago

For the sake of backwards compatibility, I would personally be in favour of this option indeed. I don't know all SQL dialects, but I think the bool flag in an overload should suffice.

I guess the main work in implementing this is actually updating the providers, but even that should be relatively trivial.

roji commented 4 years ago

Sqlite also supports both generated column types (VIRTUAL and STORED): https://www.sqlite.org/gencol.html

roji commented 4 years ago

See the below table on support across databases.

This feature was only recently introduced in PostgreSQL, and as of PG12 only stored columns are supported (not virtual). As a result, EFCore.PG creates computed columns as stored by default, which is contrary to what everyone else is doing. To fix this, I plan on introducing a breaking change for 5.0 which requires users to explicitly specify they want stored columns (until PG adds support for virtual); this would make this issue more important (this is the means for specifying this). So I've submitted a PR for it.

Database Stored name Virtual name Default Docs
SQL Server PERSISTED <empty> <empty> (virtual) https://docs.microsoft.com/en-us/sql/relational-databases/tables/ specify-computed-columns-in-a-table?view=sql-server-ver15
Sqlite STORED VIRTUAL VIRTUAL https://www.sqlite.org/gencol.html
PostgreSQL STORED VIRTUAL (not yet implemented) None, STORED must currently be specified https://www.postgresql.org/docs/12/ddl-generated-columns.html
MySQL STORED VIRTUAL VIRTUAL https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html