dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.59k stars 3.14k forks source link

Recognize filters created by fluent API and reverse engineer to fluent API. #10183

Open azabluda opened 6 years ago

azabluda commented 6 years ago

I define a unique index for a entity Folder with a nullable property Folder.FolderId (link to parent folder)

modelBuilder.Entity<Folder>().HasIndex(f => new { f.Name, f.FolderId }).IsUnique();

The debugger shows a Relational:Filter "[FOLDER_ID] IS NOT NULL" annotation defined on that index. The generated DDL looks fine

CREATE UNIQUE INDEX [IX_FOLDERS_NAME_FOLDER_ID] ON [ICNG].[FOLDERS] ([NAME], [FOLDER_ID]) WHERE [FOLDER_ID] IS NOT NULL;

Now if I reverse engineer this database

var reporter = new TestOperationReporter();
var svcColl = new ServiceCollection()
    .AddLogging()
    .AddSingleton<IOperationReporter>(reporter)
    .AddScaffolding(reporter);
new SqlServerDesignTimeServices().ConfigureDesignTimeServices(svcColl);
IServiceProvider prov = svcColl.BuildServiceProvider();

var scaffoldingModelFactory = prov.GetService<IScaffoldingModelFactory>();
var scaffoldedModel = scaffoldingModelFactory.Create(
    ConnectionString,
    Enumerable.Empty<string>(),
    Enumerable.Empty<string>(),
    true);

then the Relational:Filter becomes "([FOLDER_ID] IS NOT NULL)" (note the two round brackets). Diffing the original and the scaffolded models yields unnecessary DROP/CREATE INDEX operations, which at first glance doesn't look correct. Wouldn't it make sense to reduce the Relational:Filter to a canonical form during scaffolding?

Further technical details

EF Core version: 2.0.0 Database Provider: Microsoft.EntityFrameworkCore.SqlServer Operating system: Windows 10 Pro IDE: Visual Studio 2017

bricelam commented 6 years ago

Unfortunately, reducing expressions returned by the database requires parsing Transact-SQL. We tried this in the past, but it was very error-prone, so we decided to avoid doing it for now.

From what I remember, the parenthesis were actually required in certain cases, so we couldn't just blindly strip them from every expression.

ajcvickers commented 6 years ago

@azabluda In addition to the explanation from @bricelam, can I ask why you are diffing the original and scaffolded models? What is the workflow that results in a need to diff these two things? Also, related, why are you reverse engineering the model from the database of Migrations are being used to update the database?

(I'm not saying these things are inherently wrong, but they seem unusual, so it would be great if we could understand the scenarios/requirements that lead to you doing these things.)

Closing this as by-design for now, but if we get additional information we could reconsider.

azabluda commented 6 years ago

@bricelam thank you for the explanation! I fully agree that parsing Transact-SQL is a pain, but the problem seems to be easier to address from another end. Let's simply take the form of Relational:Filter returned by the database for the canonical one, and just make sure the expression generated by the ModelBuilder also includes parenthesis. Alternatively one could slightly relax the diffing method by equating "(expr)" and "expr" and hence emitting a no-op.

@ajcvickers I'm currently investigating the options for those of our projects where the use of out-of-the-box Migrations is problematic. There may be various reasons, both technical and non-technical, objective and subjective, historical, etc.

The standard EF approach to database migrations is to maintain the history of increments, which may lead to rather fragile workflows if f.ex. we are not always in full control over all database schemas we are deploying to. In the past we used to have a custom solution for such environments which followed the alternative approach. Instead of applying the increments we used to compare the target datamodel against the current one to dynamically determine the migration steps. This method proved to be very reliable in general, and with some neatly designed extensions we managed to overcome most of its obvious drawbacks, e.g. the inherent inability to perform a trivial "table/column rename" migration without a lossy DROP/CREATE operation.

After spending some time studying the code of EFCore I came to a conclusion that most mechanisms to enable the alternative workflow were already there. In fact my first tests largely exceeded my own expectations (great job guys! 👍 no seriously, you're awesome!). I only observed some minor issues like #10134 and #9188, but generally I'm looking very positive into using EFCore built-in scaffolding and diffing instead of our custom tool. I also notice your team's regularly touching the topic of roundtripability of RevEng and Migrations, which sounds quite encouraging to me.

I generally agree with your 'closed-by-design' resolution. After all I can workaround the problem by applying some custom pre-processing to both models prior to diffing. Yet if the EFCore team is gradually moving towards roundtripability anyway, then I would gladly mark my workarounds [Obsolete]. 😅

ErikEJ commented 6 years ago

@azabluda Have you looked at SSDT database projects? They allow the workflow you describe (I think) ?

ajcvickers commented 6 years ago

@azabluda Thanks for the information. It will help us make decisions going forward with regards to round-tripping.

ajcvickers commented 6 years ago

Reopening so we can discuss again as a team. (No commitment to change anything at this time.)

azabluda commented 6 years ago

@ErikEJ Thanks for the hint! I need to check how we can leverage SSDT database projects while still defining the model primarily with EFCore CodeFirst. Could be a bit more challenging though to find a similar technology for our Oracle based customers... 🤔

@ajcvickers Ah, glad to hear that! It really feels like EFCore came fairly close to a very solid solution, so not doing the final touch ups would be a pity. Linking #3599.

ajcvickers commented 6 years ago

After further discussion in triage we realized that reverse engineering should recognize that "([FOLDER_ID] IS NOT NULL)" is the consequence of having modelBuilder.Entity<Folder>().HasIndex(f => new { f.Name, f.FolderId }).IsUnique(); in the code. We should then have reverse engineering generate those API calls, which in turn will make the model snapshot match the new model, even after reverse engineering.

@azabluda Do you think this would fix the issue you are seeing? It depends somewhat on exactly what you are diffing.

azabluda commented 6 years ago

@ajcvickers Your statement is certainly correct, but depending on the actual implementation it may or may not necessarily fix the original issue. At least I can imagine both implementations.

I apologize for not making my point clear from the very beginning. I actually do not reverse engineer the database into C# code, but I only scaffold the IModel using IScaffoldingModelFactory.Create.

Let us consider a simple degenerate case of my (generally broader) scenario. I start with the DbContext.Model defined in the code and I create a new database from it. If I then immediately scaffold this database using the code snippet from my original post, then I get the scaffoldedModel for which the following is valid

The idea behind all this is that I want to be able to compare an arbitrary database with my DbContext.Model and generate a list of DDL commands which would let me migrate said database to what my DbContext.Model can work with. May sound too ambitious but in practical cases the database doesn't differ too much from the expected one, so the goal is often quite achievable. However it would be very natural to expect such mechanism to be rerun-proof, so it should generate no commands if the database is already fully compatible with the model (e.g. when I've just created it from that model a minute ago).

If you wish, I can create and share with you a simple repro.

ajcvickers commented 6 years ago

@azabluda I suspect that you may have to either:

The latter may happen with the proposed fix, but like you said it depends on how it is implemented.

smitpatel commented 6 years ago

If the difference is only in Index.Filter then it could possibly work.

When building code first model, conventions/data-annotation/fluent API populates whole model to have all info required (including setting annotations). When reverse engineering the model, ScaffoldingModelFactory, creates a fully populated model which has all annotations/metadata necessary with setting annotations explicitly which would be different from conventions.

So ideally those 2 models would be same (except for filter thing we know above).

bricelam commented 6 years ago

@azabluda your scenario is very similar to feature #831. We'll strive to keep the reverse engineered model as close to the code-first model as possible while working on it.

azabluda commented 6 years ago

@ajcvickers I'm going to proceed with your first suggestion as a workaround for now. For the permanent solution I still believe it would be much easier to make the standard code first model builder include parentheses rather than to teach RevEng to strip them out. But I certainly may not see the whole picture, so I trust you find the best solution.

@bricelam Thanks for the link. I'm encouraged to see it among stretch goals of EFCore team. Although I'm still not planning to really generate the actual code from the database, I'm sure this effort will have a general positive effect on the consistency between IModel's generated by the model builder and reverse engineering.

ajcvickers commented 6 years ago

@bricelam to add a note as to why this is hard to fix.

bricelam commented 6 years ago

Today, the provider doesn't play a role in deciding whether "core" configuration would be applied by convention. We need to add additional provider APIs to enable this.