dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.61k stars 3.14k forks source link

Set operations over non-entity projections with different facets #19129

Closed yepeekai closed 1 year ago

yepeekai commented 4 years ago

When doing a Union for the same result class but coming from different entity sources or with default values, it fails because of: InvalidOperationException: Set operations over different store types are currently unsupported even when types are identical except for maxLength attribute...

if (innerColumn1.TypeMapping.StoreType != innerColumn2.TypeMapping.StoreType) will fail for nvarchar(100) on one side and nvarchar(max) on the other side...

If I have a nullable string field with maxLength of 100. I can't do a union to something that has different maxlength or default value since default value will be interpreted as nvarchar(max)

I understand that this check is there to prevent conversion from one type to another that may produce unexpected results but I don't think it is the case here since both ends are nvarchar. There is an issue(16298) that tries to solve this in a smart way, but I think it will take some time before it gets released...

I think in the meantime (a a patch) that EF should allow different nvarchar(x) to match where x can vary.

Here is what I am trying to do:

mydbsetA.Select(a => new MyDto { Id = a.Id, Name = a.Name }) //maxLength(100)=>nvarchar(100)
.Union(mydbsetB.Select(b => new MyDto {Id = b.Id, Name = null })) //default to nvarchar(max)
.Union(mydbsetC.Select(c => new MyDto {Id = c.Id, Name = c.Name })) //maxLength(25)=>nvarchar(25)

Further technical details

EF Core version: 3.1 preview 3 Database provider: Microsoft.EntityFrameworkCore.SqlServer Target framework: .NET Core 3.0 Operating system: Windows 7 IDE: Visual Studio 2019 16.3.10

roji commented 4 years ago

16298 is about set operations over different entity types, whereas you are trying to perform a set operation over non-entity projections; the two are different problems, so we can use this issue to track this.

The tricky part of this is to understand database behavior around the differing facets. For example, in SQL Server a set operation over nvarchar(50) and nvarchar(100) seems to yield a type of nvarchar(200) (odd...), we'd have to investigate thoroughly and across databases to see exactly what should be done. Based on the results of that investigation, supporting this could be easy or hard, and therefore may or may not be appropriate for a 3.1 patch - but it's unlikely we'll get around to this anytime soon.

SELECT SQL_VARIANT_PROPERTY((SELECT CAST('foo' AS nvarchar(50)) UNION SELECT CAST('foo' AS nvarchar(100))), 'MaxLength'); -- Result is 200

Note that specifying store types explicitly (#4978) could help here, as you could manually up-cast one side to match the other, will add a note there.

roji commented 4 years ago

Of course, you can use client evaluation to temporarily work around this (by placing AsEnumerable appropriately).

yepeekai commented 4 years ago

Thanks for all the explanations. Not sure why you closed it since it is not related to what I suggested and you said

we can use this issue to track this"

Unfortunately, client evaluation isn't something I want to do in this case because it is a paged source (I only retreive 25 records out of thousands). 4978 is a promising workaround but it is in the backlog :(

yepeekai commented 4 years ago

One additional note: I don't know if it makes sense, but I think EF should use the type of the destination instead of the source in this kind of situation. I am selecting this to a Dto without any kind of length restriction.

roji commented 4 years ago

Sorry for closing, that was a mistake.

Depending on what exactly you're doing, you may be able to get away with adding limits (Take) on both sides of the set operation, and then pulling that and client-evaluating it - but that won't work if you have to apply the limit on the result of the set operation.

I don't know if it makes sense, but I think EF should use the type of the destination instead of the source in this kind of situation. I am selecting this to a Dto without any kind of length restriction.

I understand the logic, but things don't quite work like in the pipeline. The database has a specific, well-defined behavior when typing resulting columns of set operations, and it doesn't depend on what you're going to do with it; we need to understand and match that in EF Core. Also, it wouldn't be good for this to work only if you happen to select this to a DTO without a length restriction - a solution here would make it work in the general case.

smitpatel commented 4 years ago

Related #15586

ViRuSTriNiTy commented 4 years ago

@yepeekai I'm facing this issue too. You can use Convert.ToString(Name) to fool the logic and avoid that unnecessary exception ^^

krajek commented 4 years ago

@ViRuSTriNiTy thanks, Convert.ToString worked in my case as well.

AlbertoMonteiro commented 4 years ago

@ViRuSTriNiTy you are my hero, this is awesome man, worked flawlessly to me too

igorfarias30 commented 4 years ago

Thanks, @ViRuSTriNiTy !!

rruntex commented 3 years ago

But have you seen the monster it creates on execution plan? image

bhupenparikh commented 3 years ago

Thanks i had similar problem and though stack overflow i found link and finally able to get done trough any how with the suggesting above by convert.tostring thanks guys

roji commented 3 years ago

Implementation note: see specifically the case of a set operation over a column and a constant - the constant mapping should maybe be inferred from the column (see #24707).

inforithmics commented 2 years ago

The same problem exists for Nullable DateTime fields. When you assign null or a field My work around was this.

NullableDate = f.Date

Doesn't work NullableDate = null // this is not equal for ef core

Workaround: NullableDate = f .Id != null ? null : f.Date, // pseudo logic to have the same data types for ef core evaluates to null

Barbanyaga commented 2 years ago

For enum my workaround is:

.Select(guestPass => new PassTileInfo
    {
...
        GuestPassStatus = (PassStatus)(int)guestPass.Status,
...
    });
endlessxaura commented 2 years ago

I'm pretty sure the Convert.ToString works because it forces client evaluation, so when it performs the union, it's already evaluated both queries.

eluchsinger commented 2 years ago

Can you make an example of how and where to use the Convert.ToString() method?

DreamU1952 commented 2 years ago

Re: 'The tricky part of this is to understand database behavior around the differing facets.' Why did you not just leave/reuse/migrate the logic from Entity Framework since v5.0 performed this query just fine? Please stop removing functionality that worked in previous versions. Fortunately I have only a few instances of this query to fix in EF core but I had hoped we delayed migrating to EF core long enough that it would be mature by now.

TheConstructor commented 2 years ago

@roji

For example, in SQL Server a set operation over nvarchar(50) and nvarchar(100) seems to yield a type of nvarchar(200) (odd...)

SQL Server regularly reports the maximum byte-length for nvarchar, which is twice what you give as max-length.

select SQL_VARIANT_PROPERTY(CAST('foo' AS nvarchar(50)), 'MaxLength'); --yields 100
select SQL_VARIANT_PROPERTY(CAST('foo' AS nvarchar(100)), 'MaxLength'); --yields 200

-> https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=bf89ce29d587928cd242e98e7a51898f

So logic here is quite, what you would expect, but metadata for nvarchar is always *2.

As @DreamU1952 mentions this is a PITA. Providing an option (EF.Function?) to take a certain type to CAST/CONVERT to should already enable the user to solve this (albeit in a quite involved way).

roji commented 1 year ago

@TheConstructor sorry for not answering your comment earlier, that makes sense.

It may be safe to allow the same type with differing facets on both sides, and assume that the larger value wins; so allow nvarchar(100) and nvarchar(max), with the result being nvarchar(max). It may also be safe to do this for non-numeric facets (e.g. non-unicode + unicode = unicode, fixed length + non-fixed length = non-fixed length...). Of course, we'd need to test various scenarios (nvarchar, decimal, datetime) across databases to confirm this assumption.

That would still leave the case of different types on both sides, e.g. xml and nvarchar; these may either be implicitly convertible, or require an explicit cast. To know this we'd need to do https://github.com/dotnet/efcore/issues/15586, and I'm not sure it's a good idea for us to add an explicit cast ourselves, as opposed to the user specifying it (which would require EF.StoreType).

But handling the same-type-different-facets problem seems like it would take care of the majority case.

smitpatel commented 1 year ago

Also see #14179 which is about same type but different facets (albeit in slightly different form)

roji commented 1 year ago

One more thought... Specifically for the case where there's a constant/parameter on one side of the set operation (https://github.com/dotnet/efcore/issues/19129#issuecomment-824148525), we should probably be inferring the type mapping from the other side like we do e.g. with binary operations. This doesn't require any type compatibility charts, and would probably cover quite a few user scenarios.

Opened #29081 to track this.

stevendarby commented 1 year ago

I'm affected by this issue and have up-voted. I really think fixing this is key for operations like Union and Concat to be useful and powerful in EF Core. Thanks for the workarounds given.

One related question: in a DB-first scenario, what functional impact does specifying the max length on string properties have? I'm just wondering if I can fix most issues stemming from this in my solution by removing max length from the entity configurations, so that EF thinks all strings are nvarchar(max), even if not actually that in the DB.

roji commented 1 year ago

@stevendarby max length can be important for validation, i.e. making sure values don't exceed a certain length (but note that in SQL Server the length is in bytes, which may make it less useful for that). More importantly, SQL Server also doesn't allow indexes over nvarchar(max), which is a good reason to specify a facet. But at least in the PostgreSQL world, the facets indeed aren't that useful I'd say.

stevendarby commented 1 year ago

@roji unless I misunderstood, those are reasons to specify lengths in SQL Server. Being DB-first and managing schema manually, I will still do that. But seems that not telling EF about the lengths means it won’t have this problem with Set operations.

roji commented 1 year ago

Oh I see, I didn't understand your question. Yeah, if you're not using EF to create your database, a reasonable workaround may be to let EF think that the columns are simply nvarchar(max).

SoftCircuits commented 1 year ago

Is this caused by the same issue?

System.InvalidOperationException
  HResult=0x80131509
  Message=Unable to translate set operation when matching columns on both sides have different store types.
  Source=Microsoft.EntityFrameworkCore.Relational
  StackTrace:
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression.ApplySetOperation(SetOperationType setOperationType, SelectExpression select2, Boolean distinct)
   at Microsoft.EntityFrameworkCore.Query.SqlExpressions.SelectExpression.ApplyUnion(SelectExpression source2, Boolean distinct)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.TranslateUnion(ShapedQueryExpression source1, ShapedQueryExpression source2)
   at Microsoft.EntityFrameworkCore.Query.QueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.QueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.QueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.QueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.RelationalQueryableMethodTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
   at Microsoft.EntityFrameworkCore.Query.QueryCompilationContext.CreateQueryExecutor[TResult](Expression query)
   at Microsoft.EntityFrameworkCore.Storage.Database.CompileQuery[TResult](Expression query, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.CompileQueryCore[TResult](IDatabase database, Expression query, IModel model, Boolean async)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.<>c__DisplayClass12_0`1.<ExecuteAsync>b__0()
   at Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddQuery[TResult](Object cacheKey, Func`1 compiler)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.ExecuteAsync[TResult](Expression query, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider.ExecuteAsync[TResult](Expression expression, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ExecuteAsync[TSource,TResult](MethodInfo operatorMethodInfo, IQueryable`1 source, Expression expression, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ExecuteAsync[TSource,TResult](MethodInfo operatorMethodInfo, IQueryable`1 source, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.CountAsync[TSource](IQueryable`1 source, CancellationToken cancellationToken)
   at RailtraxCore.Pager.<ApplyFilterAsync>d__23`1.MoveNext() in D:\Users\jwood\source\repos\Railtrax\RailtraxCore\Pager.cs:line 117

Interestingly, I was able to get my issue work by running some string values through Convert.ToString(). I wish someone could explain what Convert.ToString() is doing that makes the difference. See my question on Stackoverflow.

ajcvickers commented 1 year ago

Is this caused by the same issue?

Looks like it.

I wish someone could explain what Convert.ToString() is doing that makes the difference.

Because EF Core loses track of the mapping, and so doesn't know that the two columns have different facets.