dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.63k stars 3.15k forks source link

Use column/property facets for parameter types in Update Pipeline #4134

Closed michaelpaulus closed 8 years ago

michaelpaulus commented 8 years ago

Title

HasMaxLength / HasColumnType / ForSqlServerHasColumnType don't produce parameters with that size

Functional impact

Data Types on generated parameters don't match actual sql data types and don't work as expected. This can lead to indexes not being used properly due to size mismatches between parameters and columns.

Minimal repro steps

  1. Create an EF 7 Project with Sql Server
  2. Add a model with a key and a string property
  3. Configure the model's string property to use HasMaxLength(200) or HasColumnType("nvarchar(200)") or ForSqlServerHasColumnType("nvarchar(200)")
  4. Query against the string column in the model (e.g. where stringcolumn == strinvalue)
  5. Look at sql server profiler and see that the parameter is created with nvarchar(4000) not nvarchar(200)
  6. Update the string column as savechanges
  7. Look at the sql server profile and see that the parameter is created with nvarchar(4000) not nvarchar(200)

    Expected result

Parameter is created with the size specified by HasMaxLength or HasColumnType

Actual result

Parameter is always being created with the _maxSpecificSize in Microsoft.Data.Entity.Storage.Internal.SqlServerMaxLengthMapping

Further technical details

There is a comment in the method Microsoft.Data.Entity.Storage.Internal.SqlServerMaxLengthMapping.ConfigureParameter

// For strings and byte arrays, set the max length to 8000 bytes if the data will // fit so as to avoid query cache fragmentation by setting lots of different Size // values otherwise always set to -1 (unbounded) to avoid SQL client size inference.

I don't agree with this comment if the user is going through the trouble of telling you what size to use.

ErikEJ commented 8 years ago

Do you have evidence of indexes not being used?

michaelpaulus commented 8 years ago

I'm sorry, I was going off what I learned about db design, not an actual use case. I cannot repo the issue and after research (http://stackoverflow.com/questions/14342954/ado-net-safe-to-specify-1-for-sqlparameter-size-for-all-varchar-parameters) the functional impact seems to be wrong. If the developer goes through the trouble of telling you the parameter size I think you should use it, instead of 4000 / 8000. This will still avoid query cache fragmentation but produce the query that the developer expects.

Thanks.

ErikEJ commented 8 years ago

But HasMaxLength etc does not define a parameter size, but a storage max size.

divega commented 8 years ago

@ErikEJ Exactly. Theoretically for very simple patterns that compare a parameter for equality against a column we could borrow the size of the column to create the parameter, but in the general case (i.e. for abitrary expressions) it could become harder to keep track of how they relate. Even for the simple case, what happens if the data in the parameter doesn't fit in the size of the column? Should we truncate it and potentially return wrong results? Should we instead have the smarts to short-circuit the comparison on the client knowing that the parameter could never match any value in the column?

@michaelpaulus I understand your expectation here but I believe it is not as straightforward as you may think. Rather than having a set of complicated rules to try to meet that expectation when we can, we use simpler rules that functionally work in all cases and have desirable effects on query caching on the server. By the way, these rules are also proven. We have used them in previous O/RMs.

michaelpaulus commented 8 years ago

In EF6 the parameter length was defined by the maxlength for insert / update params. This is no longer the case in EF7. The parameter length wasn't defined by the maxlength for select params in both.

EF6 Output:

exec sp_executesql N'UPDATE [dbo].[customer]
SET [Name] = @0
WHERE ([CustomerID] = @1)
',N'@0 nvarchar(200),@1 int',@0=N'T1',@1=140681

EF7 Output

exec sp_executesql N'SET NOCOUNT ON;
UPDATE [customer] SET [Name] = @p1
WHERE [CustomerID] = @p0;
SELECT @@ROWCOUNT;
',N'@p0 int,@p1 nvarchar(4000)',@p0=140681,@p1=N'T2'

I'm not necessarily asking about maxlength, although this was a departure from ef6, but more to the point I've declared the datatype explicitly using HasColumnType("nvarchar(200)") or ForSqlServerHasColumnType("nvarchar(200)") and I get nvarchar(4000).

As far as what you would do if the value passed in is greater than the length specified, you already have this logic, but for values that are greater than 4000. I would suggest let the value be the length specified unless the value is greater then go to 4000 unless the value is greater than 4000 then go to -1 (max).

The only other question I would have, does this have any affect on sql server and memory allocation? If I have a table that has 50 nvarchar(25) columns and they are all being inserted with parameters of nvarchar(4000), does this change the amount of memory sql server allocates for all of those params?

In my short testing it seems that it does. The one with nvarchar(4000) params adds an additional compute scalar step to the execution plan with a much larger row size.

Compare

exec sp_executesql N'SET NOCOUNT ON;
INSERT INTO [t1] ([T1], [T10], [T11], [T12], [T13], [T14], [T15], [T16], [T17], [T18], [T19], [T2], [T20], [T21], [T22], [T23], [T24], [T25], [T26], [T27], [T28], [T29], [T3], [T30], [T31], [T32], [T33], [T34], [T35], [T36], [T37], [T38], [T39], [T4], [T40], [T41], [T42], [T43], [T44], [T45], [T46], [T47], [T48], [T49], [T5], [T50], [T6], [T7], [T8], [T9])
VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9, @p10, @p11, @p12, @p13, @p14, @p15, @p16, @p17, @p18, @p19, @p20, @p21, @p22, @p23, @p24, @p25, @p26, @p27, @p28, @p29, @p30, @p31, @p32, @p33, @p34, @p35, @p36, @p37, @p38, @p39, @p40, @p41, @p42, @p43, @p44, @p45, @p46, @p47, @p48, @p49);
SELECT [Id]
FROM [t1]
WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();
',N'@p0 nvarchar(4000),@p1 nvarchar(4000),@p2 nvarchar(4000),@p3 nvarchar(4000),@p4 nvarchar(4000),@p5 nvarchar(4000),@p6 nvarchar(4000),@p7 nvarchar(4000),@p8 nvarchar(4000),@p9 nvarchar(4000),@p10 nvarchar(4000),@p11 nvarchar(4000),@p12 nvarchar(4000),@p13 nvarchar(4000),@p14 nvarchar(4000),@p15 nvarchar(4000),@p16 nvarchar(4000),@p17 nvarchar(4000),@p18 nvarchar(4000),@p19 nvarchar(4000),@p20 nvarchar(4000),@p21 nvarchar(4000),@p22 nvarchar(4000),@p23 nvarchar(4000),@p24 nvarchar(4000),@p25 nvarchar(4000),@p26 nvarchar(4000),@p27 nvarchar(4000),@p28 nvarchar(4000),@p29 nvarchar(4000),@p30 nvarchar(4000),@p31 nvarchar(4000),@p32 nvarchar(4000),@p33 nvarchar(4000),@p34 nvarchar(4000),@p35 nvarchar(4000),@p36 nvarchar(4000),@p37 nvarchar(4000),@p38 nvarchar(4000),@p39 nvarchar(4000),@p40 nvarchar(4000),@p41 nvarchar(4000),@p42 nvarchar(4000),@p43 nvarchar(4000),@p44 nvarchar(4000),@p45 nvarchar(4000),@p46 nvarchar(4000),@p47 nvarchar(4000),@p48 nvarchar(4000),@p49 nvarchar(4000)',@p0=N'Test',@p1=N'Test',@p2=N'Test',@p3=N'Test',@p4=N'Test',@p5=N'Test',@p6=N'Test',@p7=N'Test',@p8=N'Test',@p9=N'Test',@p10=N'Test',@p11=N'Test',@p12=N'Test',@p13=N'Test',@p14=N'Test',@p15=N'Test',@p16=N'Test',@p17=N'Test',@p18=N'Test',@p19=N'Test',@p20=N'Test',@p21=N'Test',@p22=N'Test',@p23=N'Test',@p24=N'Test',@p25=N'Test',@p26=N'Test',@p27=N'Test',@p28=N'Test',@p29=N'Test',@p30=N'Test',@p31=N'Test',@p32=N'Test',@p33=N'Test',@p34=N'Test',@p35=N'Test',@p36=N'Test',@p37=N'Test',@p38=N'Test',@p39=N'Test',@p40=N'Test',@p41=N'Test',@p42=N'Test',@p43=N'Test',@p44=N'Test',@p45=N'Test',@p46=N'Test',@p47=N'Test',@p48=N'Test',@p49=N'Test'

with

exec sp_executesql N'SET NOCOUNT ON;
INSERT INTO [t1] ([T1], [T10], [T11], [T12], [T13], [T14], [T15], [T16], [T17], [T18], [T19], [T2], [T20], [T21], [T22], [T23], [T24], [T25], [T26], [T27], [T28], [T29], [T3], [T30], [T31], [T32], [T33], [T34], [T35], [T36], [T37], [T38], [T39], [T4], [T40], [T41], [T42], [T43], [T44], [T45], [T46], [T47], [T48], [T49], [T5], [T50], [T6], [T7], [T8], [T9])
VALUES (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9, @p10, @p11, @p12, @p13, @p14, @p15, @p16, @p17, @p18, @p19, @p20, @p21, @p22, @p23, @p24, @p25, @p26, @p27, @p28, @p29, @p30, @p31, @p32, @p33, @p34, @p35, @p36, @p37, @p38, @p39, @p40, @p41, @p42, @p43, @p44, @p45, @p46, @p47, @p48, @p49);
SELECT [Id]
FROM [t1]
WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();
',N'@p0 nvarchar(25),@p1 nvarchar(25),@p2 nvarchar(25),@p3 nvarchar(25),@p4 nvarchar(25),@p5 nvarchar(25),@p6 nvarchar(25),@p7 nvarchar(25),@p8 nvarchar(25),@p9 nvarchar(25),@p10 nvarchar(25),@p11 nvarchar(25),@p12 nvarchar(25),@p13 nvarchar(25),@p14 nvarchar(25),@p15 nvarchar(25),@p16 nvarchar(25),@p17 nvarchar(25),@p18 nvarchar(25),@p19 nvarchar(25),@p20 nvarchar(25),@p21 nvarchar(25),@p22 nvarchar(25),@p23 nvarchar(25),@p24 nvarchar(25),@p25 nvarchar(25),@p26 nvarchar(25),@p27 nvarchar(25),@p28 nvarchar(25),@p29 nvarchar(25),@p30 nvarchar(25),@p31 nvarchar(25),@p32 nvarchar(25),@p33 nvarchar(25),@p34 nvarchar(25),@p35 nvarchar(25),@p36 nvarchar(25),@p37 nvarchar(25),@p38 nvarchar(25),@p39 nvarchar(25),@p40 nvarchar(25),@p41 nvarchar(25),@p42 nvarchar(25),@p43 nvarchar(25),@p44 nvarchar(25),@p45 nvarchar(25),@p46 nvarchar(25),@p47 nvarchar(25),@p48 nvarchar(25),@p49 nvarchar(25)',@p0=N'Test',@p1=N'Test',@p2=N'Test',@p3=N'Test',@p4=N'Test',@p5=N'Test',@p6=N'Test',@p7=N'Test',@p8=N'Test',@p9=N'Test',@p10=N'Test',@p11=N'Test',@p12=N'Test',@p13=N'Test',@p14=N'Test',@p15=N'Test',@p16=N'Test',@p17=N'Test',@p18=N'Test',@p19=N'Test',@p20=N'Test',@p21=N'Test',@p22=N'Test',@p23=N'Test',@p24=N'Test',@p25=N'Test',@p26=N'Test',@p27=N'Test',@p28=N'Test',@p29=N'Test',@p30=N'Test',@p31=N'Test',@p32=N'Test',@p33=N'Test',@p34=N'Test',@p35=N'Test',@p36=N'Test',@p37=N'Test',@p38=N'Test',@p39=N'Test',@p40=N'Test',@p41=N'Test',@p42=N'Test',@p43=N'Test',@p44=N'Test',@p45=N'Test',@p46=N'Test',@p47=N'Test',@p48=N'Test',@p49=N'Test'

for table

/****** Object:  Table [dbo].[t1]    Script Date: 12/19/2015 12:33:13 PM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[t1](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [t1] [nvarchar](25) NULL,
    [t2] [nvarchar](25) NULL,
    [t3] [nvarchar](25) NULL,
    [t4] [nvarchar](25) NULL,
    [t5] [nvarchar](25) NULL,
    [t6] [nvarchar](25) NULL,
    [t7] [nvarchar](25) NULL,
    [t8] [nvarchar](25) NULL,
    [t9] [nvarchar](25) NULL,
    [t10] [nvarchar](25) NULL,
    [t11] [nvarchar](25) NULL,
    [t12] [nvarchar](25) NULL,
    [t13] [nvarchar](25) NULL,
    [t14] [nvarchar](25) NULL,
    [t15] [nvarchar](25) NULL,
    [t16] [nvarchar](25) NULL,
    [t17] [nvarchar](25) NULL,
    [t18] [nvarchar](25) NULL,
    [t19] [nvarchar](25) NULL,
    [t20] [nvarchar](25) NULL,
    [t21] [nvarchar](25) NULL,
    [t22] [nvarchar](25) NULL,
    [t23] [nvarchar](25) NULL,
    [t24] [nvarchar](25) NULL,
    [t25] [nvarchar](25) NULL,
    [t26] [nvarchar](25) NULL,
    [t27] [nvarchar](25) NULL,
    [t28] [nvarchar](25) NULL,
    [t29] [nvarchar](25) NULL,
    [t30] [nvarchar](25) NULL,
    [t31] [nvarchar](25) NULL,
    [t32] [nvarchar](25) NULL,
    [t33] [nvarchar](25) NULL,
    [t34] [nvarchar](25) NULL,
    [t35] [nvarchar](25) NULL,
    [t36] [nvarchar](25) NULL,
    [t37] [nvarchar](25) NULL,
    [t38] [nvarchar](25) NULL,
    [t39] [nvarchar](25) NULL,
    [t40] [nvarchar](25) NULL,
    [t41] [nvarchar](25) NULL,
    [t42] [nvarchar](25) NULL,
    [t43] [nvarchar](25) NULL,
    [t44] [nvarchar](25) NULL,
    [t45] [nvarchar](25) NULL,
    [t46] [nvarchar](25) NULL,
    [t47] [nvarchar](25) NULL,
    [t48] [nvarchar](25) NULL,
    [t49] [nvarchar](25) NULL,
    [t50] [nvarchar](25) NULL,
 CONSTRAINT [PK_t1] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

Thanks again for reviewing this.

ErikEJ commented 8 years ago

So the issue is not indexes not being used, but potential increased memory consumption server side in the update pipeline?

michaelpaulus commented 8 years ago

The issue is HasMaxLength / HasColumnType / ForSqlServerHasColumnType don't produce parameters with that size. In select parameters this is the same as EF6. In Insert / Update Parameters this is different then EF6. I pointed out both issues in the repo steps 4/5 select, 6/7 update. I'm not sure this is an issue or not, but defiantly not expected behavior from the developers perspective. The functional impact I originally wrote is wrong.

I'm just bringing up the point of the insert / update being different then EF6 and asking the question if this has an impact on server memory. It seems to have an issue but I'm no expert in this area.

Thanks.

divega commented 8 years ago

@michaelpaulus thanks for clarifying that you concern is about the update pipeline. I missed that detail in your original post. I agree that this could have memory and perf implications on the server that we may need to take into account. I also agree that by construction parameters generated in the update pipeline have a more direct correspondence to columns.

ErikEJ commented 8 years ago

@divega I see improved performance (less CPU and a couple of ms faster response) by using the parameter optimized query, and given your focus on few CPU cycles and fast response times, this might be worth persuing. The overhead is due to: CONVERT_IMPLICIT(nvarchar(25),[@p2],0) for each parameter.

divega commented 8 years ago

Clearing milestone because I would like to chat about this one in triage again (I probably missed the discussion while I was out).

rowanmiller commented 8 years ago

assigning to @ajcvickers to see if there are any easy wins for RC2 here

divega commented 8 years ago

We should consider borrowing the unicodeness of the column as well as the length. See #4608.

ajcvickers commented 8 years ago

Marking for re-triage. No priority has been set.

divega commented 8 years ago

BTW, regarding my previous comment on borrowing both unicodeness and length, as per https://github.com/aspnet/EntityFramework/issues/4949#issuecomment-208605611, it seems that we area already doing the right thing for the unicodeness.