mtanneryd / ef-bulk-operations

Bulk operations for Entity Framework 6
Apache License 2.0
80 stars 30 forks source link

BulkInsertAll "recursive" entities with Many-to-Many relationships seems to repeatedly insert into related entity table. #13

Closed Steven-FlexTechs closed 5 years ago

Steven-FlexTechs commented 5 years ago

Basically when running a Sql Profiler I saw SQL that was inserting into my UnitAmenities over and over. The main entity I am trying to insert is called Units. I also have a join table called Units_UnitAmenities. Primary keys are all Guids.

Here is the SQL that was running over and over. I was hoping the recursion would insert them all at once (bulk) like it normally does on Units without recursion. INSERT INTO [dbo].[UnitAmenities] ([UnitAmenityId],[Name]) SELECT [UnitAmenityId],[Name] FROM tempdb..#b249e294eb5e413e8561a4faf2a3cd9f AS [t0] WHERE NOT EXISTS ( SELECT [UnitAmenityId] FROM [dbo].[UnitAmenities] AS [t1] WHERE [t0].[UnitAmenityId] = [t1].[UnitAmenityId] )

mtanneryd commented 5 years ago

Thanks for the feedback. Interesting error. I have successfully bulk inserted whole hierarchies with join tables but apparently there is still a way to make that not work so well. If you could provide some sample code illustrating what you are doing and how you have configured your entities that would be a huge help.

mtanneryd commented 5 years ago

Oh. Also. I actually fixed something that might be related to your problem recently. If you have not already, please try the latest beta and see if the error is still there.

Steven-FlexTechs commented 5 years ago

I did try out both betas. The following strange error only occurs with the betas in my case:

threw exception: 
Microsoft.CSharp.RuntimeBinder.RuntimeBinderException: Operator '==' cannot be applied to operands of type 'System.Guid' and 'int'
    at CallSite.Target(Closure , CallSite , Object , Int32 )
   at System.Dynamic.UpdateDelegates.UpdateAndExecute2[T0,T1,TRet](CallSite site, T0 arg0, T1 arg1)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkInsertAll(DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, Dictionary`2 savedEntities, Dictionary`2 mappingsByType, BulkInsertResponse response)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, BulkInsertRequest`1 request)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive)

My project is a bit large. I hope I can find time to create some to create some simplified examples.

mtanneryd commented 5 years ago

I have replicated the error. Will have a look at it this weekend.

mtanneryd commented 5 years ago

There is a new beta release, 1.2.3-beta3. It should do the trick for you. Let me know if the problem persists or if this resolves your issue.

Steven-FlexTechs commented 5 years ago

Thank you for the revisions! I gave it a shot and got a different error when trying with recursive.

threw exception: 
System.Data.SqlClient.SqlException: An object or column name is missing or empty. For SELECT INTO statements, verify each column has a name. For other statements, look for empty alias names. Aliases defined as "" or [] are not allowed. Change the alias to a valid name.
Database name 'tempdb' ignored, referencing object in tempdb.
Database name 'tempdb' ignored, referencing object in tempdb.
    at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean async, Int32 timeout, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
   at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.CreateTempTable(SqlConnection connection, SqlTransaction transaction, TableName tableName, String[] columnNames, Boolean includeRowNumber)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.FillTempTable(SqlConnection conn, IList entities, TableName tableName, Dictionary`2 columnMappings, TableColumnMapping[] keyColumnMappings, TableColumnMapping[] nonKeyColumnMappings, SqlTransaction sqlTransaction)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkCopy(DbContext ctx, IList entities, Type t, Mappings mappings, SqlTransaction transaction, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, BulkInsertResponse response)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkInsertAll(DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, Dictionary`2 savedEntities, Dictionary`2 mappingsByType, BulkInsertResponse response)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, BulkInsertRequest`1 request)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive)

Quick notes about my setup:

mtanneryd commented 5 years ago

Would it be possible for you to send me the class definitions and the EF Configuration? You can email directly to mans@tanneryd.se. Do you have any nullable properties not configured as optional?

Steven-FlexTechs commented 5 years ago

I made a quick unit test project and database project that recreates my last mentioned error. I am not sure about email file size limitations but, looks like it fits on the github upload. I am using EF6 database first approach so I am not sure about your 'configure as optional' question.

TestingRecursiveBulkInsert.zip

Restating the error:

threw exception: 
System.Data.SqlClient.SqlException: An object or column name is missing or empty. For SELECT INTO statements, verify each column has a name. For other statements, look for empty alias names. Aliases defined as "" or [] are not allowed. Change the alias to a valid name.
Database name 'tempdb' ignored, referencing object in tempdb.
Database name 'tempdb' ignored, referencing object in tempdb.
    at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean async, Int32 timeout, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
   at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.CreateTempTable(SqlConnection connection, SqlTransaction transaction, TableName tableName, String[] columnNames, Boolean includeRowNumber)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.FillTempTable(SqlConnection conn, IList entities, TableName tableName, Dictionary`2 columnMappings, TableColumnMapping[] keyColumnMappings, TableColumnMapping[] nonKeyColumnMappings, SqlTransaction sqlTransaction)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkCopy(DbContext ctx, IList entities, Type t, Mappings mappings, SqlTransaction transaction, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, BulkInsertResponse response)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkInsertAll(DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, Dictionary`2 savedEntities, Dictionary`2 mappingsByType, BulkInsertResponse response)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, BulkInsertRequest`1 request)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive)
mtanneryd commented 5 years ago

I managed to replicate your error. I have always used DbContext with POCO. Basically using the code first approach even though I initialize the EF model from an existing database. I prefer to use db independent objects and put all configuration in a separate place, thus using the EF fluent API. That way I can control exactly how the EF model relates to the database. My package depends on the information held by the EF DbContext and it seems as if mapping data is stored slightly different when using the .edmx way of doing things.

I'll debug using your test and see if I can figure out how to extract the data I need given your way of using the DbContext and EF.

Thanks for your valuable feedback!

/Måns

Måns Tånneryd Tånneryd IT AB Barrskogsvägen 19 186 53 Vallentuna +46-705140093 https://se.linkedin.com/in/manstanneryd https://www.facebook.com/matkollen

Den sön 3 feb. 2019 kl 07:25 skrev Steven notifications@github.com:

I made a quick unit test project and database project that recreates my last mentioned error. I am not sure about email file size limitations but, looks like it fits on the github upload. I am using EF6 database first approach so I am not sure about your 'configure as optional' question.

TestingRecursiveBulkInsert.zip https://github.com/mtanneryd/ef6-bulk-operations/files/2825237/TestingRecursiveBulkInsert.zip

Restating the error:

threw exception: System.Data.SqlClient.SqlException: An object or column name is missing or empty. For SELECT INTO statements, verify each column has a name. For other statements, look for empty alias names. Aliases defined as "" or [] are not allowed. Change the alias to a valid name. Database name 'tempdb' ignored, referencing object in tempdb. Database name 'tempdb' ignored, referencing object in tempdb. at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean async, Int32 timeout, Boolean asyncWrite) at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.ExecuteNonQuery() at Tanneryd.BulkOperations.EF6.DbContextExtensions.CreateTempTable(SqlConnection connection, SqlTransaction transaction, TableName tableName, String[] columnNames, Boolean includeRowNumber) at Tanneryd.BulkOperations.EF6.DbContextExtensions.FillTempTable(SqlConnection conn, IList entities, TableName tableName, Dictionary2 columnMappings, TableColumnMapping[] keyColumnMappings, TableColumnMapping[] nonKeyColumnMappings, SqlTransaction sqlTransaction) at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkCopy(DbContext ctx, IList entities, Type t, Mappings mappings, SqlTransaction transaction, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, BulkInsertResponse response) at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkInsertAll(DbContext ctx, IList1 entities, SqlTransaction transaction, Boolean recursive, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, Dictionary2 savedEntities, Dictionary2 mappingsByType, BulkInsertResponse response) at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, BulkInsertRequest1 request) at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/mtanneryd/ef6-bulk-operations/issues/13#issuecomment-460026904, or mute the thread https://github.com/notifications/unsubscribe-auth/ADQZRannxHP4q0nsUf-LAaGzWpF83Zlhks5vJoDFgaJpZM4afG2P .

mtanneryd commented 5 years ago

Hi!

I solved the immediate problem. For some reason the join table table name was not in the same place as for non edmx configs but I got that fixed (no new beta yet though). However. There is another bug related to having Guid primary keys that are generated in code rather than by the database for join tables and the tables they "join". An interesting and very useful setup you have! :-) You managed to punch holes in my little program just where it hurts!

I'll get back to you as soon as I have fixed it.

/Måns

Den sön 3 feb. 2019 kl 07:25 skrev Steven notifications@github.com:

I made a quick unit test project and database project that recreates my last mentioned error. I am not sure about email file size limitations but, looks like it fits on the github upload. I am using EF6 database first approach so I am not sure about your 'configure as optional' question.

TestingRecursiveBulkInsert.zip https://github.com/mtanneryd/ef6-bulk-operations/files/2825237/TestingRecursiveBulkInsert.zip

Restating the error:

threw exception: System.Data.SqlClient.SqlException: An object or column name is missing or empty. For SELECT INTO statements, verify each column has a name. For other statements, look for empty alias names. Aliases defined as "" or [] are not allowed. Change the alias to a valid name. Database name 'tempdb' ignored, referencing object in tempdb. Database name 'tempdb' ignored, referencing object in tempdb. at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean async, Int32 timeout, Boolean asyncWrite) at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.ExecuteNonQuery() at Tanneryd.BulkOperations.EF6.DbContextExtensions.CreateTempTable(SqlConnection connection, SqlTransaction transaction, TableName tableName, String[] columnNames, Boolean includeRowNumber) at Tanneryd.BulkOperations.EF6.DbContextExtensions.FillTempTable(SqlConnection conn, IList entities, TableName tableName, Dictionary2 columnMappings, TableColumnMapping[] keyColumnMappings, TableColumnMapping[] nonKeyColumnMappings, SqlTransaction sqlTransaction) at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkCopy(DbContext ctx, IList entities, Type t, Mappings mappings, SqlTransaction transaction, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, BulkInsertResponse response) at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkInsertAll(DbContext ctx, IList1 entities, SqlTransaction transaction, Boolean recursive, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, Dictionary2 savedEntities, Dictionary2 mappingsByType, BulkInsertResponse response) at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, BulkInsertRequest1 request) at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive)

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/mtanneryd/ef6-bulk-operations/issues/13#issuecomment-460026904, or mute the thread https://github.com/notifications/unsubscribe-auth/ADQZRannxHP4q0nsUf-LAaGzWpF83Zlhks5vJoDFgaJpZM4afG2P .

Steven-FlexTechs commented 5 years ago

I ran a quick test on 1.2.3-beta4 with my sample project posted previously.

I found that it only worked without the transaction. When I ran with the transaction that I had in the upload I get the following error:

Test method TestingRecursiveBulkInsert.BulkRecusiveTest.TestMethod1 threw exception: 
System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.InvalidOperationException: ExecuteNonQuery requires the command to have a transaction when the connection assigned to the command is in a pending local transaction.  The Transaction property of the command has not been initialized.
    at System.Data.SqlClient.SqlCommand.ValidateCommand(String method, Boolean async)
   at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
   at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.CreateTempTable(SqlConnection connection, SqlTransaction transaction, TableName tableName, String[] columnNames, IncludeRowNumber includeRowNumber)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkSelectNotExisting[T1,T2](DbContext ctx, BulkSelectRequest`1 request)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkSelectNotExisting[T1,T2](DbContext ctx, BulkSelectRequest`1 request)
--- End of inner exception stack trace ---
    at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
   at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkSelectNotExisting(DbContext ctx, Type t, IList entities, TableColumnMapping[] pkColumnMappings)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkCopy(DbContext ctx, IList entities, Type t, Mappings mappings, SqlTransaction transaction, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, BulkInsertResponse response)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.DoBulkInsertAll(DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive, Boolean allowNotNullSelfReferences, TimeSpan commandTimeout, Dictionary`2 savedEntities, Dictionary`2 mappingsByType, BulkInsertResponse response)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, BulkInsertRequest`1 request)
   at Tanneryd.BulkOperations.EF6.DbContextExtensions.BulkInsertAll[T](DbContext ctx, IList`1 entities, SqlTransaction transaction, Boolean recursive)
   at TestingRecursiveBulkInsert.BulkRecusiveTest.TestMethod1() in C:\Users\StevenLyon\source\repos\TestingRecursiveBulkInsert\TestingRecursiveBulkInsert\BulkRecusiveTest.cs:line 64

I also noticed BulkInsertAll() stopped Updating existing rows for me. Could that be an added option?

mtanneryd commented 5 years ago

Sorry about that. I forgot to pass on the transaction when calling the new BulkSelectNotExisting using runtime binding to generic methods. Tested it with your project and it works now. About the updating of existing rows I think that was an unintentional side effect that might have gone now. Implementing it for real has been on my backlog for a while. Adding it as an option would be the right way to go. I'll look into it. There is yet another beta available now. Sorry about the poor quality of my updates and thanks a lot for helping me out finding bugs!

Steven-FlexTechs commented 5 years ago

Thanks so much for the updates. I did have trouble with the Recursive option. I scaled up my test to around 10,000 "Units" and 25 unique "UnitAmenities" joined in. I found that the completion time came to a crawl and my sql profile was repeating the following highlighted section tons of times. (Attached screenshot)

bulkinsertrecusive20190210

mtanneryd commented 5 years ago

Very interesting. Thanks for your feedback! I really appreciate it.

mtanneryd commented 5 years ago

Good catch! Usually when we recursively insert entities with fk relationships we're being reasonably smart about it. But, for join tables the situation is a little different and as it turns out we're not being smart at all from a performance perspective. I found the problem and will fix it as soon as I can.

mtanneryd commented 5 years ago

There is another beta out 1.2.3-beta6. I have not been able to test it enough yet but initial tests indicate that it solves the performance problem, at least. If you have the time, feel free to check it out.

mtanneryd commented 5 years ago

As far as I can tell this now works. Please repoen the issue if the problems reappear.