mtanneryd / ef6-bulk-operations

Bulk operations for Entity Framework 6
Apache License 2.0

Exponential slowdown on parallel executions #33

Open Hugibeer opened 4 years ago

Hugibeer commented 4 years ago

Hello.

We are using this library for bulk insert operations in our .NET 4.7 application. We noticed that it slows down when multiple users run it at the same time, so I wrote a small application that inserts millions of entries in chunks of 4000 items. You can find it in this repo: https://github.com/Hugibeer/TannerydSample

I was shocked to see how slow the bulk insert operation is when it runs on multiple threads. In https://github.com/Hugibeer/TannerydSample/tree/master/Measurements you can see that, when this operation runs on 10 threads, execution time per thread rises to about 235 seconds. It seems something is seriously throttling these operations.

I ran 1000 iterations of the SqlBulkCopy and SQL procedure inserts; the measurements, with a rudimentary analysis, can be read here: https://github.com/Hugibeer/TannerydSample/tree/master/AutomatedMeasurements I am currently running 10 iterations of only the ef6 bulk operations and will add those files to the AutomatedMeasurements folder in the repo.

Just to be clear, this isn't a "hate post"; I only want to bring these details to your attention.

Thank you

Hugibeer commented 4 years ago

For comparison, on the test PC I am using, SqlBulkCopy and the SQL procedure both average around 30 seconds in 10-thread parallel runs, while ef6 bulk operations average 113 seconds.

mtanneryd commented 4 years ago

Hi!

Thanks for the feedback. I'll have a look as soon as I can. I'm currently trying to finalize the next release, so I'm a bit busy, but I'll be happy to go through your info as soon as I can. Again, thanks!



Hugibeer commented 4 years ago

If you need any input from me, please let me know. Thank you

Hugibeer commented 4 years ago

It is clear to me why the library is slower than the SqlBulkCopy and SQL procedure approaches: it does two DB roundtrips, one to insert into a temp table and another to insert into the real table and hydrate the database-generated IDs. I am using a slightly older version of the library in the repository; I tried updating it, but still got the same performance.
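
For readers unfamiliar with the two-roundtrip pattern mentioned here, the following is a minimal sketch in plain ADO.NET. It is not the library's actual implementation, only an illustration of the general technique. The table `dbo.Item`, the `#Staging` temp table, and the `InsertAndGetIds` helper are hypothetical names, and the sketch assumes an already open `SqlConnection` to SQL Server.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class TwoRoundtripInsertSketch
{
    // Assumed (hypothetical) target table:
    //   CREATE TABLE dbo.Item (Id INT IDENTITY PRIMARY KEY, Name NVARCHAR(100) NOT NULL);
    // Returns a map from the position of each name in the input list
    // to the identity value the database generated for it.
    public static IDictionary<int, int> InsertAndGetIds(SqlConnection conn, IList<string> names)
    {
        // Build the rows in memory, tagging each with its position so the
        // generated IDs can be matched back to the input afterwards.
        var staged = new DataTable();
        staged.Columns.Add("RowNo", typeof(int));
        staged.Columns.Add("Name", typeof(string));
        for (var i = 0; i < names.Count; i++)
        {
            staged.Rows.Add(i, names[i]);
        }

        // Session-scoped temp table; it stays visible to later commands
        // on the same open connection.
        const string createSql =
            "CREATE TABLE #Staging (RowNo INT NOT NULL, Name NVARCHAR(100) NOT NULL);";
        using (var create = new SqlCommand(createSql, conn))
        {
            create.ExecuteNonQuery();
        }

        // Roundtrip 1: bulk copy the data into the temp table.
        using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "#Staging" })
        {
            bulk.WriteToServer(staged);
        }

        // Roundtrip 2: move the rows into the real table and read back the
        // generated IDs. MERGE is used because its OUTPUT clause may reference
        // source columns, which a plain INSERT ... OUTPUT cannot do.
        const string mergeSql = @"
MERGE INTO dbo.Item AS target
USING #Staging AS source ON 1 = 0
WHEN NOT MATCHED THEN INSERT (Name) VALUES (source.Name)
OUTPUT source.RowNo, inserted.Id;
DROP TABLE #Staging;";

        var ids = new Dictionary<int, int>();
        using (var merge = new SqlCommand(mergeSql, conn))
        using (var reader = merge.ExecuteReader())
        {
            while (reader.Read())
            {
                ids[reader.GetInt32(0)] = reader.GetInt32(1);
            }
        }
        return ids;
    }
}
```

The second step is the extra cost compared with a bare SqlBulkCopy into the real table, which writes the rows but gives no way to tell which generated ID belongs to which input row.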

mtanneryd commented 4 years ago

The double roundtrip (using the temp table) is unfortunately required in order to safely retrieve the generated primary keys. If you do not need to retrieve these in the bulk insert you can disable this in the request. Use the enum value EnableRecursiveInsert.NoAndIgnoreGeneratedPrimaryKeys for the property EnableRecursiveInsert in BulkInsertRequest.
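
A minimal sketch of what that setting might look like in code follows. The names `BulkInsertRequest`, `EnableRecursiveInsert`, and `NoAndIgnoreGeneratedPrimaryKeys` come from the comment above; the `Entities` property, the `BulkInsertAll` extension method, and the two `using` namespaces are assumptions about the library's API that should be checked against the installed version. The `Item` entity and `SampleContext` are hypothetical and exist only to make the sketch complete.

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using Tanneryd.BulkOperations.EF6;        // assumed namespace of the BulkInsertAll extension
using Tanneryd.BulkOperations.EF6.Model;  // assumed namespace of BulkInsertRequest

// Hypothetical entity and context, included only for illustration.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class SampleContext : DbContext
{
    public DbSet<Item> Items { get; set; }
}

public static class BulkInsertWithoutKeys
{
    public static void Insert(SampleContext db, IList<Item> items)
    {
        var request = new BulkInsertRequest<Item>
        {
            Entities = items, // assumed property name for the rows to insert
            // Do not read the database-generated primary keys back.
            EnableRecursiveInsert = EnableRecursiveInsert.NoAndIgnoreGeneratedPrimaryKeys
        };

        db.BulkInsertAll(request); // assumed extension method on DbContext
    }
}
```

With this setting the `Id` values on the in-memory entities should not be relied on after the call, since the generated keys are deliberately not retrieved.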

