Closed MaklaCof closed 2 years ago
You can do that with the linq2db extension right now if you can't wait for an EF Core implementation.
With this extension you will be able to insert the results of any query into any table, even a temporary one, and then reuse that table in a LINQ query. And for reporting you will quickly find that you need more SQL extensions, which linq2db provides out of the box.
@MaklaCof When you say, "In my case, the factor between SaveChanges and BulkSaveChanges is even greater then," can you provide some details? What have you tested and what were the results?
@MaklaCof Also, can you post the version of EF Core that you are using, and the database provider that you are using?
Of course. I will need a day or two. I am on a tight schedule until Wednesday.
Off the top of my head, the version was 2.1 (currently I am on 2.2), and the provider was SQL Server Express 2016 (currently I use SQL Server Developer 2017).
Hi,
I made a demo in which the ratio is not as large, but the case is simple and repeatable. I think these numbers already indicate that there is room for improvement in performance.
The difference I mentioned occurs when the table has 20-50 columns and several related tables, not just one as in the demo.
RC: Record count
SC: SaveChanges time
BSC: BulkSaveChanges time
F: Factor between SaveChanges and BulkSaveChanges (SC / BSC)
DSC: SaveChanges time with one detail table
DBSC: BulkSaveChanges time with one detail table
DF: Factor between SaveChanges and BulkSaveChanges with one detail table (DSC / DBSC)
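| RC | SC | BSC | F | DSC | DBSC | DF |
|---|---|---|---|---|---|---|
| 10 | 0.1s | 0.1s | 0.5 | 0.1s | 0.1s | 0.4 |
| 100 | 0.0s | 0.1s | 0.3 | 1.0s | 0.4s | 2.9 |
| 1000 | 0.5s | 0.2s | 3.2 | 10.8s | 2.6s | 4.2 |
| 10000 | 5.2s | 1.3s | 4.1 | 106.8s | 22.7s | 4.7 |
| 100000 | 51.5s | 11.5s | 4.5 | / | / | / |
| 1000000 | 514.5s | 129.2s | 4.0 | / | / | / |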
As we can see, with 10000 rows, BulkSaveChanges is 4-5 times faster.
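As a quick sanity check on the factor columns, the ratios for the larger runs can be recomputed from the timings in the table above. This is only a sketch of the arithmetic, not part of the original demo; the names `timings`, `detail_timings`, and `speedup` are made up for illustration. The posted timings are rounded to 0.1s, so the small runs cannot be reproduced this way (for example, 0.5s / 0.2s gives 2.5 while the table reports 3.2), presumably because the factors were computed from unrounded measurements.

```python
# Timings in seconds, copied from the table above (as posted, rounded to 0.1 s).
# Keys are record counts; values are (SaveChanges, BulkSaveChanges).
timings = {
    100_000: (51.5, 11.5),
    1_000_000: (514.5, 129.2),
}

# Same layout for the runs with one detail table (DSC, DBSC).
detail_timings = {
    1_000: (10.8, 2.6),
    10_000: (106.8, 22.7),
}


def speedup(save_changes_s: float, bulk_save_changes_s: float) -> float:
    """Return SaveChanges time divided by BulkSaveChanges time, to one decimal."""
    return round(save_changes_s / bulk_save_changes_s, 1)


factors = {rc: speedup(sc, bsc) for rc, (sc, bsc) in timings.items()}
detail_factors = {rc: speedup(sc, bsc) for rc, (sc, bsc) in detail_timings.items()}

for rc, f in factors.items():
    print(f"{rc} rows: F = {f}")
for rc, f in detail_factors.items():
    print(f"{rc} rows with detail table: DF = {f}")
```

For these four rows the recomputed values agree with the F and DF columns as posted.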
The database is SQL Server Developer 2017 on the local network, running on an HP server with two 1Gb network cards.
Here is the source code.
This seems to be comparing EF Core's regular SaveChanges() to a 3rd-party product, which, apart from providing bulk CUD operations (#795), may be implementing inserts in another way (e.g. SqlBulkCopy?). If that's the case then note the similarity with https://github.com/npgsql/Npgsql.EntityFrameworkCore.PostgreSQL/issues/113, which would be about implementing SaveChanges() with PostgreSQL's optimized COPY protocol.
@MaklaCof Thanks for the additional information. A few comments:
Hi, maybe just one small thing. There are scenarios where you don't need to track inserted entities, or even get the Id (identity) back. Maybe an option on SaveChanges to skip tracking these entities and not return the Id would be a legitimate use for imports ...
I will update the thread when I get back to the real-life scenario where I see a larger factor.
Are there any plans for implementing this? Inserting large numbers of rows in EFCore is painful.
@Mikilll94 Can you share the numbers you are seeing using the current EF Core batching implementation?
Implementing https://github.com/aspnet/EntityFrameworkCore/issues/9118 and https://github.com/aspnet/EntityFrameworkCore/issues/10443 might be enough to make the difference much smaller.
Note #27372 which will substantially improve our standard insert perf on SQL Server in 7.0.
We also have #27333 for adding a dedicated API for bulk importing (e.g. SqlBulkCopy); we can probably close this as a dup of that.
One last idea... We also have #9118 for optionally not tracking after SaveChanges, which would unlock using SqlBulkCopy for standard SaveChanges. At that point, the only advantage of a dedicated bulk import API (#27333) becomes bypassing the change tracking machinery, which may or may not be worth it.
EFCore.BulkExtensions as of recently has BulkSaveChanges as well, among other CRUD Bulk ops.
@borisdj yeah, we're aware of EFCore.BulkExtensions. For bulk update/delete, EF Core 7.0 is introducing ExecuteUpdate/ExecuteDelete (see #795). Bulk import from the client is tracked by #27333.
I know, I have been following the Execute issue. This one was about BulkSaveChanges, at least by the title, which is a different use case.
@borisdj if you promote your library, be transparent and put a disclaimer in all your posts saying it's not free + link to your pricing https://www.codis.tech/efcorebulk/. I wasted my time trying your solution... I had to undo my code because our company doesn't want to pay $1000 for your license!!!
The license was recently updated from fully free to semi-free, and the post is much older (plus the issue is closed). Also, it is clearly stated (conditions for both free and paid usage) in the project's LICENSE.txt as well as in the README. And I see you have written the same comment on another issue https://github.com/dotnet/efcore/issues/4526#issuecomment-1447020995, no need to spam.
Currently, inserting thousands (1000, 10000, 100000) of records is really slow. I think there is room for improvement, so I am creating a feature request for bulk SaveChanges.
Inspiration came from (EntityFramework Extensions).
In my case, the factor between SaveChanges and BulkSaveChanges is even greater than:
I think this is somehow related to #9270 and #795, but not a duplicate. (I might be mistaken.)
Use case: