nopSolutions / nopCommerce

ASP.NET Core eCommerce software. nopCommerce is a free and open-source shopping cart.
https://www.nopcommerce.com

2nd level caching (database layer). Move away from Entity Framework #239

Closed mariannk closed 4 years ago

mariannk commented 8 years ago

EF hurts performance because it still doesn't support 2nd-level caching (caching entities between HTTP requests). We have to resort to workarounds such as custom entities for caching, model caching, etc. Fixing this could make nopCommerce really very fast.

And it looks like the EF team doesn't plan to support it. Everything would be fine once EF starts supporting 2nd-level caching, but there's no estimate for that.

So we can consider other ways to go. For example, use Dapper instead of EF. Or we can fully migrate to linq2db (https://github.com/linq2db/linq2db).
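To make the missing 2nd-level cache concrete, here is a minimal sketch of the kind of workaround described above: manually caching a materialized query result between HTTP requests with ASP.NET Core's `IMemoryCache`. `CachedProductService`, `CatalogDbContext`, and `Product` are hypothetical names for illustration, not nopCommerce's actual API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical sketch: hand-rolled "2nd-level" caching of query results,
// the sort of workaround needed because EF has no built-in entity cache.
public class CachedProductService
{
    private readonly IMemoryCache _cache;
    private readonly CatalogDbContext _db; // CatalogDbContext/Product are illustrative

    public CachedProductService(IMemoryCache cache, CatalogDbContext db)
    {
        _cache = cache;
        _db = db;
    }

    public Task<List<Product>> GetHomepageProductsAsync()
    {
        // Cache the materialized list, not the IQueryable; otherwise EF
        // re-executes the query on every HTTP request.
        return _cache.GetOrCreateAsync("homepage-products", entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
            return _db.Products.AsNoTracking()
                .Where(p => p.ShowOnHomepage)
                .ToListAsync();
        });
    }
}
```

The catch, as this thread discusses, is invalidation: every write path now has to know which cache keys to evict.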

arvand commented 5 years ago

@Leftyx I just saw your post and I may be missing something, but did you set 'LoadAllLocaleRecordsOnStartup' to true? If the property is set to true, it will read all the values with one query, and with the help of Redis the site will be even faster.

Leftyx commented 5 years ago

@arvand I have 300K products in 350 categories in 4 different languages. Loading that much on startup blows up the server's memory. I actually had to disable Redis, because things were much worse in terms of performance; this is something other people have complained about. nopCommerce does not store a product in the cache but each attribute of the product, which means hitting the cache an awful lot of times. I had to go back to the in-memory cache.

clearbucketLabs commented 5 years ago

> @arvand I have 300K products in 350 categories in 4 different languages. Loading too much on startup blows up the server's memory. I had to disable redis, actually, cause things were much worse in terms of performance. This is something other people have complained about. NopCommerce does not store a product in cache but each attribute for the product. That means hitting the cache like crazy an awful lot of times. I had to get back to memory cache.

Yeah, this won't work... and it's rather simple to fix: have the stored procedure join the language for those items. You pass the language to the stored procedure and it returns the product already in that language; no need for the cache, the memory usage, or any of the rest, and it would be blazing fast. The same goes for categories and product URLs/slugs: why are they all loaded N+1? It makes no sense. Plus, product queries, cart queries, etc. still hit the DB; it's only the navigational properties and the localized strings that get cached, they all expire at the same time, and the objects still have to be rebuilt. Excluding prices, I would cache the whole view. I've also tried caching the whole category view, per category, for the first 3-4 pages (excluding any filters), and it speeds things up 100x on high-traffic sites where most people land there, with the worst case being that someone applies a filter and it has to query the DB again.

Categories should also be stored as nested sets, not parent > child; look at the recursion going on in this thing. Product counts also need to be stored separately per category, as a kind of cache.
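The nested-set idea can be sketched in a single Dapper query: each category stores left/right bounds, so an entire subtree comes back in one round trip with no recursion. The `Lft`/`Rgt` columns and table layout below are assumptions for illustration, not nopCommerce's actual schema.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Threading.Tasks;
using Dapper;

// One row per category; Lft/Rgt are the hypothetical nested-set bounds.
public record CategoryNode(int Id, string Name, int Lft, int Rgt);

public static class CategoryQueries
{
    // Fetch a category and all of its descendants in one query:
    // a child's [Lft, Rgt] interval is always contained in its ancestor's.
    public static async Task<IReadOnlyList<CategoryNode>> GetSubtreeAsync(
        IDbConnection db, int categoryId)
    {
        const string sql = @"
            SELECT child.Id, child.Name, child.Lft, child.Rgt
            FROM Category parent
            JOIN Category child ON child.Lft BETWEEN parent.Lft AND parent.Rgt
            WHERE parent.Id = @categoryId
            ORDER BY child.Lft;";

        var rows = await db.QueryAsync<CategoryNode>(sql, new { categoryId });
        return rows.AsList();
    }
}
```

The trade-off is that inserting or moving a category means renumbering the `Lft`/`Rgt` bounds, so nested sets suit read-heavy trees like a catalog.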

arvand commented 5 years ago

> @arvand I have 300K products in 350 categories in 4 different languages. Loading too much on startup blows up the server's memory. I had to disable redis, actually, cause things were much worse in terms of performance. This is something other people have complained about. NopCommerce does not store a product in cache but each attribute for the product. That means hitting the cache like crazy an awful lot of times. I had to get back to memory cache.

The problem you are raising is huge. I could cache everything and there would still be performance issues due to the nested queries and EF's performance, and the EF DbContext is passed around everywhere.

Leftyx commented 5 years ago

@arvand Most of the problems are related to the implementation, and there's not much we can do about that. I have spent a couple of months digging into the code and changing things, especially the way related entities are fetched. Performance has improved quite a lot, but there's still so much to do. I have managed to cut the queries by roughly 50%, but the profiler still tells me I have a few hundred queries running. Fixing those issues requires a major refactoring of the product: ditching those nasty stored procedures, ditching AutoMapper, swapping the IoC container for a faster one, introducing Lucene or Elasticsearch for search, and maybe using real specialized view models so that the queries would be slimmer. Caching objects rather than individual fields would be another major boost, even if I get nervous when too many things are cached.

ivanslater commented 5 years ago

I believe we must dig into this and improve it. If you need help, I'm here too. There are a lot of other powerful eCommerce systems that are much, much faster than nopCommerce is now on 4.10...

arvand commented 5 years ago

@Leftyx AutoMapper and the IoC container are controversial topics and I don't think we can settle them here. I'm not sure where to begin, but I think the first step is to create DTOs and a data access layer which returns DTOs rather than the EF DbContext, and which correctly implements the repository pattern and UoW. I think this repo is a good starting point: https://github.com/urfnet/URF.NET. After that it would be possible to add an improved, customizable caching system which returns whole objects.

Leftyx commented 5 years ago

@arvand The repository pattern is dead, and I am more than happy about that; plenty of respectable people have written about it. It was popular 15 years ago.

Entity Framework, NHibernate, and other ORMs already are repositories/units of work, so wrapping them in another layer is just a waste of time and makes things slower.

In my experience, one service layer with an injectable DbContext/ISession is more than enough. These days I use some sort of CQS/CQRS mediator pattern (I use MediatR) with commands and queries, and things are slim and fast.
You can use decorators and pipelines for validation, so you always know where to find things.

It's also a good pattern if you want to move to microservices.
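A minimal sketch of the MediatR-style command/handler pair described above, with the DbContext injected directly as the unit of work. `InsertProduct`, `InsertProductHandler`, `CatalogDbContext`, and `Product` are illustrative names from this discussion, not actual nopCommerce types.

```csharp
using System.Threading;
using System.Threading.Tasks;
using MediatR;

// The command carries exactly the data the operation needs.
public record InsertProduct(string Name, decimal Price) : IRequest<int>;

// The handler is the one place where this operation lives;
// validation can be added as a separate pipeline behavior/decorator.
public class InsertProductHandler : IRequestHandler<InsertProduct, int>
{
    private readonly CatalogDbContext _db; // injected DbContext: already a unit of work

    public InsertProductHandler(CatalogDbContext db) => _db = db;

    public async Task<int> Handle(InsertProduct command, CancellationToken ct)
    {
        var product = new Product { Name = command.Name, Price = command.Price };
        _db.Products.Add(product);

        // One explicit save per command, not a save on every repository call.
        await _db.SaveChangesAsync(ct);
        return product.Id;
    }
}
```

A controller then only needs `await mediator.Send(new InsertProduct("Widget", 9.99m))`, so the write path is trivially discoverable by name.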

I personally don't like EF; I have used NHibernate for many years and, even if it is not a fast ORM, it is very flexible and easy to optimize (2nd-level cache, etc.).

I guess an eCommerce product could take advantage of a NoSQL database.

Anyway, we could spend days arguing about the best solution. I tend to agree with Ayende and his philosophy of "keep it simple". He wrote about nopCommerce too, and I agree with everything he said about the product.
Ten years ago some patterns emerged and devs started following them blindly. I was one of them: repository pattern, AutoMapper... you name it. Everyone was doing it, and so did I.
It was "cool", but debugging that sort of code is a nightmare.

sdanyliv commented 5 years ago

My vote is that the repository pattern is a bad choice for modern .NET software. LINQ is the best abstraction over a database ever created, and covering it with yet another abstraction wastes time, money, and certainly performance.

I always point this out when people open issues for linq2db and mention repositories (usually generic ones). If you need a fast application, write simple code and extract only the things that genuinely need to be reused. Think in terms of LINQ queries, not lists of objects or their caches, and you will find how easy it is to write effective applications, provided you know SQL :)

AutoMapper - I do not need it because I do not create automatic Entity -> DTO mappings. All you need is the correct LINQ query and a final projection to a DTO in the controller. Frequently the LINQ query is reused while the mapping varies, because you usually need only a few fields from a complex entity.
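The projection-instead-of-mapping point can be sketched as follows: select straight into the DTO inside the query, so only the projected columns appear in the generated SQL and no mapping step runs afterwards. The entity, context, and DTO names are assumptions for illustration (the `ToListAsync` shown is EF Core's; linq2db offers an equivalent).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// The DTO holds exactly what the page needs, nothing more.
public record ProductListItemDto(int Id, string Name, decimal Price);

public static class ProductQueries
{
    public static Task<List<ProductListItemDto>> GetProductPageAsync(
        CatalogDbContext db, int page, int pageSize)
    {
        // Only Id, Name, and Price are selected in SQL; no change tracking,
        // no full entity materialization, no AutoMapper pass.
        return db.Products
            .OrderBy(p => p.Id)
            .Skip(page * pageSize)
            .Take(pageSize)
            .Select(p => new ProductListItemDto(p.Id, p.Name, p.Price))
            .ToListAsync();
    }
}
```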

Also, do not forget about bulk operations; they have to be present in an ORM. Shame on EF Core, which doesn't even plan to support them. All I have seen are third-party extensions which work in dramatically limited cases and break after the next EF Core release. So from the start, developers skip the efficient approach because it is not supported; that is how you lose performance across the whole application once the SQL Server's resources run out.

NoSQL is the last solution I would choose, because you may lose flexibility, and for reporting you will need additional solutions like Spark or expensive BI platforms.

I have taken a look at the database code in this project and it can be improved dramatically, so NoSQL is not the answer. Even more so if it used only linq2db: all the stored procedures could be rewritten as LINQ queries; the current code is just a workaround for EF limitations. Look at this one: https://github.com/nopSolutions/nopCommerce/blob/develop/upgradescripts/4.10-4.20%20(under%20development)/upgrade.sql#L869 I'm pretty sure I can write a more effective LINQ query for that.

Leftyx commented 5 years ago

> My vote that repository pattern is bad choice for modern .NET software. LINQ is the best abstraction over Database ever created. And covering it by another abstraction is wasting time, money and for sure performance.
>
> I’m always claiming about that when people create issue for linq2db and mention repositories (usually generic). If you need fast application, write simple code and find only things that needs to be reused. Think in terms of LINQ queries, not a list of objects or their caches, and you will find how it is easy to write effective applications, for sure if you know SQL :)
>
> Automapper - i do not need that because i do not create automatic Entity -> DTO mapping. All that you need is correct LINQ query and final projection to DTO in controller. Frequently LINQ query will be reused, but mapping varies. Because you just need several fields from complex entity.
>
> Also do not forget about bulk operations - they have to be present in ORM. Shame to EF Core, which even do not plan to support them. All that i saw it is third party extensions which works in dramatically limited cases and become broken after another EF Core release. So from start developers omit effective code because it is not supported - it is a key how to lose performance in whole application when SQL server resources is over.
>
> NoSql - it is last solution that i will choose, because you may loose flexibility and for reporting you will need additional solutions like Spark or expensive BI platforms.
>
> I had take a look at code which works with database in this project - it can be improved dramatically, so NoSql is not an option. Even more if it will use only linq2db - all stored procs can be rewritten by LINQ queries, current code just workaround for EF limitations. Look at his one: https://github.com/nopSolutions/nopCommerce/blob/develop/upgradescripts/4.10-4.20%20(under%20development)/upgrade.sql#L869 I’m pretty sure that i can write more effective LINQ query for that.

@sdanyliv I agree 1000% with what you say. It seems you know your stuff! Bulk operations? Yeah! I implemented some imports for my 300K products, trying to reuse most of nopCommerce's functionality. I can only say the first import takes more than a day to finish; I am not even joking. In the end I moved that bit to Dapper, just because I could fully use transactions, without change tracking. Another weird thing in nopCommerce is the lack of transactions: this useless EfRepository saves changes all the time. What's the point of that?
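The Dapper-based import described above might look roughly like this: one explicit transaction around the whole batch, no change tracker involved. The table, column, and type names are assumptions for illustration, not the actual import code.

```csharp
using System.Collections.Generic;
using System.Data;
using Dapper;

// Hypothetical import row; maps by name onto the INSERT parameters below.
public record ProductRow(string Sku, string Name, decimal Price);

public static class ProductImporter
{
    public static void ImportProducts(IDbConnection connection, IEnumerable<ProductRow> rows)
    {
        connection.Open();
        using var tx = connection.BeginTransaction();

        // Dapper executes the statement once per element of the sequence,
        // all inside the same transaction.
        connection.Execute(
            "INSERT INTO Product (Sku, Name, Price) VALUES (@Sku, @Name, @Price)",
            rows,
            transaction: tx);

        tx.Commit(); // everything or nothing, unlike save-on-every-call repositories
    }
}
```

For very large batches, a provider bulk API (e.g. `SqlBulkCopy` on SQL Server) inside the same transaction would be faster still.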


So, in nopCommerce we have model factories which call a few services, which use an abstraction over a repository (without transactions), which returns a fully loaded object mapped with AutoMapper, which is finally turned into a view model passed to a controller and eventually transformed again according to what data a view needs. Inside the service layer, of course, there are settings and localization services which fetch each property of an entity one by one and put it in a cache if it is not there already. And people here think that moving to Dapper would make a real difference.

Good luck with that.

clearbucketLabs commented 5 years ago

Worst of all is when a bunch of objects have been modified or are sitting in the context before you want to save changes and you call logging.Insert: they all get saved; who would have known. It's a mess. Events are even passed the EF entity, so who knows what the third-party plugins do. This is not developer friendly; it's completely unpredictable.

sdanyliv commented 5 years ago

> @sdanyliv I agree 1000% on what you say. It seems you know your stuff!

@Leftyx, I'm not a native speaker, and I hope you are not associating me with this project :) I have spent almost 3 years improving linq2db, and I know how to beat performance issues in database projects.

> Bulk operations ? Yeah! I have implemented some imports for my 300K products trying to reuse most NopCommerce functionalities.

That's weird; with bulk operations it should take several minutes. I just had a project where I opened two database connections via linq2db and transferred 600K+ records, with transformations, in less than 5 minutes.

> At the end I moved that bit to Dapper, just cause I could fully use transactions, without tracking. Another weird thing in NopCommerce is this lack of transactions. This useless EfRepository saves changes all the time.

Just want to say that I almost never write raw SQL queries; linq2db does it for me more effectively. Queries are optimized, and irrelevant joins are removed after query decomposition. And we also do not bother with the change tracker. It is useful in... well, I don't know when. Start a transaction, insert, update, delete, commit the transaction, as fast as you can.
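A hedged sketch of the two-connection transfer described above, using linq2db's `DataConnection` and its `BulkCopy` extension: rows are streamed from one database, transformed in the query, and bulk-inserted into the other, with no change tracker involved. The connection names, the `Product` mapping, and the column set are assumptions for illustration.

```csharp
using System.Linq;
using LinqToDB;
using LinqToDB.Data;
using LinqToDB.Mapping;

[Table("Product")]
public class Product
{
    [Column] public string Sku { get; set; } = "";
    [Column] public string Name { get; set; } = "";
    [Column] public decimal Price { get; set; }
    [Column] public bool Published { get; set; }
}

public static class CatalogTransfer
{
    public static void CopyPublishedProducts()
    {
        // "SourceDb"/"TargetDb" are hypothetical configuration names.
        using var source = new DataConnection("SourceDb");
        using var target = new DataConnection("TargetDb");

        // The transformation happens inside the LINQ query itself.
        var rows = source.GetTable<Product>()
            .Where(p => p.Published)
            .Select(p => new Product { Sku = p.Sku, Name = p.Name.Trim(), Price = p.Price });

        // BulkCopy delegates to the provider's bulk API (e.g. SqlBulkCopy on SQL Server).
        target.BulkCopy(
            new BulkCopyOptions { BulkCopyType = BulkCopyType.ProviderSpecific },
            rows);
    }
}
```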

> So, in NopCommerce we have ModelFactories which call a few services which use an abstraction for a repository (without transactions) which return a fully loaded object mapped with Automapper, finally transformed in a view model passed to a controller, eventually transformed again according to what data a view needs.

It is one of the many problems with using repositories and automapping. For me the repository is an anti-pattern "that can be quickly tested/mocked" (facepalm); in the real world, everything needs to be tested against a real database, with stored procs and bulk operations. So why are repositories needed? BTW, in this project the repositories are nothing but wrappers; someone famous wrote an article saying this is good coding style. So they've got additional memory allocations and unneeded complexity with unpredictable side effects.

AndreiMaz commented 5 years ago

Hi everyone!

Most probably (99%) we'll implement this task in the next version of nopCommerce. If you already have something implemented and could share some of your source code, it would be very helpful for our team!

arvand commented 5 years ago

The problem with the EF DbContext is that although it implements UoW itself, there is no way to add a new kind of query to it, so in the end you need to wrap it anyway: both to make it possible to run other queries inside a transaction, and so the service layer won't have a dependency on the EF DbContext, which makes it possible to replace EF with other ORMs. Please have a look at the URF.NET repo and how they built it.

@sdanyliv I do agree that generic repositories can be anti-patterns, but not the repository pattern itself. linq2db can be used within a repository pattern. Also, a repository implementation that has leaky abstractions (exposing IQueryables) is poor design. If you favor query objects over repositories and use MediatR, that's a different story. @Leftyx, of course, your bottleneck is in the caching layer, and none of these would solve your problem.

Also, the purpose of the service layer seems to have been lost. The service layer should be for operations that don't properly belong to any aggregate root, but when it has direct access to the DbContext, these things happen. Quoting from the answer to "How accurate is 'Business logic should be in a service, not in a model'?":

> In DDD, services are meant specifically for the situation when you have an operation that doesn't properly belong to any aggregate root. You have to be careful here, because often the need for a service can imply that you didn't use the correct roots. But assuming you did, a service is used to coordinate operations across multiple roots, or sometimes to handle concerns that don't involve the domain model at all (such as, perhaps, writing information to a BI/OLAP database).
>
> One notable aspect of the DDD service is that it is allowed to use transaction scripts. When working on large applications, you're very likely to eventually run into instances where it's just way easier to accomplish something with a T-SQL or PL/SQL procedure than it is to fuss with the domain model. This is OK, and it belongs in a Service.
>
> This is a radical departure from the layered-architecture definition of services. A service layer encapsulates domain objects; a DDD service encapsulates whatever isn't in the domain objects and doesn't make sense to be.

Leftyx commented 5 years ago

@arvand: as I said, I use a CQS approach with MediatR. It fits the way I write code best: I know where to find things.
A command InsertProduct and its handler InsertProductHandler (and the decorator InsertProductValidator) tell me exactly what they do; I don't need to hunt for bits and pieces of code spread across other places. I am against "anemic models", and I agree that some business logic must live in the aggregate root. But sometimes business logic is not related to a single aggregate and traverses multiple entities. Imagine a shopping cart and the payment: where would you put all that logic?
With a mediator pattern you would have CompletePayment/CompletePaymentHandler, and everything is in there.
If you don't use that pattern, I guess a service is a good fit.
Anyway, patterns are good but, as most experts say, first of all check whether you need that level of complexity.

> a Repository implementation that has leaky abstractions (exposing IQueryables) is a poor design

You keep talking about repositories. Why would you use one in the first place? Again, a view model returned to a view might use multiple entities (Products, Stocks, Carts) and also external references: imagine an external endpoint you call to check availability (complex systems). It's not always possible to hydrate a view model with a simple query, hence the reason to create a wrapper, a service, or a query handler, where you would manage a transaction, eventually.

I don't have any problem passing a DbContext to my service; do you know why I would not create a repository? Because DbContext is already a unit of work. In the past, people argued that you must use a repository because at some point you might want to replace the database with a different technology. It's a fair point, but I don't think that would be any less complicated than swapping in a different DbContext/ISession in your service/command/query. PS: I have done this!

arvand commented 5 years ago

@Leftyx We are talking about two topics here: 1) moving the whole architecture to CQRS; I don't know how many hours that would take or whether it is feasible at this point, but I would love to see nop with a CQRS architecture. 2) Refactoring the repository to support other ORMs. The EF DbContext does implement UoW, but try to add a query (like a bulk update, which EF doesn't support) to the DbContext: you can't. Now you need to wrap it in a transaction, but your UoW doesn't support that, so now you need a new UoW. And by passing the DbContext instead of an interface you are defeating the whole purpose of DI, which results in a service layer that depends on EF (and that has access to queryable objects).
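The limitation described above can be sketched like this: to combine a set-based bulk update (which EF cannot express) with ordinary tracked changes, you have to reach past the DbContext's built-in unit of work and open an explicit transaction yourself. The context, entity, and column names are illustrative assumptions.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public static class CatalogCommands
{
    public static async Task DiscountCategoryAsync(
        CatalogDbContext db, int categoryId, decimal factor)
    {
        // Explicit transaction: the DbContext's implicit UoW only covers
        // SaveChanges, not raw SQL executed alongside it.
        await using var tx = await db.Database.BeginTransactionAsync();

        // Set-based update that EF (at the time of this thread) could not express:
        await db.Database.ExecuteSqlInterpolatedAsync(
            $"UPDATE Product SET Price = Price * {factor} WHERE CategoryId = {categoryId}");

        // ...plus ordinary tracked changes in the same transaction:
        var category = await db.Categories.FindAsync(categoryId);
        category!.LastDiscountedOnUtc = DateTime.UtcNow;
        await db.SaveChangesAsync();

        await tx.CommitAsync(); // both statements commit or roll back together
    }
}
```

Hiding this dance behind an interface is exactly the wrapping the comment above argues for; the counter-argument elsewhere in the thread is that the wrapper itself then becomes the problem.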

Leftyx commented 5 years ago

@arvand: I think you misunderstood me a little. I am not talking about moving to CQRS, replacing EF, or supporting other ORMs; I never said that. I commented on this thread in reply to people who think that replacing EF with Dapper would change things dramatically, which I believe is total nonsense. My suggestion is a lot simpler:

Not a lot of changes but a real boost in performance.

PS: I have refactored the LocalizedEntityService/LocalizationService as I am suggesting here, and I have saved more than 100 queries on the first run.

springmin commented 4 years ago

+1 for Dapper

mobsoftware commented 4 years ago

What about EF Core + a cache extension, as mentioned on EF Core's GitHub repository?

https://github.com/aspnet/EntityFrameworkCore/issues/5858#issuecomment-382902552

The nopCommerce community is already familiar with EF, so it would be advantageous to stick with Entity Framework. It would save on development cost/time as well.

atulrungta commented 4 years ago

Hi @AndreiMaz ,

Are you working on this issue? Are there any commits related to it in the repo?

a-patel commented 4 years ago

@atulrungta
Branch: https://github.com/nopSolutions/nopCommerce/tree/issue-239-ef-performance
ORM library: linq2db (EF Core removed)

samdubey commented 4 years ago

You won't regret it if you choose this architecture for database access:

https://github.com/volkanceylan/Serenity

It uses Dapper and FluentMigrator: exceptionally well-architected code, and insanely fast.

Please have a look at the project for our nopCommerce (when thinking about moving away from Entity Framework).

AndreiMaz commented 4 years ago

And done. We don't use Entity Framework anymore.

Please feel free to get the latest "develop" branch and play with the new approach.

atulrungta commented 4 years ago

I checked this and found some errors in the tier prices pop-up. But this was really wanted. Much appreciated :)

krispetkov commented 4 years ago

I will be busy for the next few days, but if someone makes a performance comparison between the old EF-based approach and this new one, please share your results 🙏

crytopean commented 4 years ago

My question: would this now allow us to build plugins that add new columns to existing base tables and extend existing classes with those new properties?

I know with EF we couldn't really do that because of the context... I don't recall if that was ever fixed; I vaguely remember some workaround or tweak for it, but it's been a while since I even tried.

nguyenhthai commented 4 years ago

I have a few questions on how this will impact the plugin development.

First, my plugins currently use FluentMigrator. Since nopCommerce now includes it by default, will it allow plugin devs to customize the migration runner? For example, each plugin needs to use a different migration version table to track migration history; how can I configure that within the plugin? At the moment, I hook into the PluginStartup and build the migration runner service there for each plugin. Will it be possible to do the same in the new version of nop?

Second, is there any plan to incorporate Azure managed identity out of the box? I believe https://github.com/nopSolutions/nopCommerce/pull/4178 allows using it with the DbContext, but we are getting rid of that for linq2db (the library that will replace EF Core). The new library does support managed identity, but to configure it you need to pass it in the connection factory. However, FluentMigrator still doesn't support managed identity by default, and at the moment you need to write a custom processor to do that. So will nop support it in this new version and allow plugin devs to use it eventually?

AndreiMaz commented 4 years ago

And done