ardalis / OrganizingAspNetCore

Offers several different ways to organize content in ASP.NET Core MVC and Razor Pages projects.
http://ardalis.com
MIT License

Repository pattern should not be used by default #1

Closed jarrettv closed 7 years ago

jarrettv commented 8 years ago

I've noticed here and elsewhere that you preach using the repository pattern (IRepository) on top of IDbSet, which is already the repository pattern. Sometimes it is necessary to put abstractions on top of abstractions, but those decisions shouldn't be taken lightly.

ardalis commented 8 years ago

I recommend practices that result in clean, loosely coupled code. One such practice is depending on interfaces that you control, rather than on third-party vendors' interfaces, for your domain model. It takes virtually no effort to create a small interface with 1-5 methods to abstract access to data, using only POCOs (so no dependency on a particular data access technology). If instead you rely on IFoo from vendor Bar, which exposes types also owned by vendor Bar, it's generally much more difficult to change later, and frequently to unit test as well.
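
A minimal sketch of what such an interface might look like (the Customer entity and the method names are purely illustrative):

```csharp
using System.Collections.Generic;

// Plain POCO -- no dependency on any particular data access technology.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Small, application-owned abstraction over data access
// (roughly the 1-5 methods mentioned above).
public interface ICustomerRepository
{
    Customer GetById(int id);
    IReadOnlyList<Customer> List();
    void Add(Customer customer);
    void Update(Customer customer);
    void Delete(Customer customer);
}
```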

jnm2 commented 8 years ago

Also: if you wait till you need it, which is in general a good philosophy, it can be difficult to factor in later. Especially when architecture starts settling around the vendor pattern. I use simple repositories to abstract 1) EF data access, 2) large-scale bulk insert, 3) caching, and 4) in-memory testing.

The general idea is that domain logic depends on repositories, not situation-specific data access. That way domain logic can get entities without knowing where they come from. It doesn't know or care if I'm backing it with a database, whether I'm batching or caching or testing.

ardalis commented 8 years ago

Also good points. I forgot to mention that using your own interfaces enables caching patterns like this one: http://ardalis.com/building-a-cachedrepository-via-strategy-pattern

And suppose in this example I decided EF wasn't the best option and I wanted to just use a micro-ORM like Dapper (https://github.com/StackExchange/dapper-dot-net). Doing so with my dependency only on my own type is trivial compared to ripping out vendor-specific interfaces littered throughout my program's layers.
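
To make the swap concrete, here's a hedged sketch of a Dapper-backed implementation of the same ICustomerRepository interface sketched above (the Customers table, its columns, and the class name are assumptions for illustration):

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Dapper;

// One possible Dapper-backed implementation of ICustomerRepository.
// Only this class would change when moving from EF to Dapper;
// callers keep depending on the interface.
public class DapperCustomerRepository : ICustomerRepository
{
    private readonly string _connectionString;

    public DapperCustomerRepository(string connectionString)
        => _connectionString = connectionString;

    public Customer GetById(int id)
    {
        using (var conn = new SqlConnection(_connectionString))
            return conn.QuerySingleOrDefault<Customer>(
                "SELECT Id, Name FROM Customers WHERE Id = @id", new { id });
    }

    public IReadOnlyList<Customer> List()
    {
        using (var conn = new SqlConnection(_connectionString))
            return conn.Query<Customer>("SELECT Id, Name FROM Customers").ToList();
    }

    public void Add(Customer customer)
    {
        using (var conn = new SqlConnection(_connectionString))
            conn.Execute("INSERT INTO Customers (Name) VALUES (@Name)", customer);
    }

    public void Update(Customer customer)
    {
        using (var conn = new SqlConnection(_connectionString))
            conn.Execute("UPDATE Customers SET Name = @Name WHERE Id = @Id", customer);
    }

    public void Delete(Customer customer)
    {
        using (var conn = new SqlConnection(_connectionString))
            conn.Execute("DELETE FROM Customers WHERE Id = @Id", customer);
    }
}
```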

mcquiggd commented 8 years ago

@ardalis

Seconded... personally I tend to create a repository, then swap between, say, an InMemory implementation for tests, MongoDB for local development, and DocumentDB for production.

Just because EF has a pattern that could be construed as a Repository, it doesn't mean that a Repository is redundant.

And applying Decorators to a Repository interface via DI can be very useful, e.g. caching, logging, coarse performance measurements, etc.
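
As an illustration, a minimal caching decorator over the ICustomerRepository sketch from earlier might look something like this (class name and cache policy are illustrative; in practice the container wires it around the real implementation so neither the consumer nor the inner repository knows about it):

```csharp
using System.Collections.Generic;

// Decorator that adds caching to any ICustomerRepository without the
// underlying implementation being aware of it.
public class CachedCustomerRepository : ICustomerRepository
{
    private readonly ICustomerRepository _inner;
    private IReadOnlyList<Customer> _cachedList;

    public CachedCustomerRepository(ICustomerRepository inner) => _inner = inner;

    public IReadOnlyList<Customer> List()
        => _cachedList ?? (_cachedList = _inner.List());

    public Customer GetById(int id) => _inner.GetById(id);

    public void Add(Customer customer)
    {
        _inner.Add(customer);
        _cachedList = null; // invalidate the cache on writes
    }

    public void Update(Customer customer)
    {
        _inner.Update(customer);
        _cachedList = null;
    }

    public void Delete(Customer customer)
    {
        _inner.Delete(customer);
        _cachedList = null;
    }
}
```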

jarrettv commented 8 years ago

The general idea is that domain logic depends on repositories, not situation-specific data access.

The domain logic should never depend on infrastructure. If it does, your direction of dependencies will break down. IQueryable is all your domain needs to know about.

In my experience, 90% of the time, IRepository is YAGNI.

ardalis commented 8 years ago

domain logic depends on repositories, not situation-specific data access

The domain logic should never depend on infrastructure.

These two statements are saying the same thing. In the first one, perhaps it's not clear that he means abstractions/interfaces, defined in the Core of the application, not specific repository implementations. But that's what's meant.

IQueryable shouldn't be exposed outside of your data access code if possible, especially if you're using Entity Framework, because it is a very leaky abstraction. It enables complex querying behavior, including business logic, to be embedded into UI layer code, and can result in code that will compile and work perfectly well with one backing store (say, an in-memory collection) but will fail at runtime when used with EF (because EF can't translate something into SQL). It's another example of how EF can embed itself into your application, escaping from the data access layer where you'd want to keep it if you're following separation of concerns.
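
A hedged illustration of that kind of leak (the names are made up): the query below compiles and behaves fine against an in-memory collection, but typically fails at runtime under Entity Framework because the provider can't translate the local method call into SQL.

```csharp
using System.Linq;

public static class CustomerQueries
{
    // An ordinary C# helper method -- nothing a SQL translator can understand.
    static bool LooksImportant(string name) => name.StartsWith("VIP");

    public static IQueryable<Customer> ImportantCustomers(IQueryable<Customer> customers)
        // Works with LINQ to Objects; with an EF-backed IQueryable the provider
        // cannot translate LooksImportant and the query fails at runtime.
        => customers.Where(c => LooksImportant(c.Name));
}
```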

jnm2 commented 8 years ago

IQueryable is all your domain needs to know about.

The problem with IQueryable is that it ends up coupling you to vendor-specific infrastructure quirks. I know in theory it sounds perfect, but in practice, IQueryable forces you either to:

  1. Make your domain model also double as your persistence model
    • You are forced to lose richness in the domain model in order to make your persistence infrastructure happy
    • The queries you write are very query provider sensitive with special treatment around eager loading, navigation properties, etc. Queries will act differently and even fail between query providers, including the default LINQ to objects provider.
  2. Write code around translating the IQueryable back and forth between models.
    • This is a lot harder than using an IRepository to translate because the possible compositions of IQueryable are endless. You become your own ORM. (If you use AutoMapper, you have another set of infrastructure quirks to worry about. AutoMapper was not designed for this, according to its author.)

IRepository solves this by presenting a more limited surface area than IQueryable. If you depend on a well-designed IRepository, you're depending on less than if you depend on IQueryable.

As far as getting information goes, IRepository as an interface is thus not infrastructure; it is simply a more limited stand-in for IQueryable. Of course, it might be implemented by infrastructure classes, but the domain model will never know that.

As far as saving changed aggregates or adding newly created aggregates, IQueryable doesn't help you unless yet again you use vendor-specific infrastructure-specific extension methods.

In onion architecture as I understand it, the application layer is responsible for injecting the proper infrastructure into the domain. That's the application layer's buffer between infrastructure and domain. The domain doesn't know what infrastructure it's using; it only knows how to call it via a common, non-infrastructure-specific interface (IRepository).
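
A minimal sketch of that arrangement, reusing the illustrative types from earlier in the thread: the domain service sees only the interface, and the application layer decides which implementation to hand it.

```csharp
// Domain layer: depends only on the abstraction it owns.
public class CustomerRegistrationService
{
    private readonly ICustomerRepository _customers;

    public CustomerRegistrationService(ICustomerRepository customers)
        => _customers = customers;

    public void Register(string name) => _customers.Add(new Customer { Name = name });
}

// Application layer (composition): chooses the infrastructure, here wrapping
// the Dapper-backed repository in the caching decorator sketched above.
public static class CompositionExample
{
    public static CustomerRegistrationService Create(string connectionString)
        => new CustomerRegistrationService(
            new CachedCustomerRepository(
                new DapperCustomerRepository(connectionString)));
}
```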

jnm2 commented 8 years ago

If I had seen @ardalis's post I wouldn't have gone into so much depth. Good summary.

@jarrettv Here's something to consider to back up the idea that IRepository is a domain interface, not an infrastructure interface. In good SOLID architecture, interfaces are owned not by the implementers but by the clients of the interface. The client programs to the interface, whose sole purpose is to fit the client's needs. IRepository was created out of the responsibilities of the domain services, not out of the needs of the infrastructure services.

jarrettv commented 8 years ago

Wow great conversation here. It would be fun to have a friendly YouTube debate on the topic.

There is a buffet of patterns in DDD and when you are loading your plate up to solve a problem, there is no rule saying you must take this (IRepository) and that (IDomainEvent).

You offer good arguments on why IRepository is useful, but none of these reasons were driven by a business need.

When organizing your solution, there will always be compromises such as:

Make your domain model also double as your persistence model

My argument is evaluate the business problem you are trying to solve and then decide what patterns are necessary. This was my beef with the Dependency Injection page on docs.asp.net. (It looks like someone cleaned it up)

For low to medium business complexity, I find IDomainEvent and IQueryable good enough to isolate my domain from the infrastructure.

IQueryable shouldn't be exposed outside of your data access code ... because it is a very leaky abstraction

IQueryable is a useful compromise for dealing with large aggregate roots. The alternative is more code and more complexity. If you can't tell by now, I lean towards less code is better. 💪

IRepository is a domain interface, not an infrastructure interface

Persistence is infrastructure. Business doesn't and shouldn't care about CRUD or SaveChanges. Let that infrastructure stuff stay in your controller/handler.

the application layer is responsible for injecting the proper infrastructure into the domain

IMHO, there shouldn't be any infrastructure in the domain. Furthermore, I rarely see the need for domain services -- especially as we strive for bounded contexts through microservices.

ardalis commented 8 years ago

In any substantial codebase, I doubt based on my experience that your approach would, in fact, yield less code overall. If you let EF and IQueryable crap all over your UI layer, you're going to end up with data concerns, business querying logic and specification logic, probably caching logic, and more intermixed with the proper responsibilities of the presentation layer. If you ever need to query a set of entities twice in similar ways, you'll end up with that duplication in multiple controller action methods or UI event handlers. If you were using an interface to abstract that need, you would likely identify it as a common requirement and push the common querying logic into the interface (perhaps as a new method on a repository interface). This is far less likely if you just say the UI layer is the wild west and infrastructure concerns are fair game there alongside UI and domain objects.

You can write applications this way, and they'll work, but they're likely to be more expensive to maintain in the long run. The UI layer is also likely to violate SRP, OCP, DRY, etc. and if you're not bothering to abstract out data access you may just opt not to use abstractions (i.e. interfaces) at all, in which case you could argue you're violating ISP (though it's a stretch if you're not actually using any interfaces). I have clients who pay me to fix this kind of code. They're generally happy with the results once I show them the benefits of applying these principles and the ease with which the resulting codebase can be maintained and modified.

By all means, start out without IRepository. But follow Separation of Concerns. Follow DRY. Follow SRP and ISP and DIP. And see if there is a better way to keep data access from polluting other, unrelated aspects of the application. I've not found one, which is why for non-throwaway applications I stick with abstracting it away. You know on average Microsoft ditches its current data access stack about every 18-24 months (for the last 20 years or so). If you think your application is going to live longer than that and might need to use a new data access stack, it's really not that hard to code to an abstraction rather than hard-coding dependencies on implementation details everywhere.

valdisiljuconoks commented 8 years ago

Would it be more accurate to mark ApplicationDbContext and EfRepository internal and program only against IRepository<T>? Then I just wonder what the correct way would be to map abstractions to implementations in the Startup class when registering services, if the context implementation and EF repository were located in a different assembly. Play with InternalsVisibleTo?

ardalis commented 8 years ago

More accurate? I'm not sure what you mean by that.

If you do have dependencies in other projects/assemblies that need wired up at startup, you can use capabilities of third party containers to achieve this. For instance with StructureMap you can have a Registry class in the separate project (and thus with access to internal types there), and then configure the container to use this registry at application startup. Does that help?

jnm2 commented 8 years ago

In my domain model assembly I'll have the domain's client interface definition, IRepository, but no reference to my persistence assembly. In my persistence assembly, I'll have a reference to the domain model assembly to implement IRepository and to return domain objects. My client assembly (the one with Program.Main and application logic) has references to both assemblies. I have my composition root here (I use Castle.Windsor) which registers concrete public repository classes from the persistence assembly for the IRepository interface. If you aren't using an IoC container, you would have the application assembly create instances of the concrete persistence assembly repository classes and pass them to the domain services that only know about the IRepository interface.
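
For what it's worth, a hedged sketch of that composition root with Castle Windsor, reusing the illustrative types from earlier (the registration details will vary with your container and lifestyles):

```csharp
using Castle.MicroKernel.Registration;
using Castle.Windsor;

public static class CompositionRoot
{
    public static IWindsorContainer Build(string connectionString)
    {
        var container = new WindsorContainer();

        // Concrete repository from the persistence assembly registered against
        // the IRepository-style interface defined in the domain assembly.
        container.Register(
            Component.For<ICustomerRepository>()
                     .ImplementedBy<DapperCustomerRepository>()
                     .DependsOn(Dependency.OnValue("connectionString", connectionString))
                     .LifestyleTransient());

        return container;
    }
}
```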

InternalsVisibleTo is not a good design here in my opinion.

valdisiljuconoks commented 8 years ago

So if I'm using pure DI, then the composition root (located in the client's assembly) would need access to the precise repository type from the persistence assembly anyway (e.g. EfRepository<T>). Either I would need to make it public or use InternalsVisibleTo.

Of course, IoC containers would help here by discovering a registry in the persistence assembly and mapping repositories to interfaces.

Just thinking at a more theoretical level, for instance in ports & adapters architectures, where the repository implementation lives physically "outside" the domain assembly: does it really matter to protect the repository implementation then? Assemblies then become only a packaging mechanism rather than an encapsulation boundary.

What are your thoughts and experience?

ardalis commented 8 years ago

It's convenient if the composition root/client assembly has access to implementation types in the infrastructure project(s). But it's not required. I blogged about a simple way to use the types without such references here: http://blog.falafel.com/use-types-from-project-without-referencing/. If that's important to you, and/or if your team lacks the discipline to not use types from Infrastructure in the client project, it can be a good approach to maintain the encapsulation you're looking for, IME.

valdisiljuconoks commented 8 years ago

I think you're right regarding the defensive approach - it's more a question of team self-discipline to use types from assemblies that are "allowed", not just from assemblies that are "available".

Thanks all!

ardalis commented 8 years ago

If you do keep the references, you could also use a tool like NDepend to detect usage of Infrastructure types from the UI/client project.

jnm2 commented 8 years ago

@ardalis It looks like I have something to learn. I wasn't aware that there was a great benefit to preventing the UI client from using infrastructure classes; in fact I thought it was its responsibility. For example, caching is applied some places but not others, and batching is used some places and not others. I'm accomplishing this by having the application logic use new CachingRepository or new BulkMergeRepository instead of the default repository type.

Is that type of access harmful for the application logic? How else does the application logic get to decide when batching and caching are used, or what is responsible to decide if not the application logic?

ardalis commented 8 years ago

@jnm2 I do think it's best if the front end/UI only works with abstractions, not implementations, and thus doesn't directly use types in Infrastructure. To answer your specific question, that's simply a matter of choosing which collaborators your system will use at runtime. Usually this is the responsibility of the IOC container. Have a look at my articles on the CachedRepository pattern to see how you would do this.

http://ardalis.com/introducing-the-cachedrepository-pattern
http://ardalis.com/building-a-cachedrepository-via-strategy-pattern

After reading those, please reply back here with what you think.

valdisiljuconoks commented 8 years ago

Regarding the caching - I still think it's mostly a pure infrastructure cross-cutting concern, and the composition root can decide where to apply caching and where not to. The original repository should not be aware of caching as such. Also, I would look at using a decorator API - this way you can simply write:

For<IRepository<Ninja>>().DecorateAllWith<RepositoryWithCache<Ninja>>();

But I guess it's just a matter of taste - the idea doesn't change.

jnm2 commented 8 years ago

Hmm. So the decision whether to cache or batch is made in the wiring of the composition root, if you have one.

  1. How would this work if in one application service or view model I wanted caching and in another I did not, for the same entity?
  2. What about a client that doesn't currently use an IoC container?
  3. In my scenario, batching is more specialized. Nothing happens when entities are saved. A BulkMergeRepository-specific method is called to flush. What abstraction should be used here? IDisposable isn't quite right. I don't see value in abstracting this anyway because using a normal write-through repository performs terribly and this other client is a command-line utility. Is this a poor design?

ardalis commented 8 years ago

  1. You can always specify in your IOC what types a particular type wants to use. I rarely do this, but it's supported. So if you have one service that wants a cached repo and another that doesn't, you can specify that. If you wanted to specify it elsewhere, like in the service itself, one option would be to use marker interfaces. ICachedFooRepository could inherit from IFooRepository but not change its signature at all. Services could request ICachedFooRepository as a way of specifying they want cached behavior (see the sketch after this list).
  2. Clients that don't use an IoC container but need to work with services that expect DI must simply instantiate the dependencies manually.
  3. This sounds like a UnitOfWork pattern. Are you familiar with it? Batching and Committing are UoW concerns and could be modeled that way.
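
Regarding point 1, a rough sketch of the marker-interface idea, using the illustrative Customer types from earlier (a caching implementation such as the CachedCustomerRepository sketched above would implement the marker so the container can map it):

```csharp
// Marker interface: identical signature to the base interface, used only so a
// consumer can declare that it wants the cached behavior.
public interface ICachedCustomerRepository : ICustomerRepository
{
}

// A service that wants caching asks for the marker interface...
public class CustomerListService
{
    private readonly ICachedCustomerRepository _customers;

    public CustomerListService(ICachedCustomerRepository customers) => _customers = customers;
}

// ...while a service that wants uncached access asks for the base interface,
// and the container maps each interface to the appropriate implementation.
public class CustomerAdminService
{
    private readonly ICustomerRepository _customers;

    public CustomerAdminService(ICustomerRepository customers) => _customers = customers;
}
```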

jnm2 commented 8 years ago

  1. I would be afraid that would introduce behavioral coupling between the wiring code and each particular type, because each type knows whether it wants caching. Marker interfaces are a good solution; in fact IBulkMergeRepository is also a good solution. I'm happy with this.
  2. Sounds good.
  3. I probably don't have a practical enough understanding of UnitOfWork yet. The DDD ideology as I understand it states that transactional boundaries should be on single aggregate roots, so IRepository.Save() already implements the only transaction I should be allowed to use. Except for this bulk merge case. In this case atomicity is not the aim; the aim is only the practical difference in speed between SqlBulkInsert and Entity Framework.

valdisiljuconoks commented 8 years ago

From my experience, it was fine to introduce this "behavioral de/coupling". It made services/view models dumb enough not to care about caching, and instead moved that responsibility to the composition root (DI registries in my case). What I personally liked about this approach is the flexibility you get: when you decide, or are forced, to implement caching for a service or view model, the only place you need to change this behavior is the composition root. The rest of the services are not affected. Will that make the code more readable and understandable? Reading the service source code, you might wonder whether it uses a cache or not. I guess Mark has already answered that.

ardalis commented 7 years ago

This was a great discussion but I'm closing the issue now.