dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License

Remove the in-memory provider #18457

Closed ajcvickers closed 1 year ago

ajcvickers commented 4 years ago

Note: the decision to do or not do this has not been made. Feedback is appreciated.

While the in-memory provider is marketed as "for testing" I think that is somewhat misleading. Its actual main value is as a simple back-end for all our internal non-provider-specific tests.

When we started EF Core, it seemed like it would also be useful for testing applications. This can work with appropriately factored test code and a good understanding of where the provider behavior is different. However, I haven't recommended this for many years, and I don't think anyone else on the team would recommend it either. The main reason is that it's too easy to get caught out by differences in provider behavior, and therefore not realize that the behavior of your test is different from the behavior in production. This can be mitigated by using a provider that is closer to what is being used in production--for example, SQLite. However, the only way to know if something really works is to test with the same code that is running in production.

So, if the only real, non-pit-of-failure approach is to not use the in-memory provider for testing, then maybe we shouldn't ship in 5.0?

On the other hand, it can work for those that understand the limitations, and testing isn't literally the only use for the in-memory provider, even though it is by far (e.g. 99ppfa) the most common.

gfoidl commented 4 years ago

I agree with the rationale. But (as a user of the in-memory provider*) I'd like to see it in 5.0. To mitigate the concerns about wrong use, one could take a few approaches.

Of course, if the cost of maintaining in-memory is too high, it may be better to just remove it. But then there are still the old NuGet packages and old documentation, which don't prevent someone from using it (with more or less success...).


* for tests where no relations, etc. are involved -- just simple inserts, reads, etc. It is easy to set up and reasonably fast for unit tests, as opposed to SQLite or SQL Server. In production the latter is used, so there are -- of course -- tests for more complex scenarios that run against a real SQL Server.

ErikEJ commented 4 years ago

I think it is the right thing to do! We are using InMemory for unit testing our microservices, but could just as easily use SQL LocalDB; in fact we already do this in a service where we use SQL Graph.

Always using SQL LocalDB would give us much more confidence in our unit tests, and minimize the number of "Swagger UI" or integration tests we need to run.

It also seems like you spend a large amount of time making the InMemory provider work with new features, time that could be well spent elsewhere.

bachratyg commented 4 years ago

Some feedback first. We've been using a custom provider on top of EF6 that mimics the quirks of a particular version of Oracle while using an in-memory store. The setup is simple: get the DbCommandTree from EF and, instead of generating SQL command text, interpret it over the in-memory storage. This gives us a huge productivity boost, since we can easily detect problems like unsupported query functions (collation mismatches, outer apply, CLOB vs VARCHAR operations, etc.), expressions that EF would not compile, and test basic correctness like constraint checks and foreign keys. Even though it's not particularly optimized, we can still run hundreds of tests in mere seconds, while spinning up an Oracle test database would literally take minutes, not to mention the overhead of maintaining specific versions.

I'm looking for something similar on top of EFCore. Would this be a good use case going forward for the in-memory provider (assuming it does not get axed)? Or would it be better to build it on top of the relational provider base? I'm not sure where to start, any pointers are welcome.

kirnosenko commented 4 years ago

It would be a pity to lose such a great feature. In-memory is much better for testing than LocalDB. It is faster and allows isolation testing, where I can push partially initialized entities into the store.

roji commented 4 years ago

For anyone commenting here, consider that the Sqlite in-memory testing option does exist as an alternative to InMemory. In cases where spinning up the actual database is too costly (e.g. Oracle), Sqlite provides a very high level of relational features (e.g. transactions) while still being extremely easy to set up and fast.
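For anyone trying that route, a minimal sketch of the setup (MyDbContext is a placeholder for your own context type): the key detail is keeping the SqliteConnection open, because the SQLite in-memory database lives only as long as the connection does.

```csharp
using Microsoft.Data.Sqlite;
using Microsoft.EntityFrameworkCore;

// The in-memory database exists only while this connection stays open,
// so open it once and share it across contexts in a test.
using var connection = new SqliteConnection("Data Source=:memory:");
connection.Open();

var options = new DbContextOptionsBuilder<MyDbContext>()  // MyDbContext: your own context
    .UseSqlite(connection)
    .Options;

using var context = new MyDbContext(options);
context.Database.EnsureCreated();  // create the schema from the model

// ... exercise 'context' in the test ...
```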

stefannikolei commented 4 years ago

Having the InMemory provider makes some tasks really easy. In my case I am using it mostly for local development when I have hangfire.io in the app, so I don't have to spin up a SQL database (it would not be too hard, but in-memory is just too easy in that case).

The other use is tests. Here I would not be concerned about switching to SQLite.

AndriySvyryd commented 4 years ago

Also note that even if we no longer ship the in-memory provider, it is still open source. So anyone that really needs it can create a fork and publish it in a different package. This would likely result in more features being added to the provider by the community as the EF team usually prioritizes them lower than the features for other providers.

ErikEJ commented 4 years ago

@AndriySvyryd So "remove" just means "do not publish on NuGet" ?

roji commented 4 years ago

@ErikEJ it would mean that we would no longer be maintaining it, making sure it's compatible with latest EF Core versions, etc. The community would have to be responsible for things like adapting InMemory for any provider-facing breaking changes.

dragnilar commented 4 years ago

I never really use the in memory provider, since it doesn't behave like a true DB. As @roji pointed out, Sqlite is an alternative, and a better one, IMO. Suffice to say, I opted to use SQLite in memory for my tests instead of bothering with the one that is the subject of this issue. You won't see any complaints from anyone on my team if you decide to go through with this change.

jespersh commented 4 years ago

I found that the InMemory provider gave me more issues than it solved, and today I ended up moving to the SQLite provider for testing. But even the SQLite provider has one glaring issue when running in memory: I couldn't run EnsureDeleted between each test.

What I'd suggest is to point the InMemory provider at SQLite in an in-memory configuration, plus an EnsureCreated run (maybe?). It might give more value this way for those looking for a quick setup, before they run into the caveats.

generik0 commented 4 years ago

@jespersh I moved to SQLite in-memory today as well, with EF Core 3. I had an issue where my SQLite was not auto-migrating, because one needs to open the connection for SQLite first... SQLite worked for the most part. I had an issue with int64 and EF bulk operations, but I switched to LocalDB for that test, and that's life I guess.

I don't know what you are using to test, but I use NUnit. I have a new context per test fixture, or after a dispose in the test. The context is shared via a thread-static placeholder. I know thread-static is not good for applications, but I think it is fine for tests :-)

So yes, I create a context with EnsureCreated, and hence auto-migration, at least once per test fixture.

Now that I have gotten through and fixed all the issues / breaking changes from EF Core and "updated" to SQLite in-memory with auto-migration, I actually feel like I have a more solid application and ORM usage.

Many things I didn't know were running client-side have been pointed out and improved to run server-side... so this is very positive :-) I will also sleep better at night. I spent a long time "arguing" with the team about the importance of testing the repository methods, and if I hadn't, it would have been HELL to find all the breaking changes / issues. Now we found and fixed them just by running the tests. Lovely ;-)

jhartmann123 commented 4 years ago

I really would hate to see this gone as the InMemoryProvider gives us so many benefits, not only from the obvious testing speed improvements, but also from the scripting/maintenance side for our build pipeline.

Unit tests:
In most cases it obviously gets used for unit tests, also by other companies. We are well aware that using the InMemoryProvider does not show the issues with the queries, like client-evaluation errors, relational issues and the like. However, in our unit testing framework you can set the database provider per test. You just mark it with [PostgreSqlTest] and voila, you've got a real PostgreSql DB backing the test. So we normally have one or two tests that run with a real DB to check that the queries basically work, and all other tests are fast InMemory tests. For the nightly builds we flip a switch so that all InMemory tests also get a real PostgreSql DB. We see the remaining issues there and still have the speed benefits during the day.

Maintenance benefits:
We also use it for other tests in the pipeline, mostly for frontend tests. The frontend has some end-to-end tests and tests which try to find accidental UI changes using BackstopJS. Both need a running backend. The backend gets started by the CI runner (in a docker container) - but it does not need a real DB behind it. The queries already got tested in the unit tests. So for those tests we just start the backend with an InMemory DB. The speed doesn't matter there, but I have to count in much less maintenance code - I don't have to create a test DB, drop it again, remove zombie DBs when the pipeline was manually stopped, and so on.

Starting a new project:
Thinking back to when we started a project where docker and postgres were new to me, the InMemory provider helped me a lot in prioritizing some tasks - I could tell the frontend devs that their files get served, but that there wasn't a DB yet. We could start the project more in parallel. Otherwise I'd have had to work many more extra hours to get the postgres infrastructure up and running in time.

Alternatives:
LocalDB isn't one. It doesn't tick a single box. We use Postgres and the migrations are targeting that. We use docker and linux. Not sure about the compatibility there. Also we don't get the speed benefit and would still be required to do all the cleanup work. I'd just use a normal Postgres-instance then.

SQLite InMemory: tried that, doesn't work. I tested it on a smaller project with about 500 unit tests. I ran all tests with EF Core InMemory, with SQLite InMemory and with a real Postgres DB.
I can live with the slower speed (about 32 secs vs 16 secs for pure InMemory tests).
I can't live with the issues it brings. I get client-evaluation exceptions that I don't get with Postgres, I get syntax errors (from normal LINQ queries!), I get exceptions due to unsupported features (SQLite cannot order by expressions of type 'DateTimeOffset').
I simply do not want to fight with a database that we never use.

Conclusion:
We're all aware of the issues the InMemory provider brings, and because of that, we can handle them properly. The awareness of the issues is a pretty important part. But the benefits it brings us are so much bigger - I definitely have to vote for continued support on that side :)

davidroth commented 4 years ago

Please don't remove it. I fully agree with @jhartmann123's comment. If you know its limitations, the InMemory provider is a very easy to use and extremely fast solution for running in-memory tests. SQLite has its own set of problems and is considerably slower (e.g. DateTime issues, NodaTime issues, and others). Running all tests against a real DB is too slow for rapid pull-request-based iteration workflows with hundreds / thousands of tests, and not really necessary.

Most of our code is well factored in simple cqrs style handlers where we can very easily decide if a given handler can be safely tested in-memory, or if we need a full db test. We decide on a per-test basis.

If there is lots of advanced query stuff or db specific query code to test (transactions, sequences, locking behavior, ...) we can run the specific handler test with a real db engine in the background.

However, from my observation, most handlers in our apps use very basic EF queries (single, where, or basic joins), where we know from experience that the InMemory provider works reliably enough and is fast. We mostly want to test the business logic of the handler code in combination with selecting the data, while using EF with its basic repository, unit-of-work and identity-tracking behavior. The queries which provide the data for the remaining handler code are in many cases so basic that there is no need for a real relational provider/DB. The speed outweighs the potential (low) risk of missing an error because of not using a real DB (our experience in our projects).

Beside from the testing advantages, InMemory is a great tool for experimenting with EF-Core behavior in a very fast REPL-mode.

I have lost count of the times I quickly opened a LINQPad session and created some DbContext+model code to experiment with a new idea or logic, or generally test some EF features. InMemory gives me this rapid-feedback mode without fighting SQLite store limitations (e.g. DateTimeOffset support etc.). Yes, you can work around some of these limitations by using value converters or other mechanisms. But it makes things much more complicated than needed.

So strong +1 from my side for keeping the very useful in-memory provider!

grendo commented 4 years ago

Since version 3, Mode=Memory does not seem to work any more. I make extensive use of it as an in-memory cache that dies when the process goes. It is very convenient and has enabled me to provide SQL access to an application cache, which was cool. I suppose I can always create a DB in temp space and delete it on dispose if I have to, but I would prefer the in-memory model. Currently I have locked my apps at 2.2.6 and am not upgrading. I think I would look at switching to System.Data.SQLite as an alternative if this feature does not come back. It also makes testing a snap.

ajcvickers commented 4 years ago

@grendo Where are you using Mode=Memory?

jbogard commented 4 years ago

My vote is for removing. There are far too many differences between ANY in-memory database and one that writes to disk. Transactions are the biggest one, of course.

Just because in-memory is faster does not make it a good substitute for the real thing. Our integration tests each execute multiple transactions, exposing bugs that would only be caught with real transactions and not just writing to memory.

There is a performance/speed hit, but the tradeoff is more correct tests vs. speed.

floyd-may commented 4 years ago


I worry that this is throwing the baby out with the bathwater. I find a lot of value in the in-memory provider. Yes, of course, there are differences between its behavior and a Real Database ™️ but that's why we have the testing pyramid. It's a mock, no different than any other mock. Just because mocks have limitations doesn't mean they're worthless.

jbogard commented 4 years ago

@floyd-may That's....not accurate. In-memory does much, much less than a real LINQ query provider. It doesn't translate to SQL, doesn't hit the database, doesn't do transactions. That's the bulk of what the real provider does.

I never use in-memory anything for this exact reason, nor do I ever mock out the database to begin with.

That doesn't even begin to get into "what if the best solution is a SQL query", in-memory completely falls down there even though it's exposed in SQL.

Removing this will make some people upset, but their tests (and design) will be better for it.

davidroth commented 4 years ago

Just because in-memory is faster does not make it a good substitute for the real thing. Our integration tests each execute multiple transactions, exposing bugs that would only be caught with real transactions and not just writing to memory.

Nobody is promoting InMemory for integration testing in this thread. We also use the real db for integration testing when transactions are involved or when db specific behavior is significant enough. The fact that using InMemory provider for these kind of tests is suboptimal, does not make its other use-cases less useful.

There is a performance/speed hit, but the tradeoff is more correct tests vs. speed.

Not if you know its limitations and use it for unit tests where its usage is appropriate. We see big speed diff between InMemory and real DB tests when these kind of tests sum up.

I never use in-memory anything for this exact reason, nor do I ever mock out the database to begin with.

That's totally fine. But we and many others have different projects and experiences, and it makes sense for us to use InMemory and the real DB side by side in our projects.

but their tests (and design) will be better for it.

That's just a generalization, not a solid argument for removing it IMO. You cannot extrapolate from your projects to everyone else.

davidfowl commented 4 years ago

Please keep the in memory provider and bake it into EF, it made my todo app that much easier for beginners to the stack (https://github.com/davidfowl/Todos/blob/a32c64b43a46ef47244f7c51b570950fcc7b002d/TodoBasic/TodoDbContext.cs#L11) and I don't need to copy all of those crazy RID folders for SQLite
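For reference, wiring a context up to the InMemory provider really is minimal (a sketch; TodoDbContext and the database name follow the linked sample, and the provider lives in the Microsoft.EntityFrameworkCore.InMemory package):

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// No connection string, no native binaries - the named store lives
// in process memory for the lifetime of the service provider.
services.AddDbContext<TodoDbContext>(options =>
    options.UseInMemoryDatabase("Todos"));
```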

jbogard commented 4 years ago

@davidfowl why not make the beginner story better with a "real database" like LocalDb?

vernou commented 4 years ago

That's....not accurate. In-memory does much, much less than a real LINQ query provider. It doesn't translate to SQL, doesn't hit the database, doesn't do transactions. That's the bulk of what the real provider does.

The SQLite provider doesn't have transactions, and some NoSQL providers don't translate to SQL. Aren't SQLite and the Cosmos provider real providers?

jbogard commented 4 years ago

@davidroth I can understand folks have had some success with in-memory - we did too. We ultimately abandoned any such efforts on our projects because the amount of incorrectness outweighed the correctness. LINQ providers are crazy complex - and insufficiently approximated by dropping down to in-memory execution.

Ultimately, we found that in-memory compromised and influenced our designs - we would try to fit solutions into what was possible/impossible, so our efforts were far better spent designing our systems in such a way where the read and write paths were naturally isolated from business logic, rather than stubbed/mocked/faked.

jbogard commented 4 years ago

@Orwel they're real providers if they talk to the actual underlying database. I wouldn't swap SQLite for SQL for tests, though.

I did, years ago, and ultimately it was a mistake.

davidfowl commented 4 years ago

@davidfowl why not make the beginner story better with a "real database" like LocalDb?

Because it doesn't work on linux or OSX.

floyd-may commented 4 years ago

I'd be lying if I didn't admit that yes, misuse or overuse of the in-memory provider without testing against an actual database will likely cause problems. But guess what? Pretty much every technology out there has the opportunity to be a footgun. Mocks can be dangerous, yet I'd be pretty naive to automatically assume that a project which uses the Moq library has design issues.

The value of the in-memory provider that I've found is because EF client code often needs to have some complexity in it. Filtering, ordering, grouping, entity tracking, and so on - there's a lot of business logic that is, honestly, best put into EF code for many, many use cases. I can very rapidly cover that kind of logic using the in-memory provider, with the full knowledge that, yes, I need to validate it against a real database too. The in-memory provider covers a huge majority of EF use cases. Having fine-grained, fast tests that give me good locality of failure helps me get a majority of the coverage I need, then, I can cover the database-specific stuff with separate, slower, Real Database ™️ tests too. I think the balance between how much of which kind of test depends largely on the use cases of the application itself. The in-mem provider may be a poor choice for testing some applications, but certainly not all applications.

I absolutely agree with @gfoidl on adding some very clear caveats to the docs, and I'd be glad to help contribute to the docs as well.

vfabing commented 4 years ago

Following the test pyramid, I run my in-memory DB integration tests more often, along with my unit tests (on each push, on pull requests, live in VS), but ultimately only allow a merge once the integration tests against the SQL DB (which take significantly more time) have run successfully.

Fail fast, learn faster on that one?

On the other hand:

Would be glad to help if needed :)

isen-ng commented 4 years ago

I agree that the InMemory provider should be kept. However, I would propose some breaking changes: the InMemory database should be fitted with restrictive defaults.

These defaults, however, should be allowed to be explicitly turned off by the user.

e.g.:

    options.UseInMemory("MyDatabase")
        .WithCaveats(c => {
            c.ThisIsNotARealDb = Acknowledged;
            c.TurnOffClientEvalErrors = true;
        });
smitpatel commented 4 years ago

It should maintain EF Core features for all databases, like the client eval errors

InMemory has client eval errors too, for things InMemory cannot evaluate on the client. Just like any other provider: when a provider fails to evaluate something on the server, the exception is thrown.

dasMulli commented 4 years ago

Please keep the InMemory provider!

It is a real life and time (!) saver for testing. But not for testing data access logic!

Most of the arguments against the InMemory provider seem to be "it doesn't behave like a real database (provider)". And it never claimed to. Neither does any IRepository<T> / IUnitOfWork abstraction layer people have written to cut out complicated test setup when they want to test business logic. People did this with EF6, only to find out that their mock setups for their abstraction layers did not behave like the O/R engine for even something as "simple" as updating entity relationships on SaveChanges().

EF Core finally gave us the chance to just plain out delete all this complex code and use DbSet<T> instead of hand-rolled and half-heartedly faked IRepository<T> implementations.

In short, InMemory steps in when you want the behaviour of EF Core as part of your test but not the actual data access logic.

So, if the only real, non-pit-of-failure approach is to not use the in-memory provider for testing, then maybe we shouldn't ship in 5.0?

It is not your responsibility to safeguard all users. InMemory is a great tool in a toolbox that solves a specific set of problems. Not all.

This is similar to discussions about newer C# language features like default interface methods - yes, you can do horrible things with them, but that's not a reason not to implement them if they solve a specific set of problems. (And I could go on about how horribly code can be written with just if statements alone.)

jukkahyv commented 4 years ago

Before the InMemory provider, we had an interface for the database that exposed LINQ queries and CRUD operations. Obviously the InMemory provider is better than that, because at least you can test navigations and lazy loading with it.

I've also done LocalDB tests, and while those are closer to production, they are also much, much slower. If you have tests that write to database you'll need to clear it between all tests. In my experience you can test 95% of things just fine with the InMemory provider. For the remaining 5% you can create LocalDB tests.

It's been a while since I last tried SQLite, but it seems it still has some pretty serious limitations, so it's not exactly a replacement for LocalDB.

InMemory provider is one of the main reasons I've been recommending EF Core instead of EF 6. I find it absolutely invaluable.

AndriySvyryd commented 4 years ago

If you have tests that write to database you'll need to clear it between all tests.

You can usually use a transaction per test and roll it back.
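A sketch of that pattern as a test base class (MyDbContext is a placeholder for your own context type; BeginTransaction assumes a relational provider):

```csharp
using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Storage;

// One transaction per test: everything the test writes is rolled back
// when the transaction is disposed without Commit(), leaving the
// database clean for the next test.
public abstract class TransactionalTestBase : IDisposable
{
    protected readonly MyDbContext Context;   // MyDbContext: your own context

    private readonly IDbContextTransaction _transaction;

    protected TransactionalTestBase(DbContextOptions<MyDbContext> options)
    {
        Context = new MyDbContext(options);
        _transaction = Context.Database.BeginTransaction();
    }

    public void Dispose()
    {
        _transaction.Dispose();  // no Commit() was called, so this rolls back
        Context.Dispose();
    }
}
```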

jukkahyv commented 4 years ago

If you have tests that write to database you'll need to clear it between all tests.

You can usually use a transaction per test and roll it back.

Yes - as long as your code doesn't use transactions. Otherwise you'll need to write your own transaction provider or deal with transaction scopes. Nothing impossible, but still something you don't need to worry about with the InMemory provider. In general I'd just want to ignore transactions in tests. If there's an exception and the transaction would be rolled back, the test fails.

xsoheilalizadeh commented 4 years ago

The EF Core in-memory provider is for unit testing. Yes, it's not like a real database - that's exactly what unit testing needs. We resolved our mocking concerns with the repository pattern before, which brings additional complexity, while EF Core already provides the abstraction through DbSet. Using it in integration testing is a bad use case, but like other programming mistakes, that doesn't mean we should remove it.

tpetrina commented 4 years ago

Tests using in-memory provider are unit tests, not integration tests. It is extremely valuable and the alternative is...terrible.

Besides, you can literally rerun the same tests against the in-memory provider, SQL-in-a-docker, Postgres, etc. So you get the benefit of properly mocking the context when the point of the test is the logic, with the ability to switch to the real provider once you are happy.

irperez commented 4 years ago

We have an application with very intense business logic. So much so that we run over 50k unit tests prior to a commit. In the past we attempted to write those tests as integration tests. The test run would be hours long. NOT FEASIBLE FOR CI! Yes, there is a place for integration tests. Test Pyramid.

When .NET Core came out I was doubting the adoption of EF Core, but the in-memory provider and the unit testability are what sold me. We are able to run so many unit tests prior to commit. It is an enormous time saver. Removing such a feature would force our team to move to another solution for data access. We are not using the in-memory provider to test CRUD, but rather to test higher-level business logic - and on Linux/Kubernetes at that.

Keep in mind this removal would be an enormous breaking change for users who have spent a number of years trusting you won't take away their testing capabilities. By removing such a feature, you're also saying to the community that you can't be trusted with such a low-level API. We need these APIs to be stable, not subject to breaking changes. We were promised minimal to no breaking changes going forward. We took a risk using EF Core. Please do not reward us by removing this feature.

For those that don't want to use in-memory provider, Don't! But don't take it away from the developers that have come to depend on it.

ajcvickers commented 4 years ago

TL;DR:

We really appreciate the great discussion that this issue has generated. I think it is pretty clear that while to some people the in-memory provider is not needed, to others it is essential to their development flow. Also, telemetry from VS and VS Code indicates that many developers are referencing the in-memory provider assembly/NuGet package. Therefore, we are not going to remove the in-memory provider.

We have already begun updating our testing guidance to emphasize the trade-offs involved in different testing approaches. We're also going to update this guidance with some samples of different techniques for using the in-memory provider and other providers effectively in tests.

Some features in the in-memory provider (especially query translations) can be very expensive relative to the value they bring. We may de-emphasize these kinds of fixes to free up more resources for other areas. However, we will also consider feedback and will still consider making fixes for things where the demand is high.

hauntingEcho commented 3 years ago

For anyone else reaching this topic while investigating the in-memory database for testing - there is a dotnet implementation of Java's fantastic TestContainers library, which manages per-test docker containers for your database (or anything else you're integrating with that can be dockerized) and will keep the behavior closer to your database of choice.
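A sketch of what that looks like with the .NET port (assuming the Testcontainers.PostgreSql module; the builder API may vary between versions):

```csharp
using Testcontainers.PostgreSql;

// Spin up a disposable Postgres instance in Docker for the test run,
// then point EF Core (or anything else) at its connection string.
await using var postgres = new PostgreSqlBuilder().Build();
await postgres.StartAsync();

var connectionString = postgres.GetConnectionString();
// e.g. new DbContextOptionsBuilder<MyDbContext>().UseNpgsql(connectionString)

// The container is stopped and removed when 'postgres' is disposed.
```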

sebastianstudniczek commented 1 year ago

Some features in the in-memory provider (especially query translations) can be very expensive relative to the value they bring. We may de-emphasize these kinds of fixes to free up more resources for other areas. However, we will also consider feedback and will still consider making fixes for things where the demand is high.

Would you use it in unit testing?