InMemory: Improve in-memory key generation

ardalis commented 7 years ago

The Issue

For testing purposes, you should be able to delete, recreate, and reseed InMemory databases and the result should be the same for each test. Currently identity columns do not reset, so IDs increment with each test iteration.

Repro Steps

This test fails. It should pass.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Xunit;

namespace EfCoreInMemory
{
    public class Item
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }
    public class AppDbContext : DbContext
    {
        public DbSet<Item> Items { get; set; }

        public AppDbContext(DbContextOptions<AppDbContext> options ):base(options)
        {
        }
    }
    public class Tests
    {
        private static DbContextOptions<AppDbContext> CreateNewContextOptions()
        {
            // Create a fresh service provider, and therefore a fresh 
            // InMemory database instance.
            var serviceProvider = new ServiceCollection()
                .AddEntityFrameworkInMemoryDatabase()
                .BuildServiceProvider();

            // Create a new options instance telling the context to use an
            // InMemory database and the new service provider.
            var builder = new DbContextOptionsBuilder<AppDbContext>();
            builder.UseInMemoryDatabase()
                   .UseInternalServiceProvider(serviceProvider);

            return builder.Options;
        }

        [Fact]
        public void Test1()
        {
            // create a brand new dbContext
            var dbContext = new AppDbContext(CreateNewContextOptions());

            // add one item
            var item = new Item();
            dbContext.Items.Add(item);
            dbContext.SaveChanges();

            // ID should be 1
            Assert.Equal(1, item.Id);

            dbContext.Database.EnsureDeleted();

            Assert.False(dbContext.Items.Any());

            var item2 = new Item();
            dbContext.Items.Add(item2);
            dbContext.SaveChanges();

            // ID should STILL be 1
            Assert.Equal(1, item2.Id);

        }
    }
}

Further technical details

Project.json:

{
  "version": "1.0.0-*",
  "testRunner": "xunit",
  "dependencies": {
    "NETStandard.Library": "1.6.0",
    "xunit": "2.2.0-beta2-build3300",
    "dotnet-test-xunit": "2.2.0-preview2-build1029",
    "Microsoft.AspNetCore.Server.Kestrel": "1.0.0",
    "Microsoft.AspNetCore.TestHost": "1.0.0",
    "Microsoft.EntityFrameworkCore": "1.0.0",
    "Microsoft.EntityFrameworkCore.InMemory": "1.0.0"
  },
  "frameworks": {
    "netcoreapp1.0": {
      "imports": [
        "dotnet5.6"
      ],
      "dependencies": {
        "Microsoft.NETCore.App": {
          "type": "platform",
          "version": "1.0.0"
        }
      }
    }
  }
}

VS2015

robertmclaws commented 7 years ago

I don't know if I agree with this exact behavior, mostly because I'm having a hard time envisioning a unit test or operational scenario where you'd want to delete an existing item, then add a new item with the same ID.

HOWEVER, I do see the need for a command like dbContext.ResetIdTracking() that frees up any IDs not presently in the change tracker. You could then in this command at the end of each test function to reset the context for the next test.

ardalis commented 7 years ago

The repel shows one test. The problem exhibits itself when you have many tests. Every test will end up with different IDs - there is now way to really reset the in memory db.

Steve Smith

On Oct 26, 2016, at 15:26, Robert McLaws (Microsoft MVP) notifications@github.com wrote:

I don't know if I agree with this exact behavior, mostly because I'm having a hard time envisioning a unit test or operational scenario where you'd want to delete an existing item, then add a new item with the same ID.

HOWEVER, I do see the need for a command like dbContext.ResetIdTracking() that frees up any IDs not presently in the change tracker. You could then in this command at the end of each test function to reset the context for the next test.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

ardalis commented 7 years ago

repro not repel

ajcvickers commented 7 years ago

Note for triage: dupe of #4096.

rowanmiller commented 7 years ago

The InMemory provider doesn't the Identity pattern you would get in a relational database. When a column is configured as ValueGeneratedOnAdd it uses values from a single generator for all tables. This would be similar to having all tables setup to use a single sequence in a relational database. This raises one of the important design principles of our InMemory provider, which is that we are not trying to emulate a relational database. This becomes more important as we start to light up non-relational providers.

If you want to have a database that acts like a relational database, but doesn't have the overhead of I/O, then we'd recommend using an In-Memory SQLite database - http://www.sqlite.org/inmemorydb.html.

We're going to update our testing docs to make the SQLite option more prominent.

ardalis commented 7 years ago

Thanks, @rowanmiller

2 things I'd like to see that I think would be useful:

A way to 100% reset the in-memory database, to use in testing. I don't care whether it behaves like a relational database or not - for test purposes I want a way to reset its state completely.
A testing sample that shows not just how to test EF directly, but how to test an app (like an ASP.NET Core app) that is using EF, resetting the database after each test. The current guidance on how to do this is not trivial to achieve using ASP.NET Core's standard Startup.ConfigureServices code used to configure EF.

Thanks!

rowanmiller commented 7 years ago

@ardalis fair point on point 1, I'm re-opening this issue for us to re-discuss.

Regarding point 2, our docs show an example that tests a BlogService, which would be the same as testing an MVC controller etc. - https://docs.efproject.net/en/latest/miscellaneous/testing.html. For testing, you generally wouldn't be using Startup.cs to setup all your services, you'd be creating your test doubles of everything the controller depends on and passing them in. The example I linked to shows how to do that for EF, but you'd have to do it for the other dependencies too.

robertmclaws commented 7 years ago

I do want to point out that I made a suggestion on how to handle what @ardalis is asking for in point 1 with a command like dbContext.ResetIdTracking(), or something to that effect.

ardalis commented 7 years ago

@rowanmiller You're right, unit testing an individual controller action works just like your BlogService sample. To clarify, I'm talking about integration testing using WebHostBuilder, as shown here: https://docs.asp.net/en/latest/testing/integration-testing.html. In this case, there's no direct access to the controller action. The web host is constructed for each test and the request is made such that it goes through the full MVC stack (routing, filters, etc). This is an awesome new feature of ASP.NET Core but getting the data reset for each test has proven much more difficult than I would like. To be fair, I haven't yet resorted to using SqlLite for this purpose, but my hope has been to avoid the need for this since it adds complexity to the process. Thanks.

ardalis commented 7 years ago

...and regarding that link, you'll see it doesn't include any EF examples. That's because I couldn't figure a great way to include them such that they were reset between each test.

ardalis commented 7 years ago

Here's a complete example showing how to (try to) configure an ASP.NET Core app using WebHostBuilder and InMemory for integration tests: https://github.com/ardalis/CleanArchitecture/blob/master/src/CleanArchitecture.Web/Startup.cs#L73-L104

The sample shown works, but starts to fall down as you add more entities and tests to it, as test data starts to step on other test data. One workaround I was able to use was hardcoding the IDs of test data, as suggested in another issue. However, any guidance or assistance with how to reset data for each integration test like this would be appreciated.

rowanmiller commented 7 years ago

@ardalis what if you used a throw away instance of the InMemory database for each test? There is an overload of UseInMemoryDatabase that takes a string (the named instance to connect to).

ardalis commented 7 years ago

I tried that but the numbers still incremented every time. I ended up working around it by hard-coding the initial ID. I had tried this:

        services.AddDbContext<AppDbContext>(options =>
            options.UseInMemoryDatabase(Guid.NewGuid().ToString()));

But even with that in place, and with disposing of the TestServer after every test, the ID numbers would continue incrementing between tests (which made it hard for me to construct API calls like /item/{id} when I couldn't count on a valid test ID to use).

Again this was fixed by specifying the ID in my test data method. A working version of what I have currently can be found here: https://github.com/ardalis/ddd-guestbook/tree/Lab5Start

rowanmiller commented 7 years ago

Note for triage: One idea we had in triage was replacing some services to make the value generator resettable, but it involves copying a lot of code because the value generator is cached - and can't be replaced. You have to re-implement InMemoryValueGeneratorSelector to keep a handle of the created generators (so that you can reset them) and then re-implement InMemoryIntegerValueGeneratorFactory and InMemoryIntegerValueGenerator to make them resettable. You have to copy the code, as the public surface doesn't give you the hooks you need.

ajcvickers commented 7 years ago

@rowanmiller Does this work:

modelBuilder
    .Entity<Blog>()
    .Property(e => e.Id)
    .HasValueGenerator<InMemoryIntegerValueGenerator<int>>();

ajcvickers commented 7 years ago

@ardalis @robertmclaws Here is some code that might work for you:

public static class DbContextExtensions
{
    public static void ResetValueGenerators(this DbContext context)
    {
        var cache = context.GetService<IValueGeneratorCache>();

        foreach (var keyProperty in context.Model.GetEntityTypes()
            .Select(e => e.FindPrimaryKey().Properties[0])
            .Where(p => p.ClrType == typeof(int)
                        && p.ValueGenerated == ValueGenerated.OnAdd))
        {
            var generator = (ResettableValueGenerator)cache.GetOrAdd(
                keyProperty,
                keyProperty.DeclaringEntityType,
                (p, e) => new ResettableValueGenerator());

            generator.Reset();
        }
    }
}

public class ResettableValueGenerator : ValueGenerator<int>
{
    private int _current;

    public override bool GeneratesTemporaryValues => false;

    public override int Next(EntityEntry entry)
        => Interlocked.Increment(ref _current);

    public void Reset() => _current = 0;
}

To use, call context.ResetValueGenerators(); before the context is used for the first time and any time that EnsureDeleted is called. For example:

using (var context = new BlogContext())
{
    context.ResetValueGenerators();
    context.Database.EnsureDeleted();

    context.Posts.Add(new Post {Title = "Open source FTW", Blog = new Blog {Title = "One Unicorn"}});
    context.SaveChanges();
}

No matter how many times I call this code, Blog.Id and Post.Id will always get the value 1.

The code works by finding every int primary key in the model and setting the cached value generator to a ResettableValueGenerator, or resetting that value generator if it has already been created. It can be easily adapted for other key types.

divega commented 7 years ago

Moving to backlog to consider:

Automatically resetting value generators when in-memory database is dropped (value generators would need to be associated with the database instead of the model)
Add identity behavior on the in-memory database (and stop emulating it with client-side value generation)
Add an explicit API to reset value generation in the product.

brockallen commented 7 years ago

FYI I ran into this today where the state of the in-mem database is preserved across each run of my unit tests (and each test creates a whole new DI system per test with a new ServiceCollection, etc.). I can't imagine I'd ever want to preserve the old data across each test, since unit tests don't run in a consistent order, and they might run in parallel.

This behavior seems like a strange choice by default if this is being positioned for unit testing. Why isn't the in-mem database just a singleton when added to DI?

As a workaround I'm using Guid.NewGuid() to set the database name on each run, but it feels uncomfortable to have to do so.

ErikEJ commented 7 years ago

@brockallen The InMemory provider is not a relational database provider, current recommendation is to use SQLite (or maybe SQL Compact?) instead: https://docs.microsoft.com/en-us/ef/core/miscellaneous/testing/

brockallen commented 7 years ago

@ErikEJ I'm confused how that relates. I don't care what DB is used -- IMO, between tests it should start from scratch/empty.

RichiCoder1 commented 7 years ago

bump Ran into this ourselves. I'm also baffled that this is the default behavior. What is the intended use case of InMemory if not testing?

spottedmahn commented 6 years ago

@rowanmiller

We're going to update our testing docs to make the SQLite option more prominent.

I started with the SQLite user guide but quickly stopped once I learned schemas are not supported.

james-mint commented 6 years ago

So just to clarify, when you spin up multiple in memory db contexts (as you would in unit testing) all identity fields share the same generator so that when I create a record in TableA in one database I get ID = 1, I then create another record in TableA in a second db and I get ID = 2? If so why would you ever want that? It's not replicating real db behavior which would maintain identities separately.

brockallen commented 6 years ago

Since this was bumped, any update on it? Or simply commentary on the rationale?

ajcvickers commented 6 years ago

@james-mint Different databases and configurations will generate values in different ways. The in-memory database is not intended to try to mimic any of these, but rather to just implement generated values in some way that fits the EF contract. If your tests rely on specific database generated values, then your tests are implicitly coupled to what a given database will do. This is a pretty dangerous thing to do when testing with the in-memory provider since there is no guarantee it will ever work in the same way as any given database backend.

ajcvickers commented 6 years ago

@brockallen Can you be more specific about what you are asking?

brockallen commented 6 years ago

Either will there be a fix, or if not then why was it done this way? I've still not been convinced, assuming unit testing is the motivation, why there is mutable shared state maintained across test boundaries.

ajcvickers commented 6 years ago

@brockallen Open in the backlog milestone means that we do intend to implement this, but it is not scheduled for an upcoming release.

The motivation was that store-generated IDs can have any value and that tests should not rely on a specific value. However, it is agreed in this issue that we will at least couple it to the database instance, rather than make it global for all databases.

brockallen commented 6 years ago

Ok, thanks for the update.

ajcvickers commented 6 years ago

When working on this issue, consider the suggestion in #9683 to make the key generator smarter.

crhairr commented 6 years ago

We ran into this issue yesterday. To work around it, I just added a single line to replace the IValueGeneratorCache at the beginning of each test:

            var db = new DbContextOptionsBuilder();

            db.UseInMemoryDatabase(Guid.NewGuid().ToString())
                .ReplaceService<IValueGeneratorCache, InMemoryValueGeneratorCache>(); //this line

It would still be nice if this wasn't necessary, as it's not always straightforward to determine the cause of id-related failures.

SimonCropp commented 6 years ago

any update on this one?

ajcvickers commented 6 years ago

This issue is in the Backlog milestone. This means that it is not going to happen for the 2.1 release. We will re-assess the backlog following the 2.1 release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources.

ajcvickers commented 6 years ago

Note: consider discussion in #10940 when implementing this.

SimonCropp commented 6 years ago

so i needed to work around this enough times that i wrapped the workaround in a nuget https://github.com/SimonCropp/EfCore.InMemoryHelpers

camainc commented 6 years ago

Just an FYI:

The code example by @ajcvickers works if you do not have any query types defined in your DbContext. If you do, might get a null reference error when it tries to find the primary key (my query result objects have no primary key defined).

I had to change this line:

foreach (var keyProperty in context.Model.GetEntityTypes()

to this:

foreach (var keyProperty in context.Model.GetEntityTypes().Where(x => !x.IsQueryType)

to make the code work.

SimonCropp commented 6 years ago

@camainc thanks. verified with a test and added a fix. shipped in v 1.1.2 of EfCore.InMemoryHelpers

irperez commented 6 years ago

A lot of the examples above are working with integers for the identity generation. Would it be possible to also consider an implementation for in-memory sequential guid generation as part of this out of the box? I'm not sure if anyone expressed interested in this or not. But this would be very useful for our company.

ajcvickers commented 6 years ago

@irperez There shouldn't be any reason you can't use SequentialGuidValueGenerator with the in-memory database.

chamikasandamal commented 5 years ago

Any update on this?

ajcvickers commented 5 years ago

This issue is in the Backlog milestone. This means that it is not going to happen for the 2.2 or 3.0 releases. We will re-assess the backlog following the 3.0 release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources.

Talegen commented 5 years ago

Still running into this. Please make the identity generation smarter as mentioned in related issues.

We want to set the identity of values to a specific number (stored in constants) in our initial test data so that we can use those constant Ids as inputs to our xUnit test methods.

We run into the issue with these values colliding with the InMemory database incrementor tracker.

The simple solution in my mind is have that tracker check the existing records first, get top most Id, and increment + 1 from that value.

This behavior is found in IDENTITY columns within SQL Server I do believe.

If I've hard-coded a record with Id=1, when adding another record with no Id specified, the incrementor should get records.

var newId = dbContext.Records.OrderByDescending(r => r.Id).Select(r => r.Id).FirstOrDefault() + 1;

Obi-Dann commented 5 years ago

https://github.com/aspnet/EntityFrameworkCore/issues/6872#issuecomment-258025241 is great (thanks @ajcvickers), but... turns out that it does not work with tests running in parallel.

I've ended up fixing it by providing a custom IServiceProvider to UseInternalServiceProvider that declares IValueGeneratorCache as instance per test run.

There's an example building the custom IServiceProvider using Autofac. Autofac supports named scopes making it reasonably easy to set up:

public class Test : IDisposable
{
    static readonly Lazy<IContainer> container = new Lazy<IContainer>(() =>
    {
        var collection = new ServiceCollection()
            .AddEntityFrameworkInMemoryDatabase();

        // replacing IValueGeneratorCache lifetime from singleton to instance per test run
        var valueGeneratorCacheDescriptor = collection.First(x => x.ServiceType == typeof(IValueGeneratorCache));

        collection.Remove(valueGeneratorCacheDescriptor);

        var builder = new ContainerBuilder();
        builder.Populate(collection);

        builder
            .RegisterType(valueGeneratorCacheDescriptor.ImplementationType)
            .InstancePerMatchingLifetimeScope("testrun")
            .As(valueGeneratorCacheDescriptor.ServiceType);

        return builder.Build();
    });

    readonly ILifetimeScope scope;
    readonly DbContext dbContext;

    public Test()
    {
        scope = container.Value.BeginLifetimeScope("testrun");

        var serviceProvider = scope.Resolve<IServiceProvider>();

        var optionsBuilder = new DbContextOptionsBuilder<CoreDbContext>();
        optionsBuilder.UseInMemoryDatabase(Guid.NewGuid().ToString("D"));
        optionsBuilder.UseInternalServiceProvider(serviceProvider);

        dbContext = new DbContext(optionsBuilder.Options);
    }

    public void TestRun()
    {
        // ...
    }

    public void Dispose()
    {
        dbContext?.Dispose();
        scope?.Dispose();
    }
}

ajcvickers commented 5 years ago

dotnet / efcore