dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.66k stars 3.15k forks source link

DbContextPoolingTest.Concurrency_test fails consistently on my machine #32700

Open ajcvickers opened 8 months ago

ajcvickers commented 8 months ago

Failed tests 11.4838529s✘ Microsoft.EntityFrameworkCore.DbContextPoolingTest.Concurrency_test(useInterface: False, async: False) Assert.InRange() Failure: Value not in range\r\nRange: (32 - 64)\r\nActual: 28 at Microsoft.EntityFrameworkCore.DbContextPoolingTest.Concurrency_test(Boolean useInterface, Boolean async) in D:\code\efcore\test\EFCore.SqlServer.FunctionalTests\DbContextPoolingTest.cs:line 2085 --- End of stack trace from previous location --- Output: [17:04:43.439] Requests: 2, RPS: 2 [17:04:44.557] Requests: 24, RPS: 20 [17:04:45.689] Requests: 88, RPS: 57 [17:04:46.851] Requests: 164, RPS: 65 [17:04:48.222] Requests: 265, RPS: 74 [17:04:49.597] Requests: 396, RPS: 95 [17:04:50.699] Requests: 632, RPS: 214 [17:04:51.798] Requests: 882, RPS: 227 [17:04:52.862] Requests: 1176, RPS: 276

Average RPS: 111 11.4982871s✘ Microsoft.EntityFrameworkCore.DbContextPoolingTest.Concurrency_test(useInterface: True, async: False) Assert.InRange() Failure: Value not in range\r\nRange: (32 - 64)\r\nActual: 24 at Microsoft.EntityFrameworkCore.DbContextPoolingTest.Concurrency_test(Boolean useInterface, Boolean async) in D:\code\efcore\test\EFCore.SqlServer.FunctionalTests\DbContextPoolingTest.cs:line 2085 --- End of stack trace from previous location --- Output: [17:04:55.002] Requests: 13, RPS: 12 [17:04:56.068] Requests: 86, RPS: 68 [17:04:57.117] Requests: 162, RPS: 72 [17:04:58.246] Requests: 276, RPS: 101 [17:04:59.380] Requests: 421, RPS: 128 [17:05:00.567] Requests: 573, RPS: 128 [17:05:01.674] Requests: 757, RPS: 166 [17:05:02.734] Requests: 1050, RPS: 276 [17:05:03.827] Requests: 1332, RPS: 258 [17:05:04.879] Requests: 1571, RPS: 227

Average RPS: 144

AndriySvyryd commented 8 months ago

Those values look really low. On what CPU is this?

ajcvickers commented 8 months ago

@AndriySvyryd Threadripper Pro 5975WX.

AndriySvyryd commented 8 months ago

Do they only fail when running in parallel with other tests? Perhaps we should disable parallel execution for these.

ajcvickers commented 8 months ago

I don't see anything in the tests that ever made this be reliable, regardless of parallel execution. The code executes 32 new tasks. Each one of those tasks takes a context from the pool, uses it and returns it to the pool. At one extreme, the first task manages to get all the way through this before any of the other tasks have had a chance to get a context from the pool. This means that the context is returned to the pool and the next task will reuse it. So the total number of contexts in the pool at the end will be 31, because two tasks made use of the same context instance. If three tasks use the same context instance, then the count will be 30, and so on. I've never seen anything lower than 22, and it would be pretty indicative of an error if we ever got to 1, so I opted to have a low number but not 1.

If we want to constrain the high bound, then we need to ensure that other tests don't use the same pool, but that won't change the lower bound.

Except I'm probably wrong--tell me how!

AndriySvyryd commented 8 months ago

Well, if you really want to make it reliable while still have a meaningful assert the context-specific task should increase a counter and block until the counter reaches 32. Then wait until all instances are returned to the pool and repeat the first step to make sure that the pooled instances are reused. After this there should be exactly 32 instances.

ajcvickers commented 8 months ago

Do you want me to do this? Doesn't seem to add much value.

AndriySvyryd commented 8 months ago

It would catch bugs in Rent, Return and pool defaults. Sounds like good ROI

ajcvickers commented 8 months ago

Okay. Then I'll put this back on the backlog and send out a new PR to just disable the test for now, since its assertions are incorrect.