vierge-noire / cakephp-fixture-factories

CakePHP Fixture Factories
https://vierge-noire.github.io/
MIT License
85 stars 20 forks source link

Recurrent PersistenceExceptions thrown in factories due to Primary Key collisions #183

Closed jamisonbryant closed 2 years ago

jamisonbryant commented 2 years ago

Hello, this is the issue I've been talking about in the CakePHP #support Slack recently. I'm going to do my best to type it up here and explain all my debugging methods I've attempted thus far, but I'm pretty much stumped.

Please let me know if any additional information is required.

The Problem

I get errors like these from 17 different tests:

CakephpFixtureFactories\Error\PersistenceException : Error in Factory App\Test\Factory\AccessTokenFactory.
 Message: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'yfwOHj3X2SGe4914XD721lrG90dGr0eBzWJRGPMH' for key 'PRIMARY' 

The Setup

AccessTokenFactory looks like this:

    protected function setDefaultTemplate(): void
    {
        $this->setDefaultData(function (Generator $faker) {
            /* for clarity, let's call this the "default data array" */
            return [
                'oauth_token' => $faker->regexify('[a-zA-Z0-9]{40}'),        // <-- this is the primary key
                'client_id' => $faker->regexify('[a-zA-Z0-9]{32}'),
//                'user_id' => $faker->uuid,
                'expires' => $faker->unixTime,
            ];
        });
    }

Notes:

An example test:

    public function testAuthenticateReusesExistingAccessTokenIfPresent()
    {
//        $client = ClientFactory::make()->with('Users.Roles')->persist();
        $client = $this->Clients->find()->contain(['Users'])->first();
        $creds = base64_encode(sprintf('%s:%s', $client->client_id, $client->client_secret));

        $time = FrozenTime::now()->addHour(3);
        $testToken = AccessTokenFactory::make([
            'oauth_token' => '4308be68821e7a952691de06395ea401ce368424',
            'client_id' => $client->client_id,
            'expires' => $time->format('U'),
        ])->withUsers($client->user)->persist();

       // ...

        $this->post('/users/authenticate');

       // ...
    }

Notes:

Example fixture:

class AccessTokensFixture extends TestFixture
{
    public $connection = 'my_test_connection';            // Important: all my tables are located on a non-standard connection

    /**
     * Init method
     *
     * @return void
     */
    public function init(): void
    {
        AccessTokenFactory::make(50)->with('Users.Roles')->persist();
        AccessTokenFactory::make(['expires' => FrozenTime::now()->addDay()->format('U')], 50)->with('Users.Roles')->persist();

        parent::init();
    }
}

Notes:

Other Notes

Expected Behavior

My entities are generated with unique primary keys every time, and all entities are dropped in between test runs (*caveat: I know some strategies clean up tables before or after, please take my meaning to be the correct one regardless of what I said)

jamisonbryant commented 2 years ago

TODO:

jamisonbryant commented 2 years ago

What has changed

I have dwindled down the debug test suite to the ~three~ two tests that, when run together, cause EITHER an integrity constraint violation in the primary key column OR an entity save failure:

Both of these test classes have been modified to use the TruncateDirtyTables trait as well as generate/persist their entities with the fixture factories (no more transactions, no more static fixtures). I've also updated to the latest minor version of the fixture factories plugin.

Here is the default data for the factory that is having the problem:

protected function setDefaultTemplate(): void
{
    $this->setDefaultData(function (Generator $faker) {
        $faker->seed();    // note: now seeding faker manually
        $token = $faker->regexify('[a-zA-Z0-9]{40}');
        echo sprintf("    - Generated token: \"%s\"\n", $token);

        return [
            'oauth_token' => $token,
            'client_id' => $faker->regexify('[a-zA-Z0-9]{32}'),
              'user_id' => $faker->uuid,
            'expires' => $faker->unixTime,
        ];
    });
}

Current error(s)

Two different exceptions get thrown by the broken test suite, in different quantities per test run.

I used this command to run the broken test suite 10 times, and then ran the command three times:

vendor/bin/phpunit --testdox --testsuite debug --repeat 10

Then applied some grep magic to the output to sort and count exceptions thrown, output was:

## InternalErrorException = my app, entity save failure; PersistenceException = this plugin, constraint violation
# 1st run
  34    │ Cake\Http\Exception\InternalErrorException
 106    │ CakephpFixtureFactories\Error\PersistenceException

# 2nd run
  30    │ Cake\Http\Exception\InternalErrorException
 110    │ CakephpFixtureFactories\Error\PersistenceException

# 3rd run
  26    │ Cake\Http\Exception\InternalErrorException
 114    │ CakephpFixtureFactories\Error\PersistenceException

# 4th run
  34    │ Cake\Http\Exception\InternalErrorException
 104    │ CakephpFixtureFactories\Error\PersistenceException
# (note: this run doesn't even add up to 140 total errors, like all the others)

Question/Comments

So, where I'm confused at the moment is why the exceptions would be happening in different quantities each for each run of the test suite when nothing at all is changing about the test suite in between the runs. It would make sense to me for the numbers to never change and be the same between runs.

jamisonbryant commented 2 years ago

Another lens into the problem (or at least, a related one) which shows that "when" the user saves can cause errors to occur in different tests over different runs:

Scenario: 4 different tests ~

Over 10 repeats of the test suite saw errors thrown in tests 1 and 3 sometimes, 2 and 3 sometimes, 3 only sometimes, and sometimes not at all. With zero change to the tests code in between.

The exact error is an InternalErrorException generated when the application is unable to create a particular entity (NOT a factory entity, a real one) because it has a dependency on a User entity (factory-made) that is supposed to be persisted at that time but is not:

// these are validation errors for an application-generated entity, not from a factory
// in this case it depends on a factory-generated User entity, which is supposed to be persisted by now
{
    "user_id": {
        "_existsIn": "The specified user ID does not exist in the Users table"
    }
}

So it seems like the tests can sometimes get ahead of ->persist(), or at least that is my takeaway.

jamisonbryant commented 2 years ago

It would also appear that seeding the generator in any of these manners (at the top of the default data for the factory):

$faker->seed(md5(microtime()));
// or
$faker->seed(microtime());
// or
$faker->seed();

does not resolve the issue either, as tests are still failing with an integrity constraint violation.

Interestingly, this simple sanity test:

public function testSanity()
{
    $entities = AccessTokenFactory::make(1000)->getEntities();
    $tokens = Hash::extract($entities, '{n}.oauth_token');
    $distinct = array_unique($tokens);
    $this->assertEquals(1000, count($distinct));
}

passes only when seeding in the final manner, i.e. passing no arguments to $faker->seed(). When seeding with a hashed microtime, you get about ~300 unique tokens, sometimes more. When seeding with just the microtime, you only get one (singular) unique token.

jamisonbryant commented 2 years ago

On a recommendation from someone I created a new homebrew random string generator function and my factory now calls this method instead of $faker->regexify(...).

This appears to resolve the integrity constraint violations at first glance, but I need to finish repairing my test suite fully (from all the changes trying to diagnose this issue) to say for certain.

I will update this comment with a final conclusion when it becomes available, but it appears as if the problem may actually lie with the FakerPHP library not being able to generate enough randomness for regexify(), even when seeded.

jamisonbryant commented 2 years ago

Closing as it appears there is a strong possibility the defect is in the Faker generator (though what it could be I have been unable to tell, as the results of test runs are so random and irreproducible). The fix, in this case, was to homebrew a random string generator and use that instead of $faker->regexify(). Thanks, all.