fzaninotto / Faker

Faker is a PHP library that generates fake data for you
MIT License

Doctrine populator with generated Primary Key has Id collisions (batch size of 1) #2003

Closed ScottA38 closed 4 years ago

ScottA38 commented 4 years ago

Summary

I am currently writing tests that seed random data into entities and persist them using the Populator. The ID for my Doctrine objects has the form: [first 3 consonants of the name property][1 + the number of existing entities with the same name][first 2 consonants of the entity class name]

For instance, in my Book class, with a name of 'Harry Potter', I would have: HRR1BK
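The format described above can be sketched as a standalone function. This is only an illustration: `firstConsonants` and `buildSku` are hypothetical names, and stripping every non-letter character before picking consonants is an assumption rather than the exact rule used by the generator further down.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: the first $length letters of $input that are not vowels, upper-cased.
 */
function firstConsonants(string $input, int $length): string
{
    // Assumption: keep only ASCII letters, then drop the vowels.
    $letters = preg_replace('/[^A-Z]/', '', strtoupper($input));
    $consonants = preg_replace('/[AEIOU]/', '', $letters);
    if (strlen($consonants) < $length) {
        throw new LengthException("Too few consonants in '$input'");
    }
    return substr($consonants, 0, $length);
}

/**
 * Hypothetical helper: assemble the SKU from its three parts.
 */
function buildSku(string $name, int $existingCollisions, string $shortClassName): string
{
    return firstConsonants($name, 3) . ($existingCollisions + 1) . firstConsonants($shortClassName, 2);
}

echo buildSku('Harry Potter', 0, 'Book'), PHP_EOL; // HRR1BK
```

With one existing collision the same call would yield HRR2BK.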

Versions

Component         Version
PHP               7.4.4
fzaninotto/faker  1.9.1

Self-enclosed code snippet for reproduction

//Main.php
$em = EntityManager::create($dbParams, $config);

$populator = new EntityPopulator($em);
$result = $populator->populate(Furniture::class, 10);
var_dump($result);

N.B: look at 'populate' in this 'EntityPopulator' wrapper class - please ignore things such as the foreach loop

<?php

declare(strict_types=1);

namespace WebApp\Util;

use Doctrine\ORM\EntityManager;
use Faker\ORM\Doctrine\Populator;
use Faker\Factory;
use WebApp\Models\Product;
use Faker\Generator;

class EntityPopulator
{
    private EntityManager $em;
    private Generator $generator;

    public function __construct(EntityManager $em)
    {
        $this->em = $em;
        $this->generator = Factory::create();
    }

    private function getspecialInstructions()
    {
        $nameReducer = function ($carry, $lChar) {
            if (!in_array($lChar, ["a", "e", "i", "o", "u"])) {
                $carry++;
            }
            return $carry;
        };
        return [
            'price' => function () {
                return $this->generator->randomFloat(2, 0, 10000);
            },
            'name' => function () use ($nameReducer) {
                $name = $this->generator->unique()->company;
                // Start the count at 0 so the reducer never returns null.
                while (array_reduce(str_split(strtolower($name)), $nameReducer, 0) < 3) {
                    $name = $this->generator->unique()->company;
                }
                return $name;
            },
            'dimensions' => function () {
                return [
                    $this->generator->randomNumber(3),
                    $this->generator->randomNumber(3),
                    $this->generator->randomNumber(3)
                ];
            }
        ];
    }

    public function getEntityManager(): EntityManager
    {
        return $this->em;
    }

    /**
     * Populate the database with a given number of entities of the given class. Returns the PK of each populated element
     * @param string $className
     * @param int $amount
     * @return array
     */
    public function populate(string $className, int $amount)
    {
        $metadata = $this->em->getClassMetadata($className);
        $fieldNames = array_keys($metadata->fieldMappings);
        $specialInstructions = $this->getspecialInstructions();
        $instructionKeys = array_keys($specialInstructions);
        foreach ($instructionKeys as $instructionKey) {
            if (!in_array($instructionKey, $fieldNames)) {
                unset($specialInstructions[$instructionKey]);
            }
        }
        $populator = new Populator($this->generator, $this->em, 1);
        $populator->addEntity($className, $amount, $specialInstructions);
        return $populator->execute();
    }

    /**
     * Cheeky helper function that should be implemented in a concrete class controller
     * @param array $entities
     */
    public function removeAllEntities(array $entities)
    {
        foreach ($entities as $ent) {
            $this->em->remove($ent);
        }
        $this->em->flush();
    }
}

My Doctrine SKU generator:

<?php

declare(strict_types=1);

namespace WebApp;

use Doctrine\ORM\Id\AbstractIdGenerator;
use WebApp\Models\Product;
use Doctrine\ORM\EntityManager;

class SkuGenerator extends AbstractIdGenerator
{
    /**
     * Generate an SKU property
     * {@inheritDoc}
     * @see \Doctrine\ORM\Id\AbstractIdGenerator::generate()
     * @param EntityManager $em
     * @param mixed $entity the entity to generate an SKU for; must extend Product
     */
    public function generate(EntityManager $em, $entity): string
    {
        //sanity-checking the parameter
        assert(is_subclass_of($entity, 'WebApp\Models\Product'));

        $baseClassPath = explode("\\", get_class($entity));
        $className = end($baseClassPath);

        $categoryInitialism = SkuGenerator::initialismGenerator($className, 2);
        $nameInitialism = SkuGenerator::initialismGenerator($entity->getName(), 3);

        $num = count($em->getRepository(get_class($entity))->findBy(['name' => $entity->getName()])) + 1;

        return $nameInitialism . $num . $categoryInitialism;
    }

    /**
     * Returns the first `$length` consonants of a string, in uppercase
     * @param string $input
     * @param int $length
     * @return string
     * @throws \LengthException if the input has fewer than `$length` consonants
     */
    public static function initialismGenerator(string $input, int $length)
    {
        $out = "";
        $upper = strtoupper($input);
        $upper_array = str_split($upper);
        for ($i = 0; $i < count($upper_array); $i++) {
            if (!in_array($upper_array[$i], array("A", "E", "I", "O", "U", " ", "-"))) {
                $out .= $upper_array[$i];
            }
            if (strlen($out) >= $length) {
                return $out;
            }
        }
        throw new \LengthException('An initialism cannot be created from ' . $input . ', too few consonants');
    }
}

Expected output

A list of persisted entities, with the number in the SKU incremented wherever a generated SKU would otherwise collide (some runs produce no SKU collision at all)

Actual output

PHP Fatal error:  Uncaught PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'SCH1FR' for key 'Furniture.PRIMARY' in ... [file_path]

When I persist and flush entities manually (as in a previous version of my 'Main.php' file shown above), incrementing the SKU number when there would be a primary-key collision works fine. This leads me to believe that specifying a 'batchSize' parameter of 1 in the Populator constructor is not causing a flush to the database after every entity.
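The suspicion about flushing can be illustrated with a toy in-memory store. This is a sketch under stated assumptions, not Doctrine's or Faker's actual behaviour: `ToyStore` and its methods are made up, and it simply assumes that a count query only sees rows that have already been flushed.

```php
<?php

declare(strict_types=1);

// Toy stand-in for a database table plus a unit of work; not Doctrine's API.
final class ToyStore
{
    /** @var string[] names already written to the "database" */
    private array $flushed = [];

    /** @var string[] names persisted but not yet written */
    private array $pending = [];

    public function countByName(string $name): int
    {
        // Like a SELECT, this only sees flushed rows.
        return count(array_keys($this->flushed, $name, true));
    }

    public function persist(string $name): int
    {
        // The number is computed at persist time, before any flush.
        $number = $this->countByName($name) + 1;
        $this->pending[] = $name;
        return $number;
    }

    public function flush(): void
    {
        $this->flushed = array_merge($this->flushed, $this->pending);
        $this->pending = [];
    }
}

// Flush after every entity: the second "Oak Table" sees the first one.
$perEntity = new ToyStore();
$a = $perEntity->persist('Oak Table');
$perEntity->flush();
$b = $perEntity->persist('Oak Table');
$perEntity->flush();

// One flush at the end: both entities compute the same number.
$batched = new ToyStore();
$c = $batched->persist('Oak Table');
$d = $batched->persist('Oak Table');
$batched->flush();

printf("per-entity flush: %d, %d\n", $a, $b);
printf("single flush:     %d, %d\n", $c, $d);
```

If the Populator really did defer its flushes, every entity with the same name would be numbered 1, which is the failure mode suspected here.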

ScottA38 commented 4 years ago

EDIT: I need to amend this issue - I did not read the parameters of [Populator]->addEntity() carefully: the final parameter is $generateId.

When using this parameter I get a further error:

PHP Fatal error:  Uncaught Error: Call to a member function toArray() on array in [base_path]/shopping_app_test/vendor/fzaninotto/faker/src/Faker/ORM/Doctrine/EntityPopulator.php:242

It is safe to say I have no idea what this function is doing, but the query within it returns an empty array. I already forked the repository, but the composer.json doesn't include Doctrine, so I don't know how to proceed with altering it.

Just to reiterate, when interfacing directly with the EntityManager my sku generation works absolutely fine.

ScottA38 commented 4 years ago

Sorry to add this issue without doing the proper digging:

my sku generation works absolutely fine

...was absolutely untrue - I was checking a non-primary-key field (the name) for collisions within the repository, rather than the SKU itself.
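A minimal sketch of why counting by a non-primary-key field fails: two different names can share their first three consonants, so counting rows with the same name never detects the clash. The function names, the in-memory `$rows` array and the two company names are hypothetical; the second function is one possible alternative, not a fix taken from this thread.

```php
<?php

declare(strict_types=1);

// Hypothetical helper mirroring the initialism rule above: skip vowels, spaces and hyphens.
function initialism(string $input, int $length): string
{
    $out = '';
    foreach (str_split(strtoupper($input)) as $char) {
        if (!in_array($char, ['A', 'E', 'I', 'O', 'U', ' ', '-'], true)) {
            $out .= $char;
        }
        if (strlen($out) >= $length) {
            return $out;
        }
    }
    throw new LengthException("Too few consonants in '$input'");
}

/** Numbers the SKU by how many stored rows share the same name. */
function skuCountingNames(array $rows, string $name, string $class): string
{
    $sameName = array_filter($rows, fn (array $row): bool => $row['name'] === $name);
    return initialism($name, 3) . (count($sameName) + 1) . initialism($class, 2);
}

/** Numbers the SKU by how many stored SKUs share the same prefix and suffix. */
function skuCountingSkus(array $rows, string $name, string $class): string
{
    $prefix = initialism($name, 3);
    $suffix = initialism($class, 2);
    $pattern = '/^' . preg_quote($prefix, '/') . '\d+' . preg_quote($suffix, '/') . '$/';
    $sameShape = array_filter($rows, fn (array $row): bool => preg_match($pattern, $row['sku']) === 1);
    return $prefix . (count($sameShape) + 1) . $suffix;
}

// Two made-up company names that differ but share their first three consonants.
$rows = [['name' => 'Schmidt Inc', 'sku' => 'SCH1FR']];

echo skuCountingNames($rows, 'Schultz LLC', 'Furniture'), PHP_EOL; // SCH1FR (collides)
echo skuCountingSkus($rows, 'Schultz LLC', 'Furniture'), PHP_EOL;  // SCH2FR
```

Counting matching SKUs is still only a sketch: it ignores deleted rows and concurrent inserts, either of which could reuse a number.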