cakephp / phinx

PHP Database Migrations for Everyone
https://phinx.org
MIT License

Is there any way to run seeds just once per class? #1922

Open Marcin-TA opened 4 years ago

Marcin-TA commented 4 years ago

Hi,

I noticed that when I run seeds, Phinx doesn't check whether they have already been run. It just runs them again and again, n times. Is there any reason why Phinx does this? If I want to add, e.g., a new record to my table, I would like it added just once, not multiple times. This works in exactly the opposite way to migrations.

dereuromark commented 4 years ago

Seeding is demo data import IMO. As such, also providing a "migration history" for seeds usually seems like overkill.

See https://github.com/cakephp/phinx/issues/981

dereuromark commented 4 years ago

I think similar points have been raised before, and there is some merit to it. Having the data versioned more easily, in step with the structure at that point in time, would sure help. Right now it is always a snapshot of demo data for the very last state of the migrations, and you can easily forget to update it when adding migrations. At some point the seeds might stop working.

As for the duplication: I think there is an easy way to solve this with the current way fixtures work:

Besides the existing hasTable() method, we could provide hasData() and based on that you could return early:

    public function run()
    {
        if ($this->hasData()) {
            return; // Records > 0 found in DB, demo data available
        }

        ...

Or with specific data maybe as well:

    public function run()
    {
        if ($this->hasData(['key' => 'my-key'])) {
            return; // Records > 0 found in DB for this key, this demo data available
        }

        ...

If we didn't want such specific methods, we could also add an interface such as OnlyOnceInterface; seeds implementing it would then check internally whether demo data already exists in their tables and skip if so.
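Editorial sketch: hasData() is only a proposal in this thread, not an existing Phinx method. To make the idea concrete, the following plain-PDO helper shows what such a check could look like. The function name, the `settings` table and the `slug` column are all assumptions for illustration; inside a real seed the same SELECT could be issued through the seed's own query helpers instead of a raw PDO handle.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Phinx): returns true if $table contains
 * at least one row matching all $conditions (column => value).
 * $table and the column names are interpolated unescaped, so they must be
 * trusted identifiers. LIMIT 1 is not portable to every database.
 */
function hasData(PDO $pdo, string $table, array $conditions = []): bool
{
    $sql = "SELECT 1 FROM {$table}";
    if ($conditions !== []) {
        $clauses = [];
        foreach (array_keys($conditions) as $column) {
            $clauses[] = "{$column} = ?";
        }
        $sql .= ' WHERE ' . implode(' AND ', $clauses);
    }
    $sql .= ' LIMIT 1';

    $stmt = $pdo->prepare($sql);
    $stmt->execute(array_values($conditions));

    // fetchColumn() returns false when no row matched.
    return $stmt->fetchColumn() !== false;
}

// Usage against an in-memory SQLite database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE settings (slug TEXT PRIMARY KEY, value TEXT)');

$emptyBefore = hasData($pdo, 'settings');
$pdo->exec("INSERT INTO settings (slug, value) VALUES ('my-key', 'demo')");
$hasKey = hasData($pdo, 'settings', ['slug' => 'my-key']);
$hasOther = hasData($pdo, 'settings', ['slug' => 'other']);
```

A seed's run() would call such a check first and return early when it yields true, which is the pattern the snippets above describe.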

What do people think?

Marcin-TA commented 4 years ago

I would keep the same logic that migrations have. So if the seeds were run, then store a record in the phinxlog table, but with the seed file name and a DateTime. That's all.
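Editorial sketch: the log-table idea can be illustrated outside Phinx with plain PDO. Nothing below is Phinx behavior; the `seed_log` table, its columns and the runSeedOnce() function are hypothetical names chosen for this example.

```php
<?php
declare(strict_types=1);

/**
 * Runs $seed only if $seedName has no entry in the hypothetical seed_log table.
 * Returns true if the seed ran, false if it was skipped.
 */
function runSeedOnce(PDO $pdo, string $seedName, callable $seed): bool
{
    $pdo->exec(
        'CREATE TABLE IF NOT EXISTS seed_log (
            seed_name VARCHAR(255) PRIMARY KEY,
            executed_at VARCHAR(32) NOT NULL
        )'
    );

    $check = $pdo->prepare('SELECT 1 FROM seed_log WHERE seed_name = ?');
    $check->execute([$seedName]);
    if ($check->fetchColumn() !== false) {
        return false; // already recorded, skip
    }

    // Write the data and the log entry in one transaction, so that on a
    // transactional engine a failure leaves neither behind.
    $pdo->beginTransaction();
    try {
        $seed($pdo);
        $log = $pdo->prepare('INSERT INTO seed_log (seed_name, executed_at) VALUES (?, ?)');
        $log->execute([$seedName, date('Y-m-d H:i:s')]);
        $pdo->commit();
    } catch (Throwable $e) {
        $pdo->rollBack();
        throw $e;
    }

    return true;
}

// Usage against an in-memory SQLite database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

$seed = function (PDO $pdo): void {
    $pdo->exec("INSERT INTO users (name) VALUES ('admin')");
};

$first = runSeedOnce($pdo, 'UserSeeder', $seed);
$second = runSeedOnce($pdo, 'UserSeeder', $seed);
$count = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
```

Note that a log like this only records that a seed ran; it does not verify that the seeded rows still exist.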

Marcin-TA commented 4 years ago

TBH, it doesn't really look like a real solution. I want to use Phinx for migrations and seeds on staging and production servers, which means I will have n migration and seed files. I don't want to check these conditions every time I run seeds, especially if there are multiple seed files.

Marcin-TA commented 4 years ago

Also, is there any way to rollback seeds?

dereuromark commented 4 years ago

As I said: The seeds as they are were never meant to have this functionality. I just proposed a way to make those demo data seeds a bit less annoying to use.

What you want right now is, as was said before, to combine the data into the actual migrations. You can do all of that right now by using the migrations themselves. I also sometimes do that when I need to migrate data, just executing queries directly along with the structure changes. Even raw SQL is possible:

    public function up() {
        $content = file_get_contents(dirname(__FILE__) . '/' . '20150117000256_sql.sql');
        $this->query($content);
    }

For rollbacks you just need the opposite available.
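Editorial sketch: "the opposite" can be shown with a small stand-in class that mirrors the up()/down() shape of a migration. This is plain PDO, not Phinx's base class; the class name, the `roles` table and its columns are assumptions made for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a migration; it only mirrors the up()/down() shape.
final class AddDefaultRoles
{
    // Insert the data that ships with this schema version.
    public function up(PDO $pdo): void
    {
        $stmt = $pdo->prepare('INSERT INTO roles (slug, label) VALUES (?, ?)');
        $stmt->execute(['admin', 'Administrator']);
    }

    // The opposite: remove exactly the rows up() inserted,
    // identified by a stable key rather than an auto-increment id.
    public function down(PDO $pdo): void
    {
        $stmt = $pdo->prepare('DELETE FROM roles WHERE slug = ?');
        $stmt->execute(['admin']);
    }
}

// Usage against an in-memory SQLite database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE roles (id INTEGER PRIMARY KEY, slug TEXT UNIQUE, label TEXT)');

$migration = new AddDefaultRoles();
$migration->up($pdo);
$afterUp = (int) $pdo->query('SELECT COUNT(*) FROM roles')->fetchColumn();
$migration->down($pdo);
$afterDown = (int) $pdo->query('SELECT COUNT(*) FROM roles')->fetchColumn();
```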

nook24 commented 1 year ago

I think it would be very useful if Seeds could not only insert data, but also update existing records.

In my case we do not use Seeds to insert any demo data; we use them to add initial records to the database. We had to add logic to our Seeds to make sure all expected records exist, either by updating existing records or by creating new ones.

We are running automated updates on more than 600 instances using this script. It is not possible to do this with a workaround such as

    $content = file_get_contents(dirname(__FILE__) . '/' . '20150117000256_sql.sql');

This would cause all sorts of issues, starting with different auto increment values when you have to insert records with relations.
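Editorial sketch: one way to make a seed both insert missing records and update existing ones is to look rows up by a stable natural key instead of the auto-increment id. The helper below uses plain PDO with a select-then-write approach; upsertByKey(), the `roles` table and the `slug` key are hypothetical and are not taken from the linked seed or from Phinx.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: inserts $row, or updates the existing row that has the
 * same value in $keyColumn. Returns 'inserted', 'updated' or 'unchanged'.
 * Identifiers are interpolated unescaped and must be trusted.
 */
function upsertByKey(PDO $pdo, string $table, string $keyColumn, array $row): string
{
    $find = $pdo->prepare("SELECT 1 FROM {$table} WHERE {$keyColumn} = ?");
    $find->execute([$row[$keyColumn]]);

    if ($find->fetchColumn() === false) {
        $columns = array_keys($row);
        $placeholders = implode(', ', array_fill(0, count($columns), '?'));
        $sql = "INSERT INTO {$table} (" . implode(', ', $columns) . ") VALUES ({$placeholders})";
        $pdo->prepare($sql)->execute(array_values($row));

        return 'inserted';
    }

    // Update every column except the key itself.
    $columns = array_values(array_diff(array_keys($row), [$keyColumn]));
    if ($columns === []) {
        return 'unchanged';
    }
    $assignments = implode(', ', array_map(fn (string $c): string => "{$c} = ?", $columns));
    $values = array_map(fn (string $c) => $row[$c], $columns);
    $values[] = $row[$keyColumn];
    $pdo->prepare("UPDATE {$table} SET {$assignments} WHERE {$keyColumn} = ?")->execute($values);

    return 'updated';
}

// Usage against an in-memory SQLite database: running the seed twice is safe.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE roles (id INTEGER PRIMARY KEY, slug TEXT UNIQUE, label TEXT)');

$r1 = upsertByKey($pdo, 'roles', 'slug', ['slug' => 'admin', 'label' => 'Administrator']);
$r2 = upsertByKey($pdo, 'roles', 'slug', ['slug' => 'admin', 'label' => 'Admin']);
$count = (int) $pdo->query('SELECT COUNT(*) FROM roles')->fetchColumn();
$label = $pdo->query("SELECT label FROM roles WHERE slug = 'admin'")->fetchColumn();
```

Because related rows can be resolved through the same natural key, the differing auto-increment values across instances stop mattering. The select-then-write step is not atomic, so concurrent runs would need a transaction or a database-native upsert.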

"So if the seeds were run, then store a record in the phinxlog table, but with the seed file name and a DateTime. That's all."

I don't like this idea because it cannot ensure that an expected record really exists. This method works fine for the database schema, as a user (normally) cannot modify it, but it will not work for the data itself.

My use case is probably an edge case. I do not expect that CakePHP or Phinx will implement this in the near future, but a more feature-complete seeding method would be nice.

This is one of my Seeds, if someone is interested in this: https://github.com/it-novum/openITCOCKPIT/blob/development/config/Seeds/InstallSeed.php

"Seeders are dumb by design and are supposed to run repeatably."

Originally posted by @robmorgan in https://github.com/cakephp/phinx/issues/981#issuecomment-260929344

Sort of. They cannot run repeatedly, as they will throw a "Duplicate entry 'n' for key 'PRIMARY'" error.

Just my two cents :)