doctrine / migrations

Doctrine Database Migrations Library
https://www.doctrine-project.org/projects/migrations.html
MIT License
4.68k stars 388 forks source link

Migrations always generated for custom type with DBAL 4 #1441

Open michnovka opened 4 months ago

michnovka commented 4 months ago

Bug Report

Q A
BC Break no
Migrations version 3.8.0
Migrations bundle 3.3.1
ORM 3.2.1
DBAL 4.0.4
Symfony 7.1.2

Summary

I have issue with DBAL 4 and custom type, the migrations keep getting generated again and again. Furthermore, the down migration looks totally bogus. This is possibly related to #1435

I know that DBAL 4 dropped requiresSqlHint (in https://github.com/doctrine/dbal/pull/5107 , afterwards some issues were found and fixed - https://github.com/doctrine/dbal/issues/6257 )

So when I am using custom type, I expect the first migration diff to drop the DC2Type comments. However my tables have these fields already dropped and yet the migration is being generated.

enum ActionType: string
{
    case SUBMIT = 'submit';
    case CANCEL = 'cancel';
}

class ActionTypeType extends Type
{
    public function getSQLDeclaration(array $column, AbstractPlatform $platform): string
    {
        $class = ActionType::class;

        $values = [];

        foreach ($class::cases() as $val) {
            $values[] = "'{$val->value}'";
        }

        return "enum(" . implode(", ", $values) . ")";
    }

    public function convertToDatabaseValue($value, AbstractPlatform $platform): mixed
    {
        $class = ActionType::class;
        if ($value !== null && !($value instanceof BackedEnum)) {
            $value = $class::tryFrom($value);
        }else{
            return null;
        }

        return $value->value;
    }

    public function convertToPHPValue($value, AbstractPlatform $platform): ?BackedEnum
    {
        if ((!is_int($value)) && !is_string($value)) {
            return null;
        }

        return $class::tryFrom($value);
    }
}

I then have my entity as

#[ORM\Entity(repositoryClass: ActionRepository::class)]
#[ORM\Table(name: 'actions')]
class Action
{
    #[ORM\Id]
    private string $name;

    #[ORM\Column(type: ActionType::class, nullable: false)]
    private ActionType $type;
}

and in Kernel.php I set up type mapping inside Kernel::process():

$typesDefinition[ActionType::class] = ['class' => ActionTypeType::class];
$container->setParameter('doctrine.dbal.connection_factory.types', $typesDefinition);

Now I know that the types are assigned correctly, as migrations generate up like this:

    public function up(Schema $schema): void
    {
        $this->addSql('ALTER TABLE actions CHANGE type type enum(\'submit\', \'cancel\') NOT NULL');
    }

but it is generated ALWAYS.

and the down() migration looks even weirder:

    public function down(Schema $schema): void
    {
        $this->addSql('ALTER TABLE actions CHANGE type type VARCHAR(0) NOT NULL');
    }

Everything works fine with DBAL 3, which uses SQL comments

michnovka commented 4 months ago

@greg0ire I am tagging you as you authored some of the relevant changes and I know you have deep insight into removing comments in DBAL 4.

I was digging into the code, the issue lies here:

So Doctrine\Migrations\Generator\DiffGenerator::generate() calls createFromSchema to do comparison of old vs new schema. This calls Doctrine\DBAL\Schema\MySQLSchemaManager (which inherits AbstractSchemaManager) and specifically it fails on function _getPortableTableColumnDefinition(). Where the array $tableColumn which this function gets is

 array:11 [
  "table_name" => "actions"
  "field" => "type"
  "type" => "enum('submit','cancel')"
  "null" => "NO"
  "key" => ""
  "default" => null
  "extra" => ""
  "comment" => ""
  "characterset" => "utf8mb4"
  "collation" => "utf8mb4_general_ci"
  "name" => ""
]

This function then calls $type = $this->platform->getDoctrineTypeMapping($dbType) and since enum, or whatever custom type is not recognized, it defaults to StringType.

Now check DBAL 3 code: https://github.com/doctrine/dbal/blob/893417fee2bc5a94a10a2010ae83cab927e21df3/src/Schema/MySQLSchemaManager.php#L189-L192

This uses the comment to deduce the proper type and override whatever type is in DB. Without this now, DBAL considers any unknown DB type to be StringType and I have no way how to tell it its not a string type. Hence the migration will always consider old schema column to be string.

Now the new schema uses the DiffGenerator::createToSchema() which does not use Doctrine\DBAL\Schema\AbstractSchemaManager but instead Doctrine\Migrations\Provider\OrmSchemaProvider and this uses under the hood EntityManager which of course identifies the column type properly as ActionTypeType or whatever custom type you use.

So, since we removed comments in DBAL 4, how can the fromSchema be generated properly? This uses only DB values, it does not care about type definition in Entity code (thats relevant for toSchema only).

greg0ire commented 4 months ago

Please upgrade the ORM to 3.8.5: https://github.com/doctrine/dbal/pull/6423

EDIT: I shouldn't answer so hastily

michnovka commented 4 months ago

@greg0ire I am using DBAL 4.0.4 which is based on 3.8.6 that includes the linked fix. ORM is at 3.2.1 and there is no newer version. And the issue is ONLY with DBAL 4 not DBAL 3

berkut1 commented 4 months ago

@michnovka It seems that when Doctrine reads the schema of your database, the getSQLDeclaration function is not being called, which is why it cannot recognize the type. There might be a bug in https://github.com/doctrine/dbal/blob/4.0.x/src/Platforms/AbstractMySQLPlatform.php

~~Try debugging this function to determine at which stage the type is being lost: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Platforms/AbstractPlatform.php#L2190~~ Probably, this is a mistaken assumption. Here, two schemas columns are being compared. If initially it was read as StringType, then the error is somewhere earlier.

michnovka commented 4 months ago

@berkut1 this is indeed a mistaken assumption imo, as getSQLDeclaration not being called is a consequence of not having the proper type. The type has to be known in order for that function to be called. And it is called, just for the new schema.

The issue is that oldSchema is created based on database ONLY. It ignores PHP code. And thats good, as the php code defines the new schema, if it took the PHP code as a hint of the type, then migration has no point, as it would be the same type always.

I honestly see no other way than to somehow store in DB the name of the custom type which was used. Which is exactly what the DB comment was used for. WDYT?

berkut1 commented 4 months ago

@michnovka No, DC2Type comments have been useless since around DBAL 3.x. Doctrine can determine custom types on its own if they are properly registered. You need to find out why, in your case, this is not happening for Enum and where exactly the error is in the code.

michnovka commented 4 months ago

@berkut1 how are they useless when even in 3.9.x there is this code:

https://github.com/doctrine/dbal/blob/893417fee2bc5a94a10a2010ae83cab927e21df3/src/Schema/MySQLSchemaManager.php#L189-L192 ?

And this is where it fails, this is where in 3.8.x version it assigned properly custom type, but in 4.0.x it assigns StringType (as the code assigning type based on comment is missing)

And again, think about what I observed above - the oldSchema is created based on DB data ONLY. not PHP. Only DB. So if DB has no indication about the type, how can it be properly determined for nonstandard types? newSchema uses PHP code to determine types. And thats good.

Doctrine can determine custom types on its own if they are properly registered.

This has no sense for migrations. If I had customType1 field in table, and I changed it to customType2 field, then did migration, how would doctrine deduce it was before customType1? It is no longer defined and registered. PHP code always reflects only the newSchema.

berkut1 commented 4 months ago

DBAL 3.x should have backward compatibility, that why it can works with DC2Types and without. there was added this https://github.com/doctrine/dbal/blob/893417fee2bc5a94a10a2010ae83cab927e21df3/src/Platforms/AbstractPlatform.php#L108 that people can smoothly migrate from DBAL3 to DBAL4

michnovka commented 4 months ago

With DBAL 3.8.X and doctrine.dbal.disable_type_comments: true the issue is also present.

And of course it is. Read again where and why it happens, I explained exactly where the issue lies. oldSchema is created based on DB details only. If we dont use comments (either as they are removed in DBAL 4, or if we disable them manually in DBAL 3) then oldSchema will have no idea what type was used before.

Thanks for your interest and will to help.

berkut1 commented 4 months ago

Here, https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/MySQLSchemaManager.php#L135 and https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/MySQLSchemaManager.php#L218 Doctrine is trying to map your registered custom types to your database. What happens when it tries to map the Enum type in your case?

UPD: You also can check if your type registered, here: image

michnovka commented 4 months ago

https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/MySQLSchemaManager.php#L135

dump($type, $dbType);

gives

1 "string"
2 "enum"

and

https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/MySQLSchemaManager.php#L218

dump($column);

gives

Doctrine\DBAL\Schema\Column {#926
#_name: "type"
#_namespace: null
#_quoted: false
#_type: Doctrine\DBAL\Types\StringType {#636}
#_length: 0
#_precision: null
#_scale: 0
#_unsigned: false
#_fixed: false
#_notnull: true
#_default: null
#_autoincrement: false
#_platformOptions: []
#_columnDefinition: null
#_comment: ""
}

UPD: I checked and the typeRegistry does contain my custom type ActionTypeType

greg0ire commented 4 months ago

@michnovka it's unclear to me whether you know what platform-aware comparison is. In case you don't, since DBAL 3.2.0, platform-aware comparison allows to compare the generated SQL instead of comparing schemas. This means that 2 different schemas could result in an empty diff, if they result in the same SQL being generated. More on this here: https://www.doctrine-project.org/2021/11/26/dbal-3.2.0.html

Sorry if you already know this, it's genuinely hard to tell for me.

michnovka commented 4 months ago

@greg0ire yes, I get this.

The problem is that when the comparator tries to compare the schemas, it is working with oldSchema and newSchema. oldSchema is Schema instance which is created and populated with data using Doctrine\DBAL\Schema\MySQLSchemaManager and it considers my column to be StringType (cos DB column definition is enum(...) which it doesnt understand, so it says, its StringType - this can be seen in my debug output in comment https://github.com/doctrine/migrations/issues/1441#issuecomment-2207273101).

This means that when it tries to compare SQL generated between old and new schemas, it calls StringType::getSQLDeclaration output instead of the ActionTypeType::getSQLDeclaration for the old schema.

And this is expected. As if in DB there is no info about which custom type the column is, how can the MySQLSchemaManager assume its ActionTypeType and then gets its SQL from ActionTypeType::getSQLDeclaration to compare with new schema.

Now the new schema is another Schema instance, but this one is created using Doctrine\Migrations\Provider\OrmSchemaProvider and this calls internally EntityManager and that uses PHP code and properly considers the column ActionTypeType. Then new schema's SQL is the proper one, while the old ones is nonsense, as it thinks its a StringType.

greg0ire commented 4 months ago

Ah I get it now, thanks for explaining it again. I think this means the issue is that the DBAL does not understand enum(…), and cannot find a type that will produce the same DDL. I don't see a way around that. I now better get the point of https://github.com/doctrine/dbal/pull/6444, which I now realize was about the same thing.

michnovka commented 4 months ago

yes, it all comes to a simple problem - the oldSchema is generated using ONLY DB introspection. And if DB contains no info about which type was used, then it cannot call proper getSQLDeclaration function.

And no, the problem does not simply limit to enum. I can make an example with another custom type which will have the same issue without enums. ALL custom types have this issue in fact, i.e. all custom types with custom getSQLDeclaration which differs from those provided by DBAL package.

berkut1 commented 4 months ago

Here, Doctrine is reading your database. Does it also read the type as a string here? https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/AbstractSchemaManager.php#L195

leads to https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/MySQLSchemaManager.php#L344

greg0ire commented 4 months ago

This looks related: https://github.com/doctrine/dbal/issues/5308#issuecomment-1058391601

michnovka commented 4 months ago

Here, Doctrine is reading your database. Does it also read the type as a string here? https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/AbstractSchemaManager.php#L195

No, here it fetches it from DB like this:

   1 => array:10 [
      "TABLE_NAME" => "actions"
      "field" => "type"
      "type" => "enum('submit','cancel')"
      "null" => "NO"
      "key" => ""
      "default" => null
      "EXTRA" => ""
      "comment" => ""
      "characterset" => "utf8mb4"
      "collation" => "utf8mb4_general_ci"
    ]

But this is just one step before _getPortableTableColumnDefinition is called which is where the $this->platform->getDoctrineTypeMapping($dbType); is called, and which we already addressed and debugged in https://github.com/doctrine/migrations/issues/1441#issuecomment-2207273101

And @greg0ire this is where the issue lies: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/MySQLSchemaManager.php#L218

        $column = new Column($tableColumn['field'], Type::getType($type), $options);

specifically the Type::getType($type). Without comments, there is no way at all for the old type to be "guessed" correctly. If I use custom type which has SQL declaration always as int(3), (lets call it CustomInt3Type), then the Column instance will always be deduced as IntegerType and for comparison IntegerType::getSQLDeclaration will be called to be compared against newSchema's CustomInt3Type::getSQLDeclaration. There is just no way for it to know that this int(3) is actually CustomInt3Type and not IntegerType

berkut1 commented 4 months ago

Ah, I understand now. It calls Type::getType with enum and cannot find such a type because it doesn't exist. In Symfony, you can work around this by registering another type to let Doctrine know about it: https://symfony.com/doc/current/doctrine/dbal.html#registering-custom-mapping-types-in-the-schematool (this instruction correct for DBAL3) In your case, for DBAL 4 with Symfony, you need to do it like this:

        mapping_types:
            enum: ancestor_enum 
        types:
            ancestor_enum : 'App\Model\AncestorEnumType' 

Unfortunately, I can't help with how to do this without Symfony.

UPD: In the example, I specified the type as ancestor, assuming that you can register it in the mapping (mapping_types) only once. For all your other Enums, it will be enough to inherit from it and register only as a custom type (types). Otherwise, you would have to duplicate entries for all Enums in mapping_types.

michnovka commented 4 months ago

@greg0ire as a temporary workaround, how can I tell migration to ignore certain columns? Thanks!

greg0ire commented 4 months ago

Maybe you can use asset filtering, but I don't think it works at the column level.

michnovka commented 4 months ago

So the issue is this:

  1. we get table columns in raw format from DB using DBAL\Schema\AbstractSchemaManager::fetchTableColumnsByTable() This is an array like
    1 => array:10 [
      "TABLE_NAME" => "actions"
      "field" => "type"
      "type" => "enum('submit','cancel')"
      "null" => "NO"
      "key" => ""
      "default" => null
      "EXTRA" => ""
      "comment" => ""
      "characterset" => "utf8mb4"
      "collation" => "utf8mb4_general_ci"
    ]
  2. This is converted to DBAL\Schema\Column instance, which has incorrect type (as DBAL has no idea what custom type is to be used without comments, so it guesses best match from default types)
  3. then Column::getType()::getSQLDeclaration() is called to obtain incorrect SQL

And on the new schema, we get correct type for column and call its getSQLDeclaration()

These do not match.


Proposed solution:

Why not leverage Column::$_columnDefinition for the old schema? Lets populate this field inside AbstractSchemaManager::_getPortableTableColumnDefinition.

I dont understand why AbstractPlatform::columnsEqual() unsets this. But if we dont unset it and specify it explicitly for old schema, the comparison would be done properly

@greg0ire WDYT?

greg0ire commented 4 months ago

Sounds good, feel free to give it a try, and see if anything breaks 👍

michnovka commented 4 months ago

I tried to go this way and I am afraid this would create tons of unexpected issues. There are specific test cases that ensure that

Schema diff is empty, since only columnDefinition changed from null (not detected) to a defined one

AbstractComparatorTestCase::testWillNotProduceSchemaDiffOnTableWithAddedCustomSchemaDefinition

so for whatever reason this is considered important. Even the comment in the AbstractPlatform::columsEqual talks about this specifically to ignore the column definition.

So maybe a solution would be to add comments back so that the type can be deduced properly?

greg0ire commented 4 months ago

I don't think that's the direction we want to go, no. Re-reading the cookbook you mentioned I see

In this case however Schema-Tool update will have a hard time not to request changes for this column on each call.

It seems like that might have been fixed in https://github.com/doctrine/dbal/pull/5224, which might make solution 1 the way to go here.

berkut1 commented 4 months ago

@michnovka I don't know why you ignored my solution https://github.com/doctrine/migrations/issues/1441#issuecomment-2207383215. For reference, I had a very similar problem, not with enum, but with the inet type, which is unknown to Doctrine. It tried, as in your case, to convert it to VARCHAR in the down() method and constantly generated diff migrations.

As I mentioned in my comment on the instructions, I solved the problem for myself in this way:

doctrine:
    dbal:
        mapping_types:
            inet: my_inet

        types:
            my_inet: { class: 'App\Model\InetType' }

I want to clarify, as stated in the instructions at https://www.doctrine-project.org/projects/doctrine-orm/en/3.2/cookbook/mysql-enums.html, it is recommended to do it like this

$conn->getDatabasePlatform()->registerDoctrineTypeMapping('enum', 'string');

but this solution works only for DBAL3. For DBAL4, you need to refer not to string but to your newly registered custom type, as I indicated in my previous comment.

However, if you haven't tried this solution at all, you can first refer to string; it might work with enum.

michnovka commented 4 months ago

@berkut1 Thanks for the suggestion, but this is not a solution for my case.

It works in your case when you have 1:1 mapping between DB type INET and your class MY_INET. So your code is 100% certain, when it finds INET type in DB to use that specific type.

But in my case, every ENUM is different. The example I provided is ENUM('submit','cancel') and it uses custom type ActionTypeType extends AbstractEnumType. AbstractEnumClass is my base class for enums, and I have many others. There is e.g. ColorEnumType extends AbstractEnumType which creates ENUM('black','red'). So DB ENUM type maps to multiple PHP Type classes.

berkut1 commented 4 months ago

@michnovka

Yes, I understand, which is why I initially suggested creating an abstract Enum class and registering it, then inheriting from it for custom types. If I understand correctly how Doctrine works with types, when comparing schemas, it only compares the types of SQL, and the comparison of ColumnDeclarationSQL happens only at the final stage: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Platforms/AbstractPlatform.php#L2199-L2204

In other words, you need to make Doctrine understand that it has a base Enum class for comparison, with which it will compare all the descendants of this class. This test for DBAL3 with disabled type comments, show the logic: https://github.com/doctrine/dbal/blob/893417fee2bc5a94a10a2010ae83cab927e21df3/tests/Platforms/AbstractPlatformTestCase.php#L1448

michnovka commented 4 months ago

@berkut1 this does not work for the reason I mentioned. The same way that for doctrine VARCHAR(16) and VARCHAR(65) are both the same StringType, it considers ENUM('black','red') and ENUM('fast','slow') as the same type. Since the type has no idea about which enum to use as a base.

So what I can do, is map ENUM(WHATEVER) to some "abstract" Enum class (notice the quotes around abstract, as the class cannot be abstract, as it has to be instantiated).

But then, how can this "abstract" enum class generate valid SQL code in getSQLDeclaration, and how can it cast properly to the actual specific PHP enum in convertToPHPValue ? So the approach I took, is I do actually have an AbstractEnumType class and I extend this for every single PHP enum I want to use as MySQL ENUM. This way every column that is mapped to this type is actually an instance of some child of AbstractEnumType but it is a different child for every enum I use, therefore it performs proper casting and can create the appropriate SQL declaration.

And this is where we come to the issue which I am not sure how to address with some easy fix. ENUM('black','red') and ENUM('fast','slow') should be treated as different types.

The only solution is to compare the SQL read from INFORMATION_SCHEMA with the SQL generated for a new schema.

Now we do:

  1. read INFORMATION_SCHEMA from DB and populate old schema's Column using DBAL's SchemaManager<- this is where it fails, as the Type field of old schema is not 100% correct with custom types
  2. populate new schema's Column using Migration's DiffGenerator::createToSchema <- this is correct, as it reads types properly from attributes
  3. We compare the 2 columns <- they always differ as the old schema cannot know the proper type to use.

I tried to implement https://github.com/doctrine/migrations/issues/1441#issuecomment-2208783407 leveraging Column::$_columnDefinition but this is not OK, it creates tons of failing tests because this is not what the field was meant for.

michnovka commented 4 months ago

@greg0ire Idea for workaround:

How about we implement an attribute #[Doctrine\Migration\Mapping\IgnoreInMigrations] and if we use it on a type, it will ignore the column, or potentially whole Entity? If new schema has a column/entity with this attribute, it will simply not care if the old schema has it/does not have it/has it different. Simply skip any check.

This is different than DBAL's schema_filter, since that one filters out tables that are present in the database.

Migration bundle can then show a warning that some columns/tables were ignored. And its up to the developer to manage such fields in migrations manually.

PowerKiKi commented 4 months ago

I am having the exact same issue as @michnovka. I have a bunch of different PHP enums, each of them is used to declare a DBAL type enum via inheritance of a custom PhpEnumType DBAL type. That parent will use ENUM type in DB, instead of the VARCHAR used by the column attribute enumType. So it's basically https://knplabs.com/en/blog/how-to-map-a-php-enum-with-doctrine-in-a-symfony-project/

But now migrations are always re-generated for those columns :-/

@michnovka, the workaround you suggest seems like quite a big regression feature-wise. Having auto-managed enums is really great. And there is no reason that Doctrine should not treat ENUM differently than any other types. It should be supported as well as other types. And if not out-of-the-box, at least Doctrine should let us the possibility to implement them with reasonable effort.

@greg0ire I get it that DBAL does not require comments anymore for its out of the box behavior. Doctrine is improving, that's good for all of us. But it does not mean that comments don't have any real use outside of DBAL internals, to solve real use-cases.

The fact that the removal of comment broke the cookbook to use ENUM in DB is very telling. You suggest to fallback on the cookbook that use VARCHAR instead. It might work for some, but it might also be unacceptable for some project to change all column type from ENUM to VARCHAR.

I think this whole thing boils down to: how can Doctrine supports ENUM as a first-class citizen, and not what seems to only be an afterthought ?

PS: but https://github.com/doctrine/dbal/issues/5308#issuecomment-1058391601 suggests that there is a broader issue that goes beyond ENUM too...

greg0ire commented 4 months ago

to change all column type from ENUM to VARCHAR.

Can you clarify exactly what you mean by that?

how can Doctrine supports ENUM as a first-class citizen, and not what seems to only be an afterthought ?

It would seem like it cannot

PowerKiKi commented 4 months ago

to change all column type from ENUM to VARCHAR.

Can you clarify exactly what you mean by that?

What I mean that is that before DBAL 4.0, the cookbook solution 2 allowed to have a DB type of ENUM.

But now it will re-generate migrations everytime. So it appears the only way to avoid that, is to use the built-in support for PHP enum, via either the enumType attribute, or the implicit typed field mapper, as shown below. Because I don't think the cookbook solution 1 can work with PHP enum without unreasonable boilerplate. But both of those solutions will change the DB colum from ENUM to VARCHAR, and thus losing the advantages of DB enum. And our DB will become a mess of non-validated data.

Given this entity:

#[ORM\Entity]
class Task
{
    #[ORM\Column(type: 'Priority')]
    private Priority $priorityViaCustomType;

    #[ORM\Column(enumType: Priority::class)]
    private Priority $priorityViaEnumType;

    #[ORM\Column]
    private Priority $priorityViaTypedFieldMapper;
}

I will get the following SQL for MariaDB:

ALTER TABLE task
ADD priority_via_custom_type ENUM('low', 'normal', 'high') NOT NULL COMMENT '(DC2Type:Priority)', -- DBAL 4.0 won't have the comment anymore, so it will generate migrations everytime 😭 
ADD priority_via_enum_type VARCHAR(255) NOT NULL, -- Not validated by DB 😭
ADD priority_via_typed_field_mapper VARCHAR(255) NOT NULL; -- Not validated by DB 😭
berkut1 commented 4 months ago

@PowerKiKi If your database has dependencies on the ENUM attribute type, which affect data consistency, doesn't that indicate you have database architecture issues? I think you won't be able to convince Doctrine contributors to change their opinion about ENUM unless you take on the monumental task of implementing ENUM support for all the database systems Doctrine works with.

PowerKiKi commented 4 months ago

You are correct, Doctrine won't change their opinion, and that's fine by me. However they could make it possible to implement it with custom code, and they should definitely strive not to break use-cases, as long as the maintenance cost is reasonable.

At best, dropping comments generated a lot of confusion, or at worse it broke some use-cases as illustrated by the amount of activity around that topic:

And it is not only limited to enum, according to https://github.com/doctrine/dbal/issues/5308#issuecomment-1058391601

It is a bit frustrating because there have been people warning of the situation very early.

berkut1 commented 4 months ago

You mentioned my issue as well :) Actually, I welcome the new type definition system because it solved (at least for me) many non-obvious errors that DC2Type was hiding by disabling part of the code. I have more of a question about the remnants of DC2Type, such as inet -> string, causing DBAL3 to ignore the schema and assume that INET = VARCHAR and left INET in migrations. However, in DBAL4 the logic changed, but the inet -> string rule remained, which seems strange and will be a surprise for many when they find out that all INETs have become VARCHARs. It would be more appropriate to remove inet -> string (and other similar types) altogether so that an exception is thrown and people can create a custom type themselves.

However, I don't understand why you can't do something similar for ENUM. I don't see a reason why it shouldn't work, unless if you're using third-party packages.

greg0ire commented 4 months ago

But both of those solutions will change the DB colum from ENUM to VARCHAR

:thinking:

Then why does the cook book solution 1 mention enums? columnDefinition: "ENUM('visible', 'invisible')"

PowerKiKi commented 4 months ago

columnDefinition: "ENUM('visible', 'invisible')" will "not work", meaning that it will keep generating migrations. Below I force to update the DB structure, then immediately ask for a diff. This should always be empty, but it incorrectly think that my field is not the same:

#[ORM\Column(type: "string", columnDefinition: "ENUM('visible', 'invisible')")]
private $myTestField;
$ ./bin/doctrine orm:schema-tool:update --force && ./bin/doctrine orm:schema-tool:update --dump-sql | grep task
 Updating database schema...

     11 queries were executed

 [OK] Database schema updated successfully!                                                                             

ALTER TABLE task CHANGE my_test_field my_test_field ENUM('visible', 'invisible');    <=== INCORRECT

So that solution has no advantage over solution 2, because both generate infinite migrations. And it is very difficult to scale, and possibly requires a lost of copy/paste, since PHP attributes cannot use expressions.

So if we are stuck with infinite migrations anyway, I might as well use solution 2, which is easy to use and scale much better.

So the only choices we have are:

It's a rather annoying regression in both cases in my opinion

greg0ire commented 4 months ago

columnDefinition: "ENUM('visible', 'invisible')" will "not work", meaning that it will keep generating migrations.

Weird, because I stumbled upon https://github.com/doctrine/dbal/pull/5224 recently, which is supposed to prevent that.

berkut1 commented 4 months ago

Weird, because I stumbled upon https://github.com/doctrine/dbal/pull/5224 recently, which is supposed to prevent that.

I suspect it's the same problem I tried to describe here using INET as an example. Apparently, I'm not very good at explaining things :)

berkut1 commented 4 months ago

@PowerKiKi @michnovka If you are using ORM, could you test a theory? Remove this check (or only 464 line) and see if it helps with ENUM. https://github.com/doctrine/orm/blob/9b09cd03c0f4577174dbf7af92644fa8f78faf64/src/Tools/SchemaTool.php#L463-L465

Especially if you are using this solution - https://www.doctrine-project.org/projects/doctrine-orm/en/3.2/cookbook/mysql-enums.html#solution-1-mapping-to-varchars

meiyasan commented 3 months ago

@berkut1 @greg0ire I have been facing similar problem in dbal#6444 ; using DBAL/ORM last released version. I just quickly dug out a project implementing a custom EnumType class declaring a string column and using ENUM() declaration in db. I just tried to remove the line you mentioned and the migration issue still occurs.

I still can't get back the proper object neither. As far as my investigation went, this might perhaps work for pure PHP builtin enums; but still custom types would always be failling in my opinion: such as set types, or any other custom type requiring COMMENT information to translate back. I cannot see any other solution but making use of COMMENT and DC2Type declaration at the moment, but I am not an expert. Please advice.

berkut1 commented 3 months ago

@xkzl what diff migration did you get after removing the line 464?

PowerKiKi commented 3 months ago

@PowerKiKi @michnovka If you are using ORM, could you test a theory? Remove this check (or only 464 line) and see if it helps with ENUM. https://github.com/doctrine/orm/blob/9b09cd03c0f4577174dbf7af92644fa8f78faf64/src/Tools/SchemaTool.php#L463-L465

Especially if you are using this solution - https://www.doctrine-project.org/projects/doctrine-orm/en/3.2/cookbook/mysql-enums.html#solution-1-mapping-to-varchars

@berkut1, I tried that too. My mapping is not valid anymore, and I get the exception:

Doctrine\DBAL\Exception\InvalidColumnDeclaration: Column "role" has invalid type
[...]

Caused by
Doctrine\DBAL\Exception\InvalidColumnType\ColumnLengthRequired: Doctrine\DBAL\Platforms\MariaDB1060Platform requires the length of a VARCHAR column to be specified
berkut1 commented 3 months ago

@PowerKiKi I see, different platforms have their own logic here. In your case, it doesn’t make sense to remove this check. My idea was that when checking the SQL declaration, the same values should be compared.

https://github.com/doctrine/dbal/blob/4.0.x/src/Platforms/AbstractPlatform.php#L2199-L2202

You can read more about it here. https://github.com/doctrine/dbal/issues/6468#issuecomment-2242324006

As I see in the first message of this issue, the migration function down() represents ENUM as VARCHAR(0), so I assumed that maybe by default, DBAL sets the string length as 0 for MySQL/MariaDB and expects the same from ORM, but ORM assigns a standard length of 255 to any strings.

Overall, take a look at which strings are being compared here and how DBAL represent ENUM there: https://github.com/doctrine/dbal/blob/4.0.x/src/Platforms/AbstractPlatform.php#L2199-L2202

PowerKiKi commented 3 months ago

I setup https://github.com/PowerKiKi/doctrine-migrations-issues-1441, a repository demonstrating that none of the approaches suggested throughout the entire discussion are working anymore. Feel free to create PR over there to test new approaches.

@berkut1 (and to a lesser extent @greg0ire), I think you are missing the point that DBAL issue is before the comparison step, as you suggested by mentioning AbstractPlatform::columnsEqual(). The issue is when we read the database structure. DBAL will introspect the DB and find "enum('visitor','member','admin')". But then it will incorrectly truncate the type down to only "enum". So DBAL effectively assumes that "enum('visitor','member','admin')" and "enum('new','active','archived')" are the same type. But they clearly are not the same and are absolutely not interchangeable.

From that incorrect assumption it all goes downhill and $this->platform->getDoctrineTypeMapping($dbType) will incorrectly return the first (or last? :shrug:) custom PHP type that declared it is mapping to "enum" via its \Doctrine\DBAL\Types\Type::getMappedDatabaseTypes().

The core issue really is that: DBAL introspection incorrectly simplifies database types and mixes up unrelated database types. Then of course custom PHP types cannot be reliable, if the foundation is broken. We should be able to use getMappedDatabaseTypes() to declare that the type maps to a specific enum, eg: "enum('visitor','member','admin')". But we cannot.

In DBAL < 4.0 we were able to workaround that by using column comment in database, because it essentially completely bypassed auto-detection by "hardcoding" which custom PHP type it matches.

Since that was dropped, we don't have any way to keep the behavior pre-4.0. I am only paraphrasing https://github.com/doctrine/migrations/issues/1441#issuecomment-2207348574 here, until here.

I feel we have two solutions going forward:

IMHO, the first solution is the proper one, but it also means it might break a lot more existing code. At least all enums and all sets and possibility other things too (depending on how well we code :sweat_smile:).

The second solution is the easiest and safest one. Restoring the code, and restoring concepts that everybody knows, and unbreaking existing code without any changes in user code. That might be less proper, but I'm inclined to still opt for that, because of the immediate simplicity and benefits.

All of this also applies to SET. And whichever solution we pick should also work for SET.

tl;dr: "enum('visitor','member','admin')" != "enum('new','active','archived')"

berkut1 commented 3 months ago

@PowerKiKi Okay, now I understand your point of view. It’s true that it doesn’t check the actual ENUM values, but for many custom types, this was also how it worked in DBAL3. In fact, it never really worked properly, and people used various hacks to get around it, like manually editing migrations.

It might be better to completely remove ENUM from DBAL, along with other similar types that no longer work because DBAL and ORM have different views on the default VARCHAR value, and because comment hints are no longer available.

PowerKiKi commented 3 months ago

remove ENUM from DBAL

ENUM does not exist in DBAL. There is no code that specifically mention that (except one single occurence in test). So there is nothing to remove really.

But I am a bit shocked that you suggested it. We've been asking to keep a way for us to implement our own support for ENUM, and you suggest the exact opposite. I feel I am not being heard at all here. I get it that DBAL has no interest to support it out-of-the-box. That's fair. But it feels like you are actively trying to prevent us to achieve our goal, whereas there could be a reasonable compromise to be made, but you don't seem to consider them seriously at all.

I would like to hear the opinion of other people too. Someone with much more insights that the both of us, such as @greg0ire or @derrabus ? would you accept a PR that avoid truncating type declaration for ENUM and SET ?

berkut1 commented 3 months ago

But I am a bit shocked that you suggested it. We've been asking to keep a way for us to implement our own support for ENUM, and you suggest the exact opposite

@PowerKiKi Not exactly. As far as I understand, the main contributors are strongly against bringing back comment hints, or something similar. If they're against it, then why keep non-functional pieces of code at all?

michnovka commented 3 months ago

@berkut1 you cannot remove ENUM from DBAL4 bacause it simply is not there. What you can remove is an example on docs for how to make custom ENUM() column. While it still works to create the DBAL schema, it will fail with migrations.

@PowerKiKi I think this thread has the underlying issue described to the very detail:

The issue is that oldSchema is created based on database ONLY. It ignores PHP code. And thats good, as the php code defines the new schema, if it took the PHP code as a hint of the type, then migration has no point, as it would be the same type always.

Any type which is not natively supported by DBAL will face this issue. We have a one-way function from PHP code -> DB representation, and we have no way to reverse this with accuracy, because

  1. multiple PHP representations can lead to the same DBAL type
  2. only default built-in DBAL types can be compared in migrations

With ENUMs specifically it is further complicated by the fact that basically every ENUM is a different type. DBAL was not meant to handle such case. It can handle VARCHAR(16) != VARCHAR(63) but this is because length is a valid property of Column. The list of valid properties is:

There is no way to specify ENUM cases. So the best we could get with current DBAL is ENUM(1,2,3) = ENUM('a','b','c').

But that still assumes that ENUM would be a valid type in DBAL, which it is not (and there are no plans to make it so, because it is not abstract enough. Many DBMS do not support this. Its really a MySQL-family thing)

I do understand the reluctancy to bring back comments, but when the decision was made, I do not believe enough thought was given to the real implications. This is not just ENUMs. This is ANY nonstandard DBAL types. INET, SET etc. Exotic ones, yes, but 4.x breaks them.

While I agree to discourage from using typehints, I think this option should be brought back for the few cases, where without them 4.x is broken without a fix where 3.x worked just fine.