doctrine / dbal

Doctrine Database Abstraction Layer
https://www.doctrine-project.org/projects/dbal.html
MIT License
9.4k stars 1.33k forks source link

Type Mapping Discrepancies Between DBAL3 and DBAL4 #6466

Open berkut1 opened 1 week ago

berkut1 commented 1 week ago
Q A
Version 4.0.x

This PR https://github.com/doctrine/dbal/pull/6463 and this topic https://github.com/doctrine/migrations/issues/1441 have highlighted a peculiar behavior.

In this case, let's consider the inet type, which was added to the PostgreSQL platform a long time ago - https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Platforms/PostgreSQLPlatform.php#L706

After some tests, I noticed that in DBAL3 and DBAL4 they behave strangely. I suspect that inet was added there to work around an issue because Doctrine did not recognize this type. That is, if we have such a migration in DBAL3:

CREATE TABLE test_inet (
                id INT GENERATED BY DEFAULT AS IDENTITY NOT NULL,
                ip_inet INET NOT NULL, 
                PRIMARY KEY(id)
);

And such an entity:

#[ORM\Table(name: "test_inet")]
#[ORM\Entity]
class TestInet
{
    #[ORM\Id]
    #[ORM\Column(name: "id", type: Types::INTEGER, nullable: false)]
    #[ORM\GeneratedValue(strategy: "IDENTITY")]
    private ?int $id = null;

    #[ORM\Column(name: "ip_inet", type: Types::STRING, nullable: false)]
    private string $ipInet;
}

Then DBAL3 will ignore the fact that the migration type is INET, deciding that it is VARCHAR and will keep the type as INET. However, in DBAL4, this trick no longer works, and we have to redefine the type as follows:

doctrine:
    dbal:
        mapping_types:
            inet: override_inet

        types:
            override_inet: { class: 'App\Entity\InetType' }
#[ORM\Table(name: "test_inet")]
#[ORM\Entity]
class TestInet
{
    #[ORM\Id]
    #[ORM\Column(name: "id", type: Types::INTEGER, nullable: false)]
    #[ORM\GeneratedValue(strategy: "IDENTITY")]
    private ?int $id = null;

    #[ORM\Column(name: "ip_inet", type: 'override_inet', nullable: false)]
    private string $ipInet;
}

So, my question, what function does inet perform in initializeDoctrineTypeMappings() or other similar custom types? And isn't this a bug? Or am I using or understanding it incorrectly?

Therefore, I afraid that if the PR https://github.com/doctrine/dbal/pull/6463 is applied to DBAL4 in the same way, it will not work correctly and will cause the same problems.

Thanks.

berkut1 commented 1 week ago

The same problem exists with the real type, but it is even worse. https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Platforms/PostgreSQLPlatform.php#L720

While we can override inet, real is read as float4 and complete_type as real (which DBAL doesn't use) from here: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/PostgreSQLSchemaManager.php#L412

It is subsequently defined as float: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/PostgreSQLSchemaManager.php#L275

Then, as float, it is mapped to Types::FloatType, which translates to DOUBLE PRECISION: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Platforms/AbstractPlatform.php#L1869-L1872

However, DOUBLE PRECISION is actually float8. As a result, any float type will be defined as double precision without any way to change it.

But then, why are these cases: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/PostgreSQLSchemaManager.php#L318 and also initializeDoctrineTypeMappings() here if they will never work?

UPD:

Sorry, actually it is possible to override, but you need to override float4, not real:

doctrine:
    dbal:
        mapping_types:
            float4: base_real

        types:
            base_real: { class: 'App\Entity\BaseRealType' }
            wide_type: { class: 'App\Entity\WideType' } #inherit from class base_real
            depth_type: { class: 'App\Entity\DepthType' } #inherit from class base_real

However, in that case, the existence of real type in the code doesn't make any sense to me.

greg0ire commented 1 week ago

You've expressed in https://github.com/doctrine/migrations/issues/1441#issuecomment-2225047882 that you maybe did not explained things very well, and I after reading this issue several times, I'm indeed super confused. Let me try to list the things that confuse me here, hopefully we can get something out of it.

After some tests, I noticed that in DBAL3 and DBAL4 they behave strangely.

OK? What is the strange behavior?

I suspect that inet was added there to work around an issue because Doctrine did not recognize this type.

What do you mean by "recognize"?

the migration type

What's a migration type?

deciding that it is VARCHAR

Where/how does it do that?

will keep the type as INET

Hang on. Didn't you just said Doctrine decide the type was VARCHAR?! Super confusing

this trick no longer works

What did "this trick" achieve, and how can you tell it no longer works?

Then you quote this:

doctrine:
    dbal:
        mapping_types:
            inet: override_inet

        types:
            override_inet: { class: 'App\Entity\InetType' }

What is the DBAL equivalent of this DoctrineBundle definition? What does App\Entity\InetType look like?

complete_type

What is complete_type? Never heard of that.

berkut1 commented 1 week ago

OK? What is the strange behavior?

A few years ago, if Doctrine didn't support a certain type but you wanted it to, and also wanted Doctrine migrations to work with your desired type, you would add an entry like this:

doctrine:
    dbal:
        mapping_types:
            cidr: string

(CIDR is a PostgreSQL type - https://www.postgresql.org/docs/current/datatype-net-types.html) This wasn't in the documentation, but I often came across this implementation in discussions, which I think was based on this line of code: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Platforms/PostgreSQLPlatform.php#L706 And in DBAL3 it actually worked. You would add an entry like unknown_type: string, and Doctrine would handle this type as a string while not changing the SQL migration, avoiding attempts to convert the unknown type to VARCHAR. So, if we used the SQL attribute type CIDR, Doctrine would not change it to VARCHAR (if your DB schema already with CIDR).

What do you mean by "recognize"?

If Doctrine doesn't support a type, it throws the following error:

Unknown database type SOMETYPE requested Doctrine\DBAL\Platforms\

This can be resolved if you add an entry like this

doctrine:
    dbal:
        mapping_types:
            new_type: string

or create a PR like this https://github.com/doctrine/dbal/pull/6463

What's a migration type?

Sorry, I meant SQL attribute types in table, for example, the SQL INET data type in PostgreSQL.

Where/how does it do that?

Hang on. Didn't you just said Doctrine decide the type was VARCHAR?! Super confusing

I didn't try to find the cause and compare DBAL3 and DBAL4 to see where the difference happens, because I was trying to ask if this behavior is normal for DBAL4.

What did "this trick" achieve, and how can you tell it no longer works?

I don't know how else to explain it since I already gave an example. Alright, I'll try again :)

In ORM with DBAL3, an entry like this:

#[ORM\Column(name: "ip_inet", type: Types::STRING, nullable: false)]
private string $ipInet;

won't try to generate a new migration. So, if our DB schema looks like this:

CREATE TABLE tbl (
     ip_inet INET NOT NULL
);

It will stay unchanged. But in DBAL4, it will generate a migration and change INET to VARCHAR. So, the trick in DBAL4 no longer works.

What is the DBAL equivalent of this DoctrineBundle definition? What does App\Entity\InetType look like?

it is equivalent to: $conn->getDatabasePlatform()->registerDoctrineTypeMapping(); to mapping_types: Type::addType to types: Isn't it?

App\Entity\InetType it's just a custom type with:

    public function getSQLDeclaration(array $column, AbstractPlatform $platform): string
    {
        return 'INET';
    }

What is complete_type? Never heard of that.

This is a property from the SQL query for PostgreSQL, based on which Doctrine determines which data types are used in table attributes. https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Schema/PostgreSQLSchemaManager.php#L424 But that's not important and is unrelated to the main question in this issue. I mentioned it earlier because some lines of code confused me.

greg0ire commented 1 week ago

Ok… this is quite long and goes in many directions, but I think if I had to sum it up, I would say the following:

Now that platform-aware comparison is the only way to go, is there a point in having a type mapping that maps e.g. the inet database type to the string DBAL type?

Is that what this issue is about?

If yes, personally, I think it might be just a remnant of legacy DBAL versions. If I remove that line and run the tests, they do not fail. I'd be interested to know what other maintainers think about this. @morozov @derrabus , maybe you know more about this?

Now, regarding the issue about the wrong diff, it feels like the part that is not taken into account is the Type::addType part, or the part that uses InetType in the new schema, because if the issue had to do with schema introspection, I think it would generate a migration from VARCHAR to INET (even if you were already using INET), right? You might want to use your debugger here, to check if InetType is used in the new schema or not.

berkut1 commented 1 week ago

No, no, you misunderstood :) The solution of creating a custom type (InetType) and mapping it works as it should, there are no issues there.

The problem is that previously with DBAL3, we could simply tell Doctrine that an unknown type is a string (as it exists with inet -> string), and Doctrine, when comparing columns, wouldn't attempt to change them to VARCHAR but would leave the unknown type in the schema (if you already use the unknown type).

I brought up this topic in the ENUM discussion because there's a similar approach where we define an entity property as enum but use string type. In DBAL3, this worked fine, but in DBAL4, it now changes to VARCHAR. At first glance, I see a similar issue here, but perhaps it's not the same.

You might want to use your debugger here

I can debug later to pinpoint where exactly the differences occur.

morozov commented 6 days ago

So, my question, what function does inet perform in initializeDoctrineTypeMappings() or other similar custom types?

No idea.

greg0ire commented 6 days ago

OK :(

2 things:

  1. The mapping might be completely useless now.
  2. Regardless, I think what you have done @berkut1 (overriding it, and adding a new type) should work.

Let's focus on 2. for now, and if we find a solution, then we might consider deprecating and removing the supposedly useless mappings.

berkut1 commented 6 days ago

@greg0ire

Very interesting, it turns out the reason is related to my previous issue https://github.com/doctrine/orm/issues/11502. Previously, due to a bug, DBAL3 didn't distinguish between VARCHAR and VARCHAR(255), which always made this true: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Platforms/AbstractPlatform.php#L2199-L2204

To explain in more detail: An entry like inet -> string masks the INET type as VARCHAR.

This entry, says that the data type is expected to be VARCHAR(255):

#[ORM\Column(name: "attr_inet", type: Types::STRING, nullable: false)]
private string $attr_inet;

In DBAL3, such "masking" would allow the INET data type to remain in the schema because for DBAL3, it was considered VARCHAR. But in DBAL4, it can distinguish the length of VARCHAR, so the "masking" no longer works and breaks these kinds of tricks.

So, it turns out the problem is in ORM, and indirectly in DBAL4 (because here was fixed an old bug)? :) And as I understand, we can't remove the hardcoded check with 255 string in ORM because it would force everyone to specify the length of VARCHAR. In that case, maybe it's better to remove all types that are masked as strings?

greg0ire commented 5 days ago

Hang on, so it's not using the SQL declaration of your INET type?

https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Platforms/AbstractPlatform.php#L1372

EDIT: oh right, for some reason you're using Types::STRING in your entity now :thinking:

maybe it's better to remove all types that are masked as strings

Do you mean it fixes the issue when you remove the mapping?

masked

Do you mean mapped?

In any case, yes I think that map should not map many database types to the same DBAL type, unless the schema manager code is able to adjust DBAL type options so the database type can be found back when going the other way.

berkut1 commented 5 days ago

@greg0ire Yes, the idea is to trick DBAL without creating a separate custom type class. You might be surprised that the Symfony documentation suggests the same solution. https://symfony.com/doc/current/doctrine/dbal.html#registering-custom-mapping-types-in-the-schematool

EDIT: oh right, for some reason you're using Types::STRING in your entity now

Yes, exactly. This way, in DBAL3 + ORM, developers tricked DBAL to use an unknown data type without creating and registering a custom class. To be precise, DBAL started this first by adding this line: https://github.com/doctrine/dbal/blob/90424473eb144659a89fb1b5c9bca37297085351/src/Platforms/PostgreSQLPlatform.php#L706

Do you mean it fixes the issue when you remove the mapping?

No, that's a drastic solution. The Doctrine team needs to decide if it's okay to trick DBAL or not. If not, then all these types should be removed, and developers should create them using custom types without using tricks. So, the right solution for unknowns type should be in that way:

doctrine:
    dbal:
        mapping_types:
            inet: override_inet

        types:
            override_inet: { class: 'App\Entity\InetType' }

Do you mean mapped?

Yes and no. It's more like masking a certain type as a string, with the intention of tricking Doctrine into not changing the schema, leaving the unknown type in the table schema.

UPD: The same trick is used to support REAL types (but with float), which causes some similar bugs. That's why I made a PR for full support of this type. https://github.com/doctrine/dbal/pull/6467

greg0ire commented 5 days ago

My personal opinion is that we should stop tricking DBAL, given how much confusion this causes, and given we no longer support reverse-engineering (the only use case I know for introspecting the database is diffing). I'd rather have the community create extra packages supporting extra types and automating the solution you showed above.

I'd like to know what other maintainers think.