doctrine / orm

Doctrine Object Relational Mapper (ORM)
https://www.doctrine-project.org/projects/orm.html
MIT License
9.92k stars 2.51k forks source link

Fix BIGINT validation #11414

Closed derrabus closed 5 months ago

derrabus commented 5 months ago

Fixes #11377.

Our schema validation currently fails if we map a BIGINT column to a property with an int PHP type. While this is technically correct because DBAL 3 hydrates BIGINT values as strings, the error is also unnecessarily strict:

The ORM will cast strings returned by DBAL back to int, if that's the property type. The value range of BIGINT and PHP's int (on 64bit machines) is the same. Storing BIGINT values as string is only necessary if we run on 32bit PHP or we deal with BIGINT UNSIGNED on MySQL/MariaDB. Both are edge cases a project may validly choose to ignore.

Emitting this error encourages downstream projects to change their property types of BIGINT columns from int to string which is unfortunate given that DBAL 4 moved towards returning integers if possible.

derrabus commented 5 months ago

This will probably make us revert most of #11399 by @ThomasLandauer when merging up.

acoulton commented 5 months ago

@derrabus @greg0ire I don't think this is valid - at least not for orm 2.x and dbal 3.x.

Relying on PHP to cast the value to an int when assigned to the entity appears to work on the surface. But under the hood, it means that with default change tracking, $entity_manager->flush() will always treat every instance of that entity as changed and issue an extra UPDATE for each one.

This is because https://github.com/doctrine/orm/blob/b725908c83a5ef7fdc06e0053e7fcfd54d34340b/src/UnitOfWork.php#L787-L790 compares the current property on the entity with the original value that came back from dbal using ===. And that value will still be a string, so '1' === 1 fails and the property will be added to the changeset even if the entity has not actually changed.

We were bitten by this in production a while ago and realised that in fact you can only safely use bigint as a php int by registering a custom DBAL type returning an int, so that UnitOfWork is comparing against the correct original value. This is the case for any situation where the entity property typecasts the value returned from the underlying DBAL type.

greg0ire commented 5 months ago

And that value will still be a string

Are you sure?

DBAL 4 moved towards returning integers if possible.

acoulton commented 5 months ago

@greg0ire with DBAL 3.x it will be, yes - I tested locally.

With DBAL 4.x I suppose it will vary - in theory if you had code running on different platforms then sometimes the data in the UnitOfWork->originalEntityData will be a string and sometimes an int (depending on actual value, it won't be consistent to the platform). So you might sometimes get a false-dirty entity if you have typed the entity to always take int. Although that's probably quite a rare edge case.

greg0ire commented 5 months ago

I think you're correct. On doctrine/orm 2 we should recommend string and nothing else, I should have caught this. @derrabus should we revert?

derrabus commented 5 months ago

I'll look into it.

derrabus commented 5 months ago

My main problem is this:

Emitting this error encourages downstream projects to change their property types of BIGINT columns from int to string which is unfortunate given that DBAL 4 moved towards returning integers if possible.

I don't want to projects to do big migrations to string properties just to revert back to integers once DBAL 4 is installed. That's why I'd rather not emit an error in this special case. But you are right, with BIGINT mapped to integers we run into the very problem that motivated introducing this validation.

acoulton commented 5 months ago

The safest thing might be to update UnitOfWork to cope with this case and e.g. cast both original & current value back to string internally when it checks to see if a bigint column has changed.

I think that would still detect genuine changes, but avoid any risk of accidental changesets across both DBAL 3&4 and all platforms / possible values?

Similar to how UnitOfWork already has a workaround to cast enums in the entity properties back to strings before comparing to the DBAL value

Off the top of my head I don't think that would break any existing behaviour if it was possible to apply it only to bigint columns?