doctrine / orm

Doctrine Object Relational Mapper (ORM)
https://www.doctrine-project.org/projects/orm.html
MIT License
9.87k stars 2.49k forks source link

References with entity identifier are improperly tracked, need flattened identifier. #5640

Open mcurland opened 8 years ago

mcurland commented 8 years ago

Problem: Doctrine\ORM\EntityManager#getReference (also applies to find and getPartialReference) calls unitOfWork->tryGetById without first flattening the identifier, which is done for all calls to tryGetById from inside the UnitOfWork class. Without flattening the identifier the reference does not properly hydrate when it is written to because it cannot find itself in the unit of work based on its own identifier.

A sample stripped down class where the identifier flattening is needed is shown here. Note the identifier is another entity. The highly unsettling behavior I was seeing was that retrieving a reference to this object, writing a simple value property, then flushing the change would delete all other properties on the object (except the identifier). The internal sequence occurs when the first property is written to, which forces the proxy to fetch from the database. However, the fetch fails silently because the entity cannot find itself in the unit of work, resulting in another instance being populated. So, you now have a proxy that thinks it is populate but has a bunch of empty fields. Writing a new field and flushing overwrites all of the old fields in the database! Instant data vaporization. Subsequent complaints.

/**
* @Entity
* @Table(name="garage", uniqueConstraints={@UniqueConstraint(name="garage_PK", columns={"id"})})
*/
class Garage {
    /**
    * @OneToOne(targetEntity="BusinessEntity", inversedBy="garage")
    * @JoinColumn(name="id", referencedColumnName="id")
    * @Id
    */
    protected $businessEntity;
}

The issue is the three calls to unitOfWork->tryGetById in the EntityManager class (in the find, getReference, and getPartialReference methods). I'm using find and getReference in production and have been using this fix for a while, but I do not use getPartialReference. Therefore getPartialReference will need additional similar code for a complete fix of this issue.

I have a basic fix to expose a flattenIdentifer method on unit of work. This works, but i don't consider it elegant. The cleaner approach would be a public tryGetById that accepts an external (meaning non-flattened) identifier then does the right thing. IMO this bug was caused by poor encapsulation in the UnitOfWork class.

mcurland commented 8 years ago

Added pull request https://github.com/doctrine/doctrine2/pull/5641

Ocramius commented 8 years ago

Very sneaky bug you found there. Patch looks ok, but:

1) misses test case 2) we don't want to expose further unit of work public API: we should probably try reducing the public API of it, instead of bloating it even more...

mcurland commented 8 years ago

Hi Marco,

I understand your concerns on the patch. This was meant to be more of a starting point for the core collaborator team than a finished product (it's missing getPartialReference checks apart from anything else). The code presented was sufficient to get my project back on track, and I've been using the fix since 2.5.2. Once I tracked this down to the tryGetById failing, the missing code was readily apparent by comparing the the calls from inside UnitOfWork to tryGetById to the EnityManager calls to tryGetById.

I figured there would be an objection to expanding the UnitOfWork public API, but it already had all of the functionality needed to do this and I didn't want to waste cycles by replicating the flatten helper instance outside unit of work. IMO, the primary issue that caused this in the first place is that UnitOfWork->tryGetById was already public, but expected internal-style (flattened) identifiers to work correctly.

Given that all(?) external calls from the current code base to tryGetById are already broken, one option is to make the current tryGetById private function and make a new public tryGetById that assumes an external, non-flattened identifer. This allows the flattening algorithm to remain inside unit of work, with the small downside of changing invalid existing behavior on a public API. Even the notion of 'flattening' to make an internal id should probably remain private to UnitOfWork.

In any case, this is only a few lines of code that can be refactored as you please. The public API decisions are yours to make, not mine. I just want composer on a new server instance to work without patching, so finally took the time to report this in detail. Running without this change is not an option for me because of the data loss issues. After I learned how to use the library this is the first major issue I've seen, so all of the good work is much appreciated.

-Matt

mcurland commented 8 years ago

Is the doctrine team waiting for me on something? I'm not sure what else I can do here. The design on how to fix this is up to the collaborator team, not me.

This is a data-loss bug and needs to be fixed sooner rather than later.