Open mcurland opened 1 year ago
I'll make a pull request with the suggested changes to PDO\statement.php, but I'm not claiming this is the only driver that experiences this issue.
If you include a functional test that reproduces your problem, we shall see what other drivers are affected.
Anyway, I'm unsure if this is really a bug in the DBAL or if it's the ORM that uses the DBAL in a wrong way. I don't think those resources produced by the DBAL for the binary field are meant to be read more than once.
Hi Alexander, thank you for the quick response.
I think it's clear that the ORM is assuming the data passed to the DBAL layer is immutable. Even the parameter names ($value for a resource) indicate the expectation of an immutable value. Apart from initializing a resource (including the rewind when the data is in) the ORM layer doesn't have any handling for individual data types. So, I'm not sure that this is truly a DBAL bug either, but asking for the ORM layer to switch from treating column values as inherently immutable is a huge ask and will produce a major code change.
Basically, by propagating binary data as a resource (instead of, for example, a byte string or byte array) you're in a situation where reading the data modifies it. I don't necessarily disagree with this choice (stream data can be large, so you don't want to copy it arbitrarily into a less stateful structure), but it clearly causes problems. Any time I read it the line before return is a call to rewind.
For my codebase, the DDL, Doctrine classes, and an I/O layer to read client data (change or query) and translate it to the O/RM requests is all generated. There is no custom interaction with the Doctrine entities and the queries and changes are JSON-driven, so the idea that I can't retrieve the population of two collections with one request is really bad (like target another framework bad). In this case, the generated I/O layer can be re-created (through a settings file) to handle UUID values as native/character/binary. The settings are meant to match the expected back end uuid support. All I'm doing is swapping out the type of the identifier. It shouldn't break the code.
I wrote a stripped-down model (pasted below) and generated some basic entity classes for you to work with, but having burned several days on this already I just don't have the bandwidth to spend any more getting the full repository building and testable. Hopefully this helps and will let you generate and run a db to any driver.
<?php
namespace App\Test\BinRef;
use Doctrine\Common\Collections\ArrayCollection; {
/**
* @Entity
* @Table(name="FirstType", uniqueConstraints={@UniqueConstraint(name="FirstType_PK", columns={"firstTypeId"})})
*/
class FirstType {
/**
* @Column(name="firstTypeId", type="integer", nullable=false)
* @Id
* @GeneratedValue
*/
protected $firstTypeId;
/**
* @Column(name="data", type="string", length=48, nullable=true)
*/
protected $data;
/**
* @ManyToOne(targetEntity="TargetType", inversedBy="firstTypeCollection")
* @JoinColumn(name="targetTypeUuid", referencedColumnName="targetTypeUuid")
*/
protected $targetType;
public function getFirstTypeId() {
return $this->firstTypeId;
}
public function setFirstTypeId($value) {
$this->firstTypeId = $value;
}
public function getData() {
return $this->data;
}
public function setData($value) {
$this->data = $value;
}
public function getTargetType() {
return $this->targetType;
}
public function setTargetType($value) {
$current = &$this->targetType;
if ($current !== $value) {
if ($current !== null) {
$current->_syncFirstTypeCollection($this, true);
}
if ($value !== null) {
$value->_syncFirstTypeCollection($this, false);
}
}
unset($current);
$this->targetType = $value;
}
public function __toString() {
return spl_object_hash($this);
}
}
/**
* @Entity
* @Table(name="SecondType", uniqueConstraints={@UniqueConstraint(name="SecondType_PK", columns={"secondTypeId"})})
*/
class SecondType {
/**
* @Column(name="secondTypeId", type="integer", nullable=false)
* @Id
* @GeneratedValue
*/
protected $secondTypeId;
/**
* @Column(name="data", type="string", length=48, nullable=true)
*/
protected $data;
/**
* @ManyToOne(targetEntity="TargetType", inversedBy="secondTypeCollection")
* @JoinColumn(name="targetTypeUuid", referencedColumnName="targetTypeUuid")
*/
protected $targetType;
public function getSecondTypeId() {
return $this->secondTypeId;
}
public function setSecondTypeId($value) {
$this->secondTypeId = $value;
}
public function getData() {
return $this->data;
}
public function setData($value) {
$this->data = $value;
}
public function getTargetType() {
return $this->targetType;
}
public function setTargetType($value) {
$current = &$this->targetType;
if ($current !== $value) {
if ($current !== null) {
$current->_syncSecondTypeCollection($this, true);
}
if ($value !== null) {
$value->_syncSecondTypeCollection($this, false);
}
}
unset($current);
$this->targetType = $value;
}
public function __toString() {
return spl_object_hash($this);
}
}
/**
* @Entity
* @Table(name="TargetType", uniqueConstraints={@UniqueConstraint(name="TargetType_PK", columns={"targetTypeUuid"})})
*/
class TargetType {
/**
* @Column(name="targetTypeUuid", type="binary", length=16, nullable=false)
* @Id
*/
protected $targetTypeUuid;
/**
* @Column(name="targetTypeName", type="string", length=48, nullable=true)
*/
protected $name;
/**
* @OneToMany(targetEntity="FirstType", mappedBy="targetType")
*/
protected $firstTypeCollection;
/**
* @OneToMany(targetEntity="SecondType", mappedBy="targetType")
*/
protected $secondTypeCollection;
public function getTargetTypeUuid() {
return $this->targetTypeUuid;
}
public function setTargetTypeUuid($value) {
$this->targetTypeUuid = $value;
}
public function getName() {
return $this->name;
}
public function setName($value) {
$this->name = $value;
}
public function addToFirstTypeCollection($value) {
$value->setTargetType($this);
}
public function removeFromFirstTypeCollection($value) {
$value->setTargetType(null);
}
public function getFirstTypeCollection() {
return $this->ensureFirstTypeCollection()->toArray();
}
public function _syncFirstTypeCollection($value, $remove) {
$this->ensureFirstTypeCollection()->{$remove ? 'removeElement' : 'add'}($value);
}
public function addToSecondTypeCollection($value) {
$value->setTargetType($this);
}
public function removeFromSecondTypeCollection($value) {
$value->setTargetType(null);
}
public function getSecondTypeCollection() {
return $this->ensureSecondTypeCollection()->toArray();
}
public function _syncSecondTypeCollection($value, $remove) {
$this->ensureSecondTypeCollection()->{$remove ? 'removeElement' : 'add'}($value);
}
private function ensureFirstTypeCollection() {
if (null === ($coll = $this->firstTypeCollection)) {
$this->firstTypeCollection = $coll = new ArrayCollection();
}
return $coll;
}
private function ensureSecondTypeCollection() {
if (null === ($coll = $this->secondTypeCollection)) {
$this->secondTypeCollection = $coll = new ArrayCollection();
}
return $coll;
}
public function __toString() {
return spl_object_hash($this);
}
}
}
?>
If you don't have a UUID creation function handy this set will do (obviously put in a class, expected string format is like 'af8034e4-c954-40a6-b02f-4b224364c3a2'):
private static function createUuid() {
$data = openssl_random_pseudo_bytes(16);
// Set version to 0100
$data[6] = chr(ord($data[6]) & 0x0f | 0x40);
// Set bits 6-7 to 100
$data[8] = chr(ord($data[8]) & 0x3f | 0x80);
return self::resourceFromByteString($data);
}
private static function uuidResourceFromString($uuidString) {
return $uuidString ? self::resourceFromByteString(hex2bin(str_replace('-', '', $uuidString))) : null;
}
private static function resourceFromByteString($byteString) {
$res = fopen('php://temp', 'rb+');
fwrite($res, $byteString);
rewind($res);
return $res;
}
private static function uuidStringFromResource($uuidResource) {
if ($uuidResource) {
$bytes = stream_get_contents($uuidResource);
rewind($uuidResource);
return self::uuidStringFromByteString($bytes);
}
return null;
}
private static function uuidStringFromByteString($bytes) {
if ($bytes && strlen($bytes) === 16) {
$hexBytes = bin2hex($bytes);
return substr($hexBytes, 0, 8) . '-' . substr($hexBytes, 8, 4) . '-' . substr($hexBytes, 12, 4) . '-' . substr($hexBytes, 16, 4) . '-' . substr($hexBytes, 20, 12);
}
return null;
}
$em->getRepository('App\\Test\\BinRef\\TargetType')->findOneBy(['targetTypeUuid' => self::uuidResourceFromString($id)]);
You'll also see failures persisting if you create the objects and try to persist them--you'll get an FK violation for the second serialization ref to the same TargetType entity.
Havent tested yet your commit but I am having a similar issue with the secondary entities.
Bug Report
If an entity has a binary identifier field then the id value is managed as a PHP resource. Of course, a resource is only meaningful if it has a reasonable current position. Specifically, if the value is read more than once then the stream position needs to be restored in between reads.
In my case (Postgres or Oracle over PDO) PDOStatement->execute moves the resource pointer, but this likely fails for other drivers as well. The bottom line is that a resource is not quite an immutable value, so the position needs to be restored between reads to get consistent data.
1) When the entity is populated the id resource is at position 0. 2) When the first lazy load collection (likely eager as well, I just haven't tested) the correct id is produced, so the sql log (postgres, but this is not a postgres-specific issue) looks something like $1='\xabcdef'. 3) When the second lazy load collection is queried the sql log shows $1='\x'. This should be the same as the first value. It obviously never returns data.
Clearly, the resource is being used and not rewound to its original position, so the second use fails. There is absolutely nothing fancy here, I'm just retrieving an entity with two one-to-many lazy load collections. However, similar behavior is seen with a write operation, where the first join succeeds and the second shows a foreign-key violation on the empty binary string data.
I tracked the resource pointer movement to the PDOStatement->execute call in doctrine\dbal\src\Driver\PDO\statement.php. Looking further into the PHP sources (https://github.com/php/php-src/blob/master/ext/pdo/pdo_stmt.c) shows
Which reads the stream but does not restore it. So, short of changing the PDOStatement behavior in PHP, the resources either need to be copied before the call (during parameter binding) or reset to their prior position (usually 0, but not guaranteed) after the call. There is little point in copying resources since the copy also moves the original pointer, which leads to a solution of simply restoring the position after the execute call.
I'll make a pull request with the suggested changes to PDO\statement.php, but I'm not claiming this is the only driver that experiences this issue. The collaborator team will need to decide if this is a dbal issue or needs to be handled higher in the stack (BasicEntityPersister, UnitOfWork->loadCollection, etc). There are obviously a ton of both read and write scenarios that funnel down into PDOStatement->execute, so fixing it higher up the stack would be significantly more code (and might be unnecessary for statement implementations associated with other drivers, TBD).
I'm currently using the 2.8 bundle installed with Symfony.
Summary
Binary identifier in entity can be used once as a parameter to PDOStatement->execute but fails on all subsequent operations.
Current behaviour
How to reproduce
Expected behaviour
This should work the same as any other identifier and allow multiple internal uses.