doctrine / mongodb-odm

The Official PHP MongoDB ORM/ODM
https://www.doctrine-project.org/projects/doctrine-mongodb-odm/en/latest/
MIT License
1.09k stars 504 forks source link

Wrong mapping of $id field. #1674

Closed iskyd closed 3 years ago

iskyd commented 6 years ago

I think this is a wrong behaviour about mapping field in an embedded document. I'm using version: 3.1.0

This is my example:

Main document

/**
 * @ODM\Document
 */
class Main
{
    /**
     * @ODM\Id
     */
    private $id;

    /**
     * @ODM\EmbedMany(targetDocument="Embedded")
     */
    protected $embedded;
}

Embedded document

/**
 * @ODM\EmbeddedDocument
 */
class Embedded
{
    /**
     * @ODM\Field(type="string")
     */
    protected $id;

    /**
     * @ODM\Field(type="boolean")
     */
    protected $active;
}

In the embedded document there is a field called $id that is a string type, not @ODM\Id.

The problem comes when i try to query that field.

$qb
      ->field('embedded')->elemMatch($qb->expr()
          ->field('id')->equals(1)
     )
;

$qb->getQuery()->execute();

This will result in this query: "embedded":{"$elemMatch":{"_id":{"$id":"5a0d7aee26567e3c36166391"}}}

But i suppose to have: "embedded":{"$elemMatch":{"id":1}}

So I investigate in the mapField function and i noticed that:

Doctrine\ODM\MongoDB\Mapping\ClassMetadataInfo line 1063: public function mapField(array $mapping)

  if (isset($mapping['id']) && $mapping['id'] === true) {
            $mapping['name'] = '_id';
            $this->identifier = $mapping['fieldName'];
            ...
 }
malarzm commented 6 years ago

If I get the problem correctly, it stems from both embedded and parent document having id field. To pinpoint, thing is that you use $qb->expr()->field('id')->equals(1) and builder tries to help you by converting id to its database representation, and as far as builder knows, it is supposed to work with Main::class where id is @ODM\Id. I'm afraid there's nothing really we can do, I'd suggest using a different field name in the embedded document or diving a bit deeper into builder to see if the problem can be somehow worked-around.

alcaeus commented 6 years ago

@malarzm I'm not sure that's the issue. The query builder will look for mapping information in the metadata class for Main, not for Embedded. Renaming the field in the embedded document will result in no type conversion. Mapping would properly do type conversion if you called elemMatch('embedded.id'), but the resulting query would be invalid ().

@iskyd Assuming you have a document manager along your query builder, could you try this?

$embeddedExpr = new Doctrine\ODM\MongoDB\Query\Expr($documentManager);
$embeddedExpr->setClassMetadata($documentManager->getClassMetadata(Embedded::class));

$qb
    ->field('embedded')->elemMatch(
        $embeddedExpr->field('id')->equals(1)
    )
;

$qb->getQuery()->execute();
iskyd commented 6 years ago

@alcaeus that give me the right mongo query. "embedded":{"$elemMatch":{"id":1}}

alcaeus commented 6 years ago

Yeah, I expected it would. I’ll flag it as a bug and see if I can come up with something.

KarunaGovind commented 6 years ago

This is also true for non-embedded documents. Consider the following (bad) example:

{ "_id" : ObjectId("5a7b3a589e3af1e2df910e8f"), "id" : 12345, "name" : "Blah" }

Despite the bad naming (_id vs id), this is a valid Mongo document (resulting from a direct import from a different data source). A direct query on the collection will work as expected but this can't be mapped using the following:

/** @ODM\Id */
protected $id;

/** @ODM\Key(name="id") */
private $_externalId;

So that means we can't do:

$dm->getRepository($classname)->findOneBy(['name' => 'Blah', 'id' => $externalId])

But this works: $collection->findOne(['name' => 'Blah', 'id' => $externalId])

webdevilopers commented 5 years ago

I can confirm this behaviour described by @KarunaGovind!

I use XML mapping and have two fields "_id" (the MongoId Object") and an extra ID "id".

        <field fieldName="id" name="_id" id="true" type="string" />
        <field fieldName="employeeId" name="id" id="true" type="string" />

I explicitely set the names to "_id" and "id" but I always get the value of "_id". Maybe the name is normalized and has the "underscore" is removed which would result in "id" in both cases.

I think I once got a warning that "id" was marked as identifier and this should not be changed.

Would be great to have this fixed in the current version too.

alcaeus commented 5 years ago

@KarunaGovind, @webdevilopers

In ODM, both field name and name (aka database name) are immediately reserved and required to be unique. There may not be a field with a name equal to another fields fieldName attribute. The reason for this is because it's possible for you to create a query like this:

$builder->field('someReference')->references($someObject);

In this case, we wouldn't want you to have to know how exactly someReference is mapped (as ID, DBRef or ref). Also, you shouldn't have to know what identifier $someObject uses. So, we use mapping information as well as the someReference fieldName to get to this query object:

{
    "someReference.$id": ObjectId("123456..."),
}

This replicates behaviour from ORM, which is why it was built this way. Unfortunately, in your case this means that you'll have to find a different fieldName for your id field unless you can change the schema (which isn't always possible).

In the original example, this is a bit different: the query expression contains metadata for the original it was built for, which is why the id field is changed to _id. That's why changing the classMetadata instance in the expression object yields the correct result. One possible fix here is adding an optional $field argument to expr() which would allow you to get an expression based on the metadata of a nested field. I'll see if we can fix this in 1.2 or 1.3.

alcaeus commented 5 years ago

I'll see if we can fix this in 1.2 or 1.3.

TL;DR: no way for 1.x, no for 2.0. The version in the roadmap is still up-to-date.

My initial idea was to update the expr method in Query\Builder to take a field name and inject the target mapping for the field name into the newly created Expr object instead of the builder's metadata. While this can easily be done for straight-up field names, but when building queries for nested field names (think embedded1.anotherEmbedded.something), this gets complex quite quickly. The only place where we currently have this logic is DocumentPersister::prepareQueryElement, which I'd rather not touch at this time. In order to actually be able to take care of this in DocumentPersister, we'd also have to change the Expr class to not resolve nested expressions when they are used (which is currently done when using elemMatch, and, or, or any other operator that works on a nested query expression). This introduces a lot of complexity which I feel is not tested enough at this time. Instead, this should be a more in-depth refactoring of the persister logic along with a lot of testing to ensure we're not running into any regressions.

While I understand that it is only a workaround, I would still recommend to manually inject a separate ClassMetadata instance in any expo object until we've properly refactored this to take field names into account when preparing nested query expressions.

webdevilopers commented 4 years ago

We currently switch to 2.0. If I see this correctly than it is still no possible to map the Mongo internal ObjectId Field "_id" to a different property on our model than $id? We would prefer e.g. $surrogateId and use $id for our real Domain Model ID instead of e.g. employeeId.

webdevilopers commented 4 years ago

I will have to correct myself on the last comment. Indeed our solution on order to keep the domain model clean and stay with our internal "$id" property we can do this:

        <id field-name="surrogateId" />
        <field field-name="id" type="string" />
webdevilopers commented 3 years ago

Just stumbled upon this issue again. Any updates?

Refs: https://github.com/doctrine/mongodb-odm/issues/1987

webdevilopers commented 3 years ago

The code mentioned by @iskyd can be found in Doctrine\ODM\MongoDB\Mapping\ClassMetadata l. 1898:

        if (isset($mapping['id']) && $mapping['id'] === true) {
            $mapping['name']  = '_id';
            $this->identifier = $mapping['fieldName'];
            if (isset($mapping['strategy'])) {
                $this->generatorType = constant(self::class . '::GENERATOR_TYPE_' . strtoupper($mapping['strategy']));
            }
alcaeus commented 3 years ago

No updates to the issue. Having two fields share a name, even when one is the database name and the other is the property name in your domain model is not supported. The following mapping for example is invalid:

<id field-name="id" />
<field name="id" field-name="applicationId" type="string" />

Written as annotations, this would correspond to this:

/** @Id */
private $id;

/** @Field(name="id") */
private $applicationId;

The reason this conflicts is that the query builder no longer knows what to do with this builder expression:

$builder->field('id')->equals('value');

If the builder look at field names (meaning property names in your domain model), the resulting query would be:

{ _id: "value" }

However, the builder also supports passing database names, in which case the generated query would look like this:

{ id: "value" }

Since the query builder doesn't know what to do here, this kind of mapping should be avoided at all costs.

I'll note that the original issue wasn't about this, but rather that the query builder doesn't take field metadata into account when building expressions. In the original example, the id field in the elemMatch query operator uses the metadata from the parent document instead of from the embedded document, resulting in an invalid field name translation. That is a separate issue, but could be avoided by modifying the Query\Expr object after instantiation:

$elemMatchExpr = $qb->expr();
$elemMatchExpr->setClassMetadata($dm->getClassMetadata(Embedded::class); // Makes sure correct translation rules are used
$qb
    ->field('embedded')
    ->elemMatch(
        $elemMatchExpr
            ->field('id')
            ->equals(1)
    )
;

Not sure why I didn't think of this before, but that may solve the original problem @iskyd was reporting.

@webdevilopers if there is a different issue you're referring to, please create a new issue and I'll investigate it. Thanks!

webdevilopers commented 3 years ago

Thanks for the instant feedback @alcaeus ! I get your point.

Our use case is the following:

We design persistence-ignorant Domain Models. We add our own IDs as value objects. The MongoDB required _id in our collections is auto-added by MongoDB, we ignore it in our application. This way we can easily switch to a different database without having to care about keys and their naming.

For instance:

/**
 * @MongoDB\Document(collection="account_details", readOnly=true)
 */
class Account
{
    /**
     * @MongoDB\Id()
     * @internal
     */
    protected $id;

    /**
     * @MongoDB\Field(type="string")
     */
    protected $accountId;
}
/**
 * @MongoDB\Document(collection="business_profile_details", readOnly=true)
 */
class BusinessProfile
{
    /**
     * @MongoDB\Id()
     * @internal
     */
    protected $id;

    /**
     * @MongoDB\Field(type="string")
     */
    protected $businessProfileId;

    /**
     * MongoDB\ReferenceOne(targetDocument=Account::class, storeAs="id", name="accountId")
     */
    protected $account;

The issue is that we cannot link these to documents via SonataAdminBundle since it only seems to allow references via properties marked with @Id. This is indeed not related to this bundle per se.

alcaeus commented 3 years ago

I understand, but not entirely. Here's what I'd do:

/**
 * @MongoDB\Document(collection="account_details", readOnly=true)
 */
class Account
{
    /**
     * @MongoDB\Id(type="string", strategy="none")
     */
    protected $accountId;
}
/**
 * @MongoDB\Document(collection="business_profile_details", readOnly=true)
 */
class BusinessProfile
{
    /**
     * @MongoDB\Id(type="string", strategy="none")
     */
    protected $businessProfileId;

    /**
     * MongoDB\ReferenceOne(targetDocument=Account::class, storeAs="id", name="accountId")
     */
    protected $account;
}

This will lead to these documents:

{
    _id: 'abc'
}
{
    _id: 'def',
    accountId: 'abc'
}

I'm not sure I understand what switching databases has to do with this. Here's the same document that doubles as an ORM entity for any SQL-based database:

/**
 * @MongoDB\Document(collection="account_details", readOnly=true)
 * @ORM\Entity(readOnly=true)
 * @ORM\Table(name="account_details")
 */
class Account
{
    /**
     * @MongoDB\Id(type="string", strategy="none")
     * @ORM\Id
     * @ORM\Column(type="string")
     */
    protected $accountId;

    /**
     * @MongoDB\ReferenceMany(targetDocument=BusinessProfile::class, mappedBy="account")
     * @MongoDB\ManyToOne(targetDocument=BusinessProfile::class, mappedBy="account")
     */
    protected $businessProfiles;
}
/**
 * @MongoDB\Document(collection="business_profile_details", readOnly=true)
 * @ORM\Entity(readOnly=true)
 * @ORM\Table(name="business_profile_details")
 */
class BusinessProfile
{
    /**
     * @MongoDB\Id(type="string", strategy="none")
     * @ORM\Id
     * @ORM\Column(type="string")
     */
    protected $businessProfileId;

    /**
     * @MongoDB\ReferenceOne(targetDocument=Account::class, storeAs="id", name="accountId")
     * @MongoDB\ManyToOne(targetDocument=Account::class, inversedBy="businessProfiles")
     */
    protected $account;
}

But what about queries? One identifier is called _id in the database, the other one account_id or accountId depending on naming schema? Query builders cover this:

$managerRegistry
    ->getManagerForClass(Account::class)
    ->createQueryBuilder(Account::class)
    ->field('accountId')
    ->equals('abc')
;

I'm curious which part of your use-case I missed, because I've seen this work for projects. Maybe its due to SonataAdminBundle making false assumptions, in which case I'd happily help fixing this.

alcaeus commented 3 years ago

@iskyd @webdevilopers #2299 fixes the original issue where the wrong ClassMetadata instance was used. I'll revisit the problem of conflicting field names separately, but the PR should fix most issues when using an Expr on an association with its own metadata.