floriansemm / SolrBundle

Solr-Integration into Symfony and Doctrine2
http://floriansemm.github.io/SolrBundle
MIT License

Multiple Collections in Entity (Nested Documents) #192

Closed barthy-koeln closed 4 years ago

barthy-koeln commented 5 years ago

When mapping the collection fields of an entity (example below), DocumentFactory can reach a point where the document's _childDocuments_ field already exists and holds the child documents of one collection. When a second collection is then added to _childDocuments_, the array passed to Document::addField(…) is appended as a single element of the existing array, which produces nested arrays instead of a flat array of child documents.

Example Entity:

/**
 * @ORM\Entity()
 * @Solr\Document()
 */
class Profile
{

    /**
     * @ORM\Id
     * @ORM\Column(type="guid")
     * @ORM\GeneratedValue(strategy="UUID")
     *
     * @Solr\Id
     */
    protected $id;

    /**
     * @ORM\OneToMany(
     *     targetEntity="Entity\Skill",
     *     mappedBy="profileSkill",
     *     cascade={"persist"}
     * )
     *
     * @Solr\Field(nestedClass="Entity\Skill")
     */
    private $skills;

    /**
     * @ORM\OneToMany(
     *     targetEntity="Entity\Experience",
     *     mappedBy="profile",
     *     cascade={"persist"}
     * )
     *
     * @Solr\Field(nestedClass="Entity\Experience")
     */
    private $experiences;

    public function __construct()
    {
        $this->skills = new ArrayCollection();
        $this->experiences = new ArrayCollection();
    }
}

Expected Document Fields:

[
  'id' => '550e8400-e29b-11d4-a716-446655440000',
  '_childDocuments_' => [
    0 => [
      // first document from first collection
    ],
    1 => [
      // second document from first collection
    ],
    2 => [
      // first document from second collection
    ],
    3 => [
      // second document from second collection
    ]
  ]
]

Actual Document Fields:

[
  'id' => '550e8400-e29b-11d4-a716-446655440000',
  '_childDocuments_' => [
    0 => [
      // first document from first collection
    ],
    1 => [
      // second document from first collection
    ],
    2 => [
      0 => [
        // first document from second collection
      ],
      1 => [
        // second document from second collection
      ],
    ]
  ]
]

Because the last entry sent to Solr looks like a child document whose field names are the array indices ('0' and '1' in this example), Solr responds with a 400 error saying "unknown field '0'".
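The nesting can be reproduced without Solr. The following is a minimal sketch, assuming that addField() stores the first value for a key as-is and appends every later value as one element. `SketchDocument` is a hypothetical stand-in written for this illustration, not Solarium's class, and the ids are made up.

```php
<?php

// Simplified stand-in for an update document. NOT Solarium's class; it only
// imitates the append behaviour described in this issue.
final class SketchDocument
{
    /** @var array<string, mixed> */
    private $fields = [];

    public function addField(string $key, $value): self
    {
        if (!array_key_exists($key, $this->fields)) {
            // The first value for a key is stored as-is.
            $this->fields[$key] = $value;
        } else {
            if (!is_array($this->fields[$key])) {
                $this->fields[$key] = [$this->fields[$key]];
            }
            // Later values are appended as ONE element, even if they are lists.
            $this->fields[$key][] = $value;
        }

        return $this;
    }

    public function getFields(): array
    {
        return $this->fields;
    }
}

// Made-up child documents for two collections.
$skills = [['id' => 'skill-1'], ['id' => 'skill-2']];
$experiences = [['id' => 'exp-1'], ['id' => 'exp-2']];

$document = new SketchDocument();
$document->addField('_childDocuments_', $skills);
$document->addField('_childDocuments_', $experiences);

$children = $document->getFields()['_childDocuments_'];

// Three entries instead of four: the third entry is the whole $experiences
// list, so its keys are the indices 0 and 1 rather than field names.
var_dump(count($children), array_keys($children[2]));
```

Under that assumption, the third entry is what ends up looking like a document with fields named '0' and '1'.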

Even though this looks like a Solarium bug, this bundle can easily work around it with the following version of DocumentFactory::mapCollectionField():

    /**
     * @param Document $document           document that receives the child documents
     * @param Field    $field
     * @param object   $sourceTargetObject
     *
     * @return array
     *
     * @throws SolrMappingException if no getter method was found
     */
    private function mapCollectionField($document, Field $field, $sourceTargetObject)
    {
        /** @var Collection $collection */
        $collection = $field->getValue();
        $getter = $field->getGetterName();

        if ($getter != '') {
            $collection = $this->callGetterMethod($sourceTargetObject, $getter);

            $collection = array_filter($collection, function ($value) {
                return $value !== null;
            });
        }

        $values = [];
        if (count($collection)) {
            foreach ($collection as $relatedObj) {
                if (is_object($relatedObj)) {
                    $values[] = $this->objectToDocument($relatedObj);
                } else {
                    $values[] = $relatedObj;
                }
            }

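            $fields = $document->getFields();
            if (isset($fields['_childDocuments_']) && is_array($fields['_childDocuments_'])) {
                // A child list already exists: append each document on its own
                // so the list stays flat.
                foreach ($values as $value) {
                    $document->addField('_childDocuments_', $value, $field->getBoost());
                }
            } else {
                // First nested field: the whole list becomes the field value.
                $document->addField('_childDocuments_', $values, $field->getBoost());
            }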
        }

        return $values;
    }
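The flat-merge idea behind the function above can also be shown on plain arrays, independent of Solarium. This is only a sketch: `appendChildDocuments()` is a hypothetical helper that does not exist in the bundle, and the child ids are invented for the example.

```php
<?php

/**
 * Hypothetical helper: returns $fields with $children appended to
 * '_childDocuments_' so that the list stays flat.
 */
function appendChildDocuments(array $fields, array $children): array
{
    if ($children === []) {
        // Nothing to add; leave the fields untouched.
        return $fields;
    }

    $existing = $fields['_childDocuments_'] ?? [];

    // array_values() reindexes both lists; array_merge() then concatenates
    // them instead of nesting one list inside the other.
    $fields['_childDocuments_'] = array_merge(
        array_values($existing),
        array_values($children)
    );

    return $fields;
}

$fields = ['id' => '550e8400-e29b-11d4-a716-446655440000'];
$fields = appendChildDocuments($fields, [['id' => 'skill-1'], ['id' => 'skill-2']]);
$fields = appendChildDocuments($fields, [['id' => 'exp-1'], ['id' => 'exp-2']]);

// All four child documents sit side by side in one list.
print_r(array_column($fields['_childDocuments_'], 'id'));
```

Whether the merge happens before calling addField() or by calling addField() once per child, the goal is the same: every element of _childDocuments_ is a document, never a list of documents.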

What do you think: should this be fixed on Solarium's side, or should this bundle take care of building a flat array of child documents?

I can send a pull request if needed.

barthy-koeln commented 5 years ago

I just realised that the same thing happens if the second field is a single child document rather than a collection. To fix this, the same approach has to be applied to single nested objects in DocumentFactory::createDocument():

[…]
if (is_object($fieldValue) && $field->nestedClass) { // index single object as nested child-document
    $fields = $document->getFields();
    $value = $this->objectToDocument($fieldValue);
    // Wrap the document in a list only when no child list exists yet;
    // otherwise addField() appends the bare document to the existing list.
    if (!isset($fields['_childDocuments_'])) {
        $value = [$value];
    }

    $document->addField('_childDocuments_', $value, $field->getBoost());
}
[…]
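To see how single nested objects and collections interact, here is a sketch that mixes both. It assumes that a single document should be wrapped in a list only when _childDocuments_ has not been set yet, and that addField() appends later values as one element each. `SketchDocument`, `addSingleChild()` and `addChildCollection()` are hypothetical names for this illustration, not part of the bundle or of Solarium.

```php
<?php

// Simplified stand-in, NOT Solarium's class: the first value for a key is
// stored as-is, later values are appended as one element each.
final class SketchDocument
{
    /** @var array<string, mixed> */
    private $fields = [];

    public function addField(string $key, $value): self
    {
        if (!array_key_exists($key, $this->fields)) {
            $this->fields[$key] = $value;
        } else {
            if (!is_array($this->fields[$key])) {
                $this->fields[$key] = [$this->fields[$key]];
            }
            $this->fields[$key][] = $value;
        }

        return $this;
    }

    public function getFields(): array
    {
        return $this->fields;
    }
}

// Hypothetical helper for a single nested object.
function addSingleChild(SketchDocument $document, array $child): void
{
    $fields = $document->getFields();
    // Wrap only when no list exists yet; otherwise append the bare document.
    $value = isset($fields['_childDocuments_']) ? $child : [$child];
    $document->addField('_childDocuments_', $value);
}

// Hypothetical helper for a collection of nested objects.
function addChildCollection(SketchDocument $document, array $children): void
{
    $fields = $document->getFields();
    if (isset($fields['_childDocuments_']) && is_array($fields['_childDocuments_'])) {
        // A list exists: append each child on its own to keep the list flat.
        foreach ($children as $child) {
            $document->addField('_childDocuments_', $child);
        }
    } else {
        // No list yet: the collection becomes the list.
        $document->addField('_childDocuments_', $children);
    }
}

$document = new SketchDocument();
addSingleChild($document, ['id' => 'address-1']);
addChildCollection($document, [['id' => 'skill-1'], ['id' => 'skill-2']]);
addSingleChild($document, ['id' => 'employer-1']);

// One flat list, in insertion order.
$ids = array_column($document->getFields()['_childDocuments_'], 'id');
print_r($ids);
```

With these two branches the order of the nested fields in the entity no longer matters: the result is one flat list whether a single object or a collection comes first.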