BrandOriented / PimcoreCiHubConnector

This bundle adds full integration between Pimcore and CI-HUB Connector. Now you can add, delete and download all assets, lock and unlock, get all versions for specified asset and much more.
https://brandoriented.io/
Other
5 stars 6 forks source link

datahub:index:rebuild configName indexes unrelated objects #57

Open NiklasBr opened 4 months ago

NiklasBr commented 4 months ago

Pimcore version(s) affected

11.2.3

Description

When running the command bin/console datahub:index:rebuild mycomfigname I expect it to re-index DataObjets only in the given config.

Instead it also indexes unrelated DataObjects.

How to reproduce

Example config name:

pimcore_data_hub:
    configurations:
        ci_hub_1:
            general:
                active: 'on'
                type: ciHub
                name: ci_hub_1
                description: ''
                modificationDate: 1716993974
                path: null
                createDate: 1715938788
            schema:
                assets:
                    enabled: 'on'
                    allowOriginalImage: 'on'
                    thumbnails:
                        - card
                        - detail_print
                dataObjectClasses:
                    -
                        id: Product
                        name: Product
                        columnConfig:
                            id:
                                name: id
                                position: 1
                                hidden: false
                                fieldConfig:
                                    key: id
                                    label: ID
                                    type: system
                            articleNumber:
                                name: articleNumber
                                position: 2
                                hidden: false
                                fieldConfig:
                                    key: articleNumber
                                    label: 'Article number (articleNumber)'
                                    type: input
                            GTIN:
                                name: GTIN
                                position: 3
                                hidden: false
                                fieldConfig:
                                    key: GTIN
                                    label: 'GTIN (13 tall) (GTIN)'
                                    type: input
                            productNumber:
                                name: productNumber
                                position: 4
                                hidden: false
                                fieldConfig:
                                    key: productNumber
                                    label: 'Product number (productNumber)'
                                    type: input
                            productDescription:
                                name: productDescription
                                position: 5
                                hidden: false
                                fieldConfig:
                                    key: productDescription
                                    label: 'Product name (productDescription)'
                                    type: input
                            description:
                                name: description
                                position: 6
                                hidden: false
                                fieldConfig:
                                    key: description
                                    label: 'Product description'
                                    type: wysiwyg
                            urlSlug:
                                name: urlSlug
                                position: 7
                                hidden: false
                                fieldConfig:
                                    key: urlSlug
                                    label: 'URL Slug (urlSlug)'
                                    type: urlSlug
                            available:
                                name: available
                                position: 8
                                hidden: false
                                fieldConfig:
                                    key: available
                                    label: 'Available in market'
                                    type: checkbox
                        language: en
            labelSettings:
                -
                    id: system.id
                -
                    id: system.key
                -
                    id: system.fullPath
                -
                    id: system.parentId
                -
                    id: system.type
                -
                    id: system.subtype
                -
                    id: system.hasChildren
                -
                    id: system.creationDate
                -
                    id: system.modificationDate
                -
                    id: system.locked
                -
                    id: system.versionCount
                -
                    id: system.currentVersion
                -
                    id: system.checksum
                -
                    id: system.mimeType
                -
                    id: system.fileSize
                -
                    id: binaryData.original
                -
                    id: binaryData.card
                -
                    id: binaryData.detail_print
                -
                    id: binaryData.galleryThumbnail
                -
                    id: dimensionData.width
                -
                    id: dimensionData.height
            workspaces: {  }
            permissions:
                user: {  }
                role: {  }

Then run bin/console datahub:index:rebuild ci_hub_1

Possible Solution

No response

Additional Context

No response

github-actions[bot] commented 4 months ago

Thank you for reporting this problem!

This is an open source project, and we rely on the community to help us diagnose and fix issues, as it is not possible to investigate and fix every issue reported to us via GitHub.

If possible, please create a pull request that fixes the problem you describe, along with appropriate tests. All pull requests will be promptly reviewed by the BrandOriented team.

Thank you very much!

labudzinski commented 4 months ago

The method used to create the ES index has been changed to be more explicit. Fixed deficiencies in code.

NiklasBr commented 3 months ago

Seems fixed now.

NiklasBr commented 3 months ago

Sorry but it is still an issue.

Example:

Installation has about 30 000 assets and 14 000 products (variants and objects). Yet it still indexes over 90 000 elements (which appears to match all dataobjects and all assets in total).

NiklasBr commented 3 months ago

@labudzinski I think I see the issue here:

https://github.com/BrandOriented/PimcoreCiHubConnector/blob/a8dc881b39c1297b8e8c6dfd710f21cbc9257790/src/Messenger/Handler/RebuildIndexElementMessageHandler.php#L60-L62

https://github.com/BrandOriented/PimcoreCiHubConnector/blob/a8dc881b39c1297b8e8c6dfd710f21cbc9257790/src/Messenger/Handler/RebuildIndexElementMessageHandler.php#L73

These lines re-indexes all objects in the objects table and does not take into consideration that sometimes a specific set of classes only should be indexed.

labudzinski commented 3 months ago

In the IndexPersistenceService class in the update method: https://github.com/BrandOriented/PimcoreCiHubConnector/blob/d00a8e8a3fa0b87658e0c29ebf839aefe0e6f702/src/Elasticsearch/Index/IndexPersistenceService.php#L321 https://github.com/BrandOriented/PimcoreCiHubConnector/blob/d00a8e8a3fa0b87658e0c29ebf839aefe0e6f702/src/Elasticsearch/Index/IndexPersistenceService.php#L322 We check whether the class is declared in the configuration.

NiklasBr commented 3 months ago

The lines I show above cause a massive unnecessary congestion in the messenger queue by queuing elements which should not be indexed. We can improve performance and reduce strain on the logging system a lot by not doing that in the first place.