pimcore / data-importer

This extension adds a comprehensive import functionality to Pimcore Datahub.
Other
38 stars 56 forks source link

[Bug]: AsNumberic operator breaks "if source is empty" functionality #411

Open IronSean opened 3 months ago

IronSean commented 3 months ago

Expected behavior

If you have "if source is empty" selected in your datahub job and the input data is empty, a field will not be updated.

Actual behavior

If you process the empty field "As Numberic" (required to write to a numeric field) the empty value is converted to the value 0 which then overwrites that field when processed.

Steps to reproduce

Create datahub job. Create Mapping for source Attribute with transformation pipeline "As Numeric". Unselect "if source is Empty" in the Data Target Overwrite fields.

Run import, empty values will now overwrite 0 to that field.

IronSean commented 3 months ago

This seems related to the Ehancement at #364 and the bug in #359, but given the interaction with the "overwrite: if source is empty" configuration causes data loss.

Trying to rely on the "if source is empty" setting to allow partial updates instead wiped out multiple value fields: image image

kingjia90 commented 2 months ago

Update: nevermind, it's a numeric field

~Mhh is BasePrice a QuantityValue data type? If yes, could it be this QuantityValue change~ https://github.com/pimcore/data-importer/pull/361/files#diff-30298b0c96a44b2c48f3c55e8b8f4c27e8cf26f3177400c8c2ad9761b860cd7eR147 ? Could you please try to downgrade to 1.8.3 and see if it works as intended? Also maybe it's related to https://github.com/pimcore/data-importer/issues/348 And the fact that is considered empty if unitId is also empty, https://github.com/pimcore/pimcore/blob/ed360646bc658ca7e35dbc090ad92aabacf9c8a7/models/DataObject/ClassDefinition/Data/AbstractQuantityValue.php#L330

kingjia90 commented 2 months ago

Thank you for reporting!

It seems caused by this https://github.com/pimcore/pimcore/blob/40608fb9faff7aaa02d8014761036abb28843a66/models/DataObject/ClassDefinition/Data/Numeric.php#L428-L431 from https://github.com/pimcore/pimcore/commit/13115215602b7c1cad4c3f45bad5e35bf1f735d8

is_numeric(0.0) is true, isEmpty is the opposite, so false empty(0.0) is true

See also https://github.com/pimcore/data-importer/blob/5666dc2511386ecc6f3a3dc4961923fe36abd603/src/Mapping/DataTarget/Direct.php#L157

kingjia90 commented 2 months ago

Don't know how to fix it yet, because before it was return strlen($data) < 1; so it means that (int)0 or (float)0.0 never been considered as actually empty and the logic need to be kept as is.

By looking at the screenshot image is see it was 0, therefore it's not empty (as the PHP function definition), because for empty, we are considering when it's not filled with any value.

github-actions[bot] commented 2 months ago

Thanks a lot for reporting the issue. We did not consider the issue as "Pimcore:Priority", "Pimcore:ToDo" or "Pimcore:Backlog", so we're not going to work on that anytime soon. Please create a pull request to fix the issue if this is a bug report. We'll then review it as quickly as possible. If you're interested in contributing a feature, please contact us first here before creating a pull request. We'll then decide whether we'd accept it or not. Thanks for your understanding.

IronSean commented 1 month ago

Hi @kingjia90 thanks for responding.

What is happening in the screenshot is this:

The source data is empty string: '' But if you want to write it to a Numeric field, you have to cast it to a number first with As Numeric otherwise the destination field is not selectable as a valid field to import. But when you cast '' with As Numeric it parses it as 0 because of the behaviour of floatval() But now the "is empty" checks see's 0 as non-empty as you stated.

I've solved this in a project where this was a blocker by overriding the Direct DataTarget class to treat 0 as empty for the purposes of the writeIfSourceIsEmpty check because for this use case we also wanted 0 to not override a non-zero value:

class DirectDataTargetOverride extends Direct
{
    protected function checkAssignData($newData, $valueContainer, $getter): bool
    {
        $result = parent::checkAssignData($newData, $valueContainer, $getter);

        if ($this->writeIfSourceIsEmpty === false && is_numeric($newData) && $newData == 0)
            return false;

        return $result;
    }
}

I believe I also tried overriding the AsNumeric behaviour to return null instead of 0 if the input was not a valid number. But this caused issues with either the preview values or the execution of the row depending now how I did it.

The ideal behaviour I think is that an empty source string would be able to convert to null, then the datahub could use null properly with the "if source is empty" check to not overwrite in that scenario. But I haven't yet figured out how to implement that.

mohammed-mahmoud-scandi commented 1 month ago

how you were able to overwrite the direct class does anything else required? here's my implementation

servicex.yaml


    App\Service\DirectDataTargetOverride:
        public: true

    Pimcore\Bundle\DataImporterBundle\Mapping\DataTarget\Direct:
        public: true
        class: App\Service\DirectDataTargetOverride

class DirectDataTargetOverride.php

  <?php
namespace App\Service;

use Monolog\Logger;
use Pimcore\Bundle\ApplicationLoggerBundle\ApplicationLogger;
use Pimcore\Bundle\DataImporterBundle\Exception\InvalidConfigurationException;
use Pimcore\Model\DataObject;
use Pimcore\Bundle\DataImporterBundle\Mapping\DataTarget\Direct;
use Pimcore\Model\Element\ElementInterface;

class DirectDataTargetOverride extends Direct
{
    protected function checkAssignData($newData, $valueContainer, $getter): bool
    {
        $result = parent::checkAssignData($newData, $valueContainer, $getter);

        if ($this->writeIfSourceIsEmpty === false && is_numeric($newData) && $newData == 0)
            return false;

        return $result;
    }
}

full sevices.xml file

parameters:
    secret: Hvf1D+DKTkIUwXp/NUPgAm/BfJ5ucaA5tBnIDNUZZb4=

    # customize the full path to external executables
    # normally they are auto-detected by `which program` or auto-discovered in the configured path in
    # System Settings -> General -> Additional $PATH variable
    # but in general it's a good idea to have your programs in your $PATH environment variable (system wide)

    #pimcore_executable_composer: php /opt/vendor/bin/composer.phar
    #pimcore_executable_html2text: /usr/local/html2text/bin/html2text
    #pimcore_executable_soffice: /opt/libreoffice/bin/soffice
    #pimcore_executable_gs: /opt/ghostscript/bin/gs
    #pimcore_executable_pdftotext: /opt/tools/pdftotext
    #pimcore_executable_xvfb-run: /opt/tools/xvfb-run
    #pimcore_executable_pngcrush: /opt/tools/pngcrush
    #pimcore_executable_zopflipng: /opt/tools/zopflipng
    #pimcore_executable_pngout: /opt/tools/pngout
    #pimcore_executable_advpng: /opt/tools/advpng
    #pimcore_executable_cjpeg: /opt/tools/cjpeg
    #pimcore_executable_jpegoptim: /opt/tools/jpegoptim
    #pimcore_executable_php: /usr/local/custom-php/bin/php
    #pimcore_executable_nice: /opt/tools/nice
    #pimcore_executable_nohup: /opt/tools/nohup
    #pimcore_executable_ffmpeg: /opt/tools/ffmpeg
    #pimcore_executable_exiftool: /opt/tools/exiftool
    #pimcore_executable_wkhtmltoimage: /usr/local/bin/wkhtmltoimage
    #pimcore_executable_timeout: /usr/bin/timeout
    #pimcore_executable_facedetect: /usr/bin/facedetect
    # graphviz
    #pimcore_executable_dot: /usr/bin/dot

services:
    # default configuration for services in *this* file
    _defaults:
        # automatically injects dependencies in your services
        autowire: true
        # automatically registers your services as commands, event subscribers, etc.
        autoconfigure: true
        # this means you cannot fetch services directly from the container via $container->get()
        # if you need to do this, you can override this setting on individual services
        public: false
    #
    # CONTROLLERS
    #

    # auto-register all controllers as services
    App\Controller\:
        resource: '../src/Controller'
        public: true
        tags: [ 'controller.service_arguments' ]

    #
    # COMMANDS
    #

    # auto-register all commands as services
    App\Command\:
        resource: '../src/Command/*'
        tags: [ 'console.command' ]

    App\Service\LocaleService:
        public: true

    Pimcore\Localization\LocaleServiceInterface:
        public: true
        class: App\Service\LocaleService

    App\Service\DirectDataTargetOverride:
        public: true

    Pimcore\Bundle\DataImporterBundle\Mapping\DataTarget\Direct:
        public: true
        class: App\Service\DirectDataTargetOverride
    # Example custom templating helper
    # App\Templating\Helper\Example:
    #     # templating helpers need to be public as they
    #     # are fetched from the container on demand
    #     public: true
    #     tags:
    #         - { name: templating.helper, alias: fooBar }

    # Example event listener for objects
    # App\EventListener\TestListener:
    #     tags:
    #         - { name: kernel.event_listener, event: pimcore.dataobject.preUpdate, method: onObjectPreUpdate }

    App\EventListener\DataObjectEventListener:
        tags:
            - { name: kernel.event_listener, event: pimcore.dataobject.postUpdate, method: onObjectPostUpdate }
    App\EventListener\DataObjectPreSaveListener:
        tags:
            - { name: kernel.event_listener, event: 'Pimcore\Bundle\DataImporterBundle\Event\DataObject\PreSaveEvent', method: onPreSave }
IronSean commented 3 weeks ago

My services.yaml file looked like this, to register it with the same tag as the original:

    Pimcore\Bundle\DataImporterBundle\Mapping\DataTarget\Direct:
        class: App\Service\Overrides\DirectDataTargetOverride
        tags:
            - { name: "pimcore.datahub.data_importer.data_target", type: "direct" }