magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.47k stars 9.28k forks source link

Product images are being duplicated on import #21885

Closed simonworkhouse closed 3 years ago

simonworkhouse commented 5 years ago

When importing product data the product images are being duplicated for each import run.

Preconditions

  1. Magento 2.4-develop
  2. PHP 7.3
  3. MySQL 5.7.25

Steps to reproduce

  1. composer create-project --repository-url=https://repo.magento.com/ magento/project-community-edition <install dir>
  2. cd <install dir>
  3. ./bin/magento setup:install --backend-frontname=? --db-host=? --db-name=? --db-user=? --db-password=? --base-url=? --admin-user=? --admin-password=? --admin-email=? --admin-firstname=? --admin-lastname=?
  4. mkdir var/import-images
  5. Add images test-1.jpg, test-2.jpg, test-3.jpg and test-4.jpg inside the path <install dir>/var/import-images. Any valid jpeg will do.
  6. Import the product CSV data below twice
"sku","product_online","website_id","store_view_code","attribute_set_code","product_type","categories","name","description","short_description","visibility","price","special_price","special_price_from_date","special_price_to_date","url_key","meta_title","meta_keywords","meta_description","display_product_options_in","msrp_display_actual_price_type","additional_attributes","qty","out_of_stock_qty","use_config_min_qty","is_qty_decimal","allow_backorders","use_config_backorders","min_cart_qty","use_config_min_sale_qty","max_cart_qty","use_config_max_sale_qty","is_in_stock","use_config_notify_stock_qty","manage_stock","use_config_manage_stock","use_config_qty_increments","qty_increments","use_config_enable_qty_inc","enable_qty_increments","is_decimal_divided","deferred_stock_update","use_config_deferred_stock_update","related_skus","crosssell_skus","upsell_skus","custom_options","bundle_price_type","bundle_sku_type","bundle_price_view","bundle_weight_type","bundle_values","associated_skus","base_image","base_image_label","small_image","small_image_label","thumbnail_image","thumbnail_image_label","swatch_image","swatch_image_label","additional_images","additional_image_labels","configurable_variations"
"some-test-product",1,1,,"Default","simple","Default Category/Some Category/Some Test Product","Some test product","This is just some test product",,"Catalog, Search",2999,,,,"some-test-product","Some test product",,,"Block after Info Column","Use config",,100,0,1,0,0,1,1,0,0,1,1,1,0,1,1,0,1,0,0,0,1,,,,,,,,,,,"test-1.jpg",,"test-1.jpg",,"test-1.jpg",,"test-1.jpg",,"test-3.jpg",,
"another-test-product",1,1,,"Default","simple","Default Category/Some Category/Some Test Product","Some test product","This is just some test product",,"Catalog, Search",2999,,,,"another-test-product","Another test product",,,"Block after Info Column","Use config",,100,0,1,0,0,1,1,0,0,1,1,1,0,1,1,0,1,0,0,0,1,,,,,,,,,,,"test-2.jpg",,"test-2.jpg",,"test-2.jpg",,"test-2.jpg",,"test-4.jpg",,

Import settings

Expected result

Products end up with each product image only attached once.

Actual result

Products end up with all product images duplicated.

2020-09-23_16-35

magento-engcom-team commented 5 years ago

Hi @simonworkhouse. Thank you for your report. To help us process this issue please make sure that you provided the following information:

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento-engcom-team give me 2.3-develop instance - upcoming 2.3.x release

For more details, please, review the Magento Contributor Assistant documentation.

@simonworkhouse do you confirm that you was able to reproduce the issue on vanilla Magento instance following steps to reproduce?

magento-engcom-team commented 5 years ago

Hi @engcom-backlog-nazar. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

dipti2jcommerce commented 5 years ago

@magento-engcom-team give me 2.3-develop instance

magento-engcom-team commented 5 years ago

Hi @dipti2jcommerce. Thank you for your request. I'm working on Magento 2.3-develop instance for you

magento-engcom-team commented 5 years ago

Hi @dipti2jcommerce, here is your Magento instance. Admin access: https://i-21885-2-3-develop.instances.magento-community.engineering/admin Login: admin Password: 123123q Instance will be terminated in up to 3 hours.

magento-engcom-team commented 5 years ago

:white_check_mark: Confirmed by @engcom-backlog-nazar Thank you for verifying the issue. Based on the provided information internal tickets MAGETWO-98840 were created

Issue Available: @engcom-backlog-nazar, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

tnsezer commented 5 years ago

I would like to share some point about the issue. It happens in Magento\CatalogImportExport\Model\Import\Product.php line 1772. If add a variable for exists images in mapper like _media_is_disabled, it should be fix it but the code try to match it with path. That why it doesn't work.

dmanners commented 5 years ago

Thank you for the information @tnsezer

erfanimani commented 5 years ago

Duplicate of #14398

See PR here: https://github.com/magento/magento2/pull/21146

erfanimani commented 5 years ago

Actually, there's a better PR here, based on mine but it also deals with image deletion: https://github.com/magento/magento2/pull/21855

dmanners commented 5 years ago

@erfanimani thanks for the update. Are the issues duplicated or the PRs?

erfanimani commented 5 years ago

@dmanners Both actually — this issue is a duplicate of #14398

But then there are two open PRs (I referenced them both in case someone wants to create a patch for it):

soundararajanm1990 commented 5 years ago

I am working on this at #dmcdindia1

magento-engcom-team commented 5 years ago

@soundararajanm1990 thank you for joining. Please accept team invitation here and self-assign the issue.

m2-assistant[bot] commented 5 years ago

Hi @soundararajanm1990. Thank you for working on this issue. Looks like this issue is already verified and confirmed. But if you want to validate it one more time, please, go though the following instruction:

yogeshkhasturi commented 5 years ago

Hi @soundararajanm1990,

Have you fixed this issue? its also coming when trying to Duplicate product.

Thanks Yogesh

duckchip commented 5 years ago

what's the status on this issue?

soundararajanm1990 commented 5 years ago

We can't solve this issue. Still now pending.

Beagon commented 5 years ago

We are encountering this too and since the import is ran daily this really clutters up the media gallery. Is there any workaround anyone can recommend?

kevinvuillemin commented 5 years ago

I have same problem, magento2.3.2 duplicated images after import product csv.

Did you find a solution ?

erfanimani commented 5 years ago

Two workarounds exist, scroll up to see the relevant pull requests/commits. You'll need a developer to create and apply Composer patches for your Magento version. Seems Magento is too busy to review/merge and fix the bug themselves though..

Beagon commented 5 years ago

@erfanimani Yes, we indeed used a combination of the solutions (with some minor edits). Really odd that Magento doesn't look at the fixes because the importer is a vital part of most webshops.

kevinvuillemin commented 5 years ago

Thank you @erfanimani

With a combination of the solutions, the import works very well.

mackieee commented 5 years ago

+1 Hoping to see this merged in soon, big help this :)

skapin commented 5 years ago

Wow same trouble for us... This is awsome magento let this kind of BUG in a 2.3.2 version released worldwild...

skapin commented 5 years ago

@kevinvuillemin can you share the working solution ? I'm strugeling with all the PR closed and open speaking around it. Thank you in advance

kevinvuillemin commented 5 years ago

@skapin, of course ! how do you want me to share the file?

skapin commented 5 years ago

@kevinvuillemin you can send me a mail to skapinthefourb gmailcom address for example, or create a repository

As you want. Thank you so much :)

kevinvuillemin commented 5 years ago

@skapin that's good for you ?

https://gist.github.com/kevinvuillemin/802cc5e4b476e73c1c5838b143bd6a51

skapin commented 5 years ago

@kevinvuillemin It works on the first try. Copy/paste instead of Product.php and voila ! Magento version : 2.3.2

kevinvuillemin commented 5 years ago

Et voila ! Au plaisir ;-)

soundararajanm1990 commented 5 years ago

@kevinvuillemin Your code not working in Magento version : 2.3.1

kevinvuillemin commented 5 years ago

Working with 2.3.2 for me, are you sure ?

soundararajanm1990 commented 5 years ago

yes we used 2.3.1 not working

skapin commented 5 years ago

@soundararajanm1990 I think they have changed the Product.php class for the release of Magento 2.3.2.

The easier is to upgrade (maintenance & safety)

mackieee commented 5 years ago

@kevinvuillemin They moved the issue #14398 onto a new phase - does this mean your PR is ready for Testing on the next Dev version?

ahmadwaliesipick commented 4 years ago

any update on this issue?

Yonn-Trimoreau commented 4 years ago

Below is a more permanent solution.

Be careful, this will trash previous duplicates if there were any.

Make a preference on Magento\CatalogImportExport\Model\Import\Product and copy/paste this code in it:


use const DIRECTORY_SEPARATOR;

class Product extends \Magento\CatalogImportExport\Model\Import\Product
{
    private $bunch;
    private $imagesToRemove;

    protected function getExistingImages($bunch)
    {
        $this->bunch = $bunch;

        return parent::getExistingImages($bunch);
    }

    public function addImageHashes(&$imagesBySku)
    {
        parent::addImageHashes($imagesBySku);

        // get existing images (from db)
        $existingImages = [];
        foreach ($imagesBySku as $sku => $images) {
            foreach ($images as $path => $imageInfo) {
                if (!isset($existingImages[$imageInfo['hash']])) {
                    $existingImages[$imageInfo['hash']] = [];
                }

                $existingImages[$imageInfo['hash']][] = [
                    'value_id' => $imageInfo['value_id'],
                    'path' => $path
                ];
            }
        }

        // get imported images (from file)
        $importDir = $this->_mediaDirectory->getAbsolutePath($this->getImportDir());

        $importedImages = [];
        foreach ($this->bunch as $rowData) {
            foreach ($this->getImagesFromRow($rowData)[0] as $imagesFromRow) {
                $imageNames = explode($this->getMultipleValueSeparator(), $imagesFromRow[0]);

                $imageHashes = array_flip(array_map(function($imageName) use ($importDir) {
                    $filename = $importDir . DIRECTORY_SEPARATOR . $imageName;

                    return $this->_mediaDirectory->isReadable($filename) ? md5_file($filename) : '';
                }, $imageNames));

                $importedImages = array_merge($importedImages, $imageHashes);
            }
        }

        // guess images to remove
        $this->imagesToRemove = array_diff_key($existingImages, $importedImages);
    }

    protected function _saveMediaGallery(array $mediaGalleryData)
    {
        // remove duplicate images
        $valueIds = [];
        if (!empty($this->imagesToRemove)) {
            // from disk
            foreach ($this->imagesToRemove as $imagesToRemove) {
                foreach ($imagesToRemove as $imageToRemove) {
                    $imagePath = 'pub/media/catalog/product' . $imageToRemove['path'];

                    if ($this->_mediaDirectory->isExist($imagePath)) {
                        $this->_mediaDirectory->delete($imagePath);
                    }

                    $valueIds[] = $imageToRemove['value_id'];
                }
            }

            // from database
            $this->getConnection()->delete(
                $this->getConnection()->getTableName('catalog_product_entity_media_gallery'),
                $this->getConnection()->quoteInto('value_id IN (?)', $valueIds)
            );
        }

        return parent::_saveMediaGallery($mediaGalleryData);
    }
}
mackieee commented 4 years ago

Well sadly this looks like it missed the 2.3.3 ship :(

kevinvuillemin commented 4 years ago

Realy ? That’s not cool.

But This fix working ? :

https://gist.github.com/kevinvuillemin/802cc5e4b476e73c1c5838b143bd6a51

simonworkhouse commented 4 years ago

Normally I would be surprised that it has been so long without a resolution, but I have come to expect this from Magento. Anyway, I have had to resolve this issue for a project recently and it appears that the offending commit was https://github.com/magento/magento2/commit/7803eed11f8bf490bf8905ab0577091c7e782156

I worked around this by adding a preference for a class that extends Magento\CatalogImportExport\Model\Import\Product and just overrides the uploadMediaFiles(...) function to force the $renameFileOff variable to be true.

This addresses the issue in our specific use case, but it does not address the underlying issues with the way that images are managed in Magento. Additionally, it will also undo whatever "fix" that commit was attempting to do.

jg-development commented 4 years ago

@simonworkhouse Can you please share the code with me?

jalung commented 4 years ago

Product-revised-version.txt

version 2.3.3 Atttached is the patch that is used to fix the issue in our use case. Magento import is expecting images to have the dynamic paths it created on export. However, the CSV is hand crafted and the image names are plain; "blablabla.jpg" as opposed to "/b/l/a/blablabla_60.jpg". This patch works repeatedly because the CSV file is from the same feed. The patch basically strips off the "/b/l/a/" and "_60" and makes a match to the existing image(s) for that product.

duckchip commented 4 years ago

Does anyone have a fix for 2.3.3 with an including fix to remove duplicates? Our server just exploded to 55gb on images because of the duplication during the import..

duckchip commented 4 years ago

the commit that @simonworkhouse referred to, was the suicide bomber in our case as well. Patching that to the previous implementation fixed it. In my opinion this really should be reverted.

SlowFamily commented 4 years ago

@jalung 's patch worked for me, except for that Magento 2.3.3 (if not earlier versions) renames images all lowercase, so the image name preg_match needs to be case insensitive. Patch with that change here: Product-revised-version.txt

jalung commented 4 years ago

Product-revised-version.txt Fixed case where CSV row has new and existing images.

guido7171 commented 4 years ago

Also missed the 2.3.4 ship. @jalung 's patch is working fine for me in 2.3.4. Still needs more testing, for my case, where images are updated by sku, but have same name.jpg.

hostep commented 4 years ago

Possible fix in https://github.com/magento/magento2/pull/26713

hostep commented 4 years ago

I've seen some change in the codebase of Magento 2.3.5 which might fix this issue. Would be great if somebody could verify this after 2.3.5 is officially released on 28 April.