magento-engcom / import-export-improvements

Open Software License 3.0

Images imported via URL have crazy file paths #57

Closed piotrekkaminski closed 6 years ago

piotrekkaminski commented 6 years ago

From @hgpit on June 28, 2016 9:22

Steps to reproduce

  1. Install Magento v2.1 EE.
  2. Import a product CSV using URLs for images (e.g. base_image: "http://datastore.mydomain.local/imageserver/40354768_big.jpg").

Expected result

  1. File is stored in the Magento file system as [magento]/pub/media/catalog/product/4/0/40354768_big.jpg.

Actual result

  1. File is stored in the Magento file system as [magento]/pub/media/catalog/product/h/t/httpdatastore.mydomain.localimageserver40354768_big.jpg

When importing hundreds of thousands of products/images you end up with literally hundreds of thousands of files just under one folder: [magento]/pub/media/catalog/product/h/t/. So the file distribution logic of directory hierarchy creation based on the first two characters of the filename is futile/wasted - as the files will always be stored under /h/t/.
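
The effect described above can be reproduced with a minimal sketch. It is an illustration written for this report, not Magento's actual import code: the function names are hypothetical, and the sanitizing rule (drop every character that is not a letter, digit, dot, underscore or hyphen) is an assumption chosen because it reproduces the file names quoted in this thread.

```python
import posixpath
import re
from urllib.parse import urlsplit


def sanitize_whole_url(url: str) -> str:
    """Assumed rule: strip everything except letters, digits, '.', '_' and '-'
    from the full URL, which matches the file names reported in this issue."""
    return re.sub(r"[^A-Za-z0-9._-]", "", url)


def basename_from_url(url: str) -> str:
    """Alternative: keep only the last path segment of the URL."""
    return posixpath.basename(urlsplit(url).path)


def dispersion_path(filename: str) -> str:
    """Two-level directory built from the first two characters of the name.
    Assumes the name has at least two characters."""
    return f"{filename[0].lower()}/{filename[1].lower()}/{filename}"


url = "http://datastore.mydomain.local/imageserver/40354768_big.jpg"

actual = dispersion_path(sanitize_whole_url(url))
expected = dispersion_path(basename_from_url(url))

print(actual)    # h/t/httpdatastore.mydomain.localimageserver40354768_big.jpg
print(expected)  # 4/0/40354768_big.jpg
```

Because every sanitized name starts with "ht", the two-character dispersion always resolves to h/t/, whereas names taken from the last path segment spread across directories as intended.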

We are concerned that this could lead to speed issues accessing the /h/t/ folder. Certainly FTPing to the folder from a remote system could cause the FTP clients to hang (or stall for a long period).

Thank you

Copied from original issue: magento/magento2#5306

piotrekkaminski commented 6 years ago

From @hgpit on July 6, 2016 9:35

Are we really the only users who find this approach for file-naming a bit... ridiculous? Surely it's not the intended functionality?

piotrekkaminski commented 6 years ago

From @Ctucker9233 on July 6, 2016 19:26

@hgpit You're not the only one. It seems excessive. But it might be because you can run multiple stores/websites on the same Magento setup. If you have any product crossover, there may need to be a way for Magento to create a unique file name.

piotrekkaminski commented 6 years ago

From @hgpit on July 6, 2016 19:48

@Ctucker9233 I'm not really sure what you mean, sorry. We do actually run multiple websites and stores. But as every product image is imported via a URL, they're all still filed under [magento]/pub/media/catalog/product/h/t/ regardless of the site/store because of their source path/URL.

We've just invested in Unirgy's uRapidFlow product importer for EE v2.1, and that brings in images (via URLs) "properly", as in my expected result above. So I still think M2's current folder creation method is wrong/broken.

piotrekkaminski commented 6 years ago

From @Ctucker9233 on July 7, 2016 21:31

@hgpit OK, I understand now. What I meant is that it might be part of Magento's core structure to create a unique filename for each instance of an image. So if you have the same image on two sites, Magento can tell the difference between the two instances of that image. I'm just speculating that this might be the reason for the weird file naming.

piotrekkaminski commented 6 years ago

From @andimov on July 29, 2016 11:30

@hgpit Thank you for reporting! datastore.mydomain.local is a weird domain for importing. Try to use local paths.

piotrekkaminski commented 6 years ago

From @hgpit on July 29, 2016 12:9

@andimov "Try to use local paths." What kind of answer/suggestion is that? Magento 2 provides the facility to import from URLs, but because I'm reporting an issue with it, you're suggesting not to use the feature?

Also, how is "datastore.mydomain.local" a weird domain for importing? Aside from the fact I have blatantly replaced the domain name with a generic one for the purpose of example, we have an internal server which serves all our image data. For instance, if we have a corporate product SKU of 12345, we can point any browser/system on our network to http://datastore.mydomain.local/imageserver/12345.jpg and it will return the image. There is nothing weird about it.

piotrekkaminski commented 6 years ago

From @andimov on August 16, 2016 11:59

@hgpit Try using a TLD from the IANA list of TLDs when importing from URLs. Please let me know when your issue is solved.

piotrekkaminski commented 6 years ago

From @hgpit on August 16, 2016 13:27

@andimov I can't (and would not) change my entire enterprise's internal domain naming conventions for Magento.

piotrekkaminski commented 6 years ago

From @shiftedreality on September 9, 2016 9:36

Hi @hgpit

Thank you for reporting. We've created internal ticket MAGETWO-58217 to resolve this issue

piotrekkaminski commented 6 years ago

From @magicsss on April 22, 2017 19:11

Hi there. Has anybody found a solution for this issue?

piotrekkaminski commented 6 years ago

From @phrench on June 6, 2017 18:43

We ran into the same issue. The Magento import should definitely not include the path of the external URL in the filename. It is bad for SEO, among other reasons.

piotrekkaminski commented 6 years ago

From @joachimVT on October 11, 2017 9:19

I can confirm this, having the same issue in version 2.1.8

pub/media/catalog/product/cache/image/e9c3970ab036de70892d86c6d221abfe/h/t/httpwww.vinesse.bemediacatalogproductpapapeclement2012.png

piotrekkaminski commented 6 years ago

From @magento-engcom-team on October 19, 2017 10:30

@hgpit, thank you for your report. We've created internal ticket(s) MAGETWO-58217 to track progress on the issue.

dromoded commented 6 years ago

If it's not too late to ask, how can I import external images with URLs that don't end with ".jpg"? E.g.: http://asset.lemansnet.com/media/edge/8/4/0/840EC331-41BE-4AFF-9392-4A35F2C486A9.png?x=260&y=260&b=ffffff&t=image/jpeg (It's a real image, and it is indeed a jpeg, parameters "x" and "y" control image size, "b" - background color)

Magento 2.2.1 (CE) Admin/Import confirms the validity of the file format, but the subsequent import returns the error: "Imported resource (image) could not be downloaded from external resource due to timeout or access permissions". However, the image actually gets downloaded and saved to /pub/media/import with the following filename: httpasset.lemansnet.commediaedge840840EC331-41BE-4AFF-9392-4A35F2C486A9.pngx260y260bfffffftimagejpeg

"Plain" JPEG URLs also end up in /pub/media/catalog/product/h/t, but the "crazy" ones get stuck in /pub/media/import.
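
To make the difference between the two kinds of URL concrete, here is a small sketch that separates the path from the query string before looking at the file extension. It is only an illustration: the helper names and the extension allow-list are assumptions, and nothing in this thread confirms that a check like this is what Magento performs or why the file stays in /pub/media/import.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

# Assumed allow-list, used only for this illustration.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def split_image_url(url):
    """Return (file name from the URL path, parsed query parameters)."""
    parts = urlsplit(url)
    return posixpath.basename(parts.path), parse_qs(parts.query)


def has_image_extension(filename):
    """True if the text after the last dot is a known image extension."""
    return posixpath.splitext(filename)[1].lower() in IMAGE_EXTENSIONS


url = ("http://asset.lemansnet.com/media/edge/8/4/0/"
       "840EC331-41BE-4AFF-9392-4A35F2C486A9.png"
       "?x=260&y=260&b=ffffff&t=image/jpeg")
saved_name = ("httpasset.lemansnet.commediaedge840840EC331-41BE-4AFF-9392-"
              "4A35F2C486A9.pngx260y260bfffffftimagejpeg")

filename, params = split_image_url(url)

print(filename)                         # 840EC331-41BE-4AFF-9392-4A35F2C486A9.png
print(params["t"])                      # ['image/jpeg']
print(has_image_extension(filename))    # True
print(has_image_extension(saved_name))  # False
```

The name taken from the URL path keeps a recognizable ".png" extension, while the name saved in /pub/media/import ends in ".pngx260y260bfffffftimagejpeg" because the query string was folded into it.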

dmanners commented 6 years ago

Hi @dromoded, I am not sure how this can work at the moment, but we will try to cover this use case when working on this issue.

PieterCappelle commented 6 years ago

Hi, I fixed this issue: https://github.com/magento/magento2/pull/12872

dmanners commented 6 years ago

Has been fixed via https://github.com/magento/magento2/pull/12872