craftcms / feed-me

Craft CMS plugin for importing entry data from XML, RSS or ATOM feeds—routine task or on-demand.

Importing assets with "Use existing assets" creates duplicates #1348

Closed: samlising closed this issue 10 months ago

samlising commented 1 year ago

I recently updated from Feed Me 5.1.3.1 to 5.2.0. Previously, I could upload assets to an S3 bucket and then add them to Craft by having Feed Me parse an XML file containing the filenames of the files uploaded to S3. In the latest version, parsing the XML file still creates the assets in Craft, but duplicate files are also uploaded to S3, even when the feed has "Use existing assets" selected.

I can work around this by indexing the assets first and then running Feed Me to update them.

However, this is inconvenient when the asset volume holds over 10k images and keeps growing. Indexing thousands of files just to import a few hundred is inefficient, and it would be useful to have the import behavior of the previous version back.

i-just commented 1 year ago

Hi, thanks for getting in touch! Could you please provide a snippet of your feed and a screenshot of the mapping screen?

samlising commented 1 year ago

Here is a sample of the imported XML and a screenshot of the mapping:

<product_spec_image>
    <image>
      <title>davide-groppi_1a244xx00axx06</title>
      <filename>davide-groppi_1a244xx00axx06.svg</filename>
      <orig_filename>1A244XX00A.XX.06.svg</orig_filename>
      <asset_handle>davide-groppi_1a244xx00axx06-svg</asset_handle>
      <path>/home/forge/dev/santi-import-parser/product_db/davide_groppi/asset_product_spec_image/davide-groppi_1a244xx00axx06.svg</path>
      <url>http://www.davidegroppi.com/Site/Products/1A244-FM/Figurini%20SVG/1A244XX00A.XX.06.svg</url>
      <folder>davide-groppi</folder>
      <caption>
        <en/>
        <it/>
      </caption>
      <tags>
        <tag>Davide Groppi</tag>
        <tag>lighting:table</tag>
        <tag>FM</tag>
        <tag>spec_image</tag>
      </tags>
    </image>
</product_spec_image>
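
To make the expected matching concrete, here is a minimal Python sketch (not Feed Me's actual implementation) that reads a feed shaped like the sample above and splits its entries into files that already exist in the bucket and files that do not. The helper names `feed_asset_paths` and `split_existing` are hypothetical, and joining `<folder>` and `<filename>` into a bucket key is an assumption about how the volume is laid out.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the feed sample; only the fields used below are kept.
SAMPLE = """<product_spec_image>
    <image>
      <title>davide-groppi_1a244xx00axx06</title>
      <filename>davide-groppi_1a244xx00axx06.svg</filename>
      <folder>davide-groppi</folder>
    </image>
</product_spec_image>"""


def feed_asset_paths(xml_text):
    """Return 'folder/filename' for every <image> node in the feed.

    Assumption: the bucket key is the folder joined to the filename.
    """
    root = ET.fromstring(xml_text)
    paths = []
    for image in root.iter("image"):
        filename = (image.findtext("filename") or "").strip()
        folder = (image.findtext("folder") or "").strip()
        if filename:
            paths.append(f"{folder}/{filename}" if folder else filename)
    return paths


def split_existing(paths, existing_keys):
    """Split feed paths into (already in the bucket, not in the bucket)."""
    existing = [p for p in paths if p in existing_keys]
    missing = [p for p in paths if p not in existing_keys]
    return existing, missing


if __name__ == "__main__":
    # Hypothetical bucket listing; in practice this would come from S3.
    bucket_keys = {"davide-groppi/davide-groppi_1a244xx00axx06.svg"}
    print(split_existing(feed_asset_paths(SAMPLE), bucket_keys))
```

With "Use existing assets" selected, the behavior described in this issue would correspond to the first list being linked without any new upload.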

[Screenshot: 2023-07-20 at 12 02 09 PM]

i-just commented 1 year ago

Thank you! I can see what’s happening now. I’ll review our options for this one.

samlising commented 1 year ago

Hi. I'm just wondering if there has been any progress addressing this issue. Thanks.

rickmerkelbach commented 1 year ago

Hi there, I was wondering about the progress on this as well. I've paused quite a few automatic imports since my S3 bill started to grow because of all the duplicated assets. Are there any recommended existing methods to remove those assets in bulk? Or would a custom module with a queue job be the way to go for ±400K assets?
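
As a rough illustration of the bulk-cleanup question, the Python sketch below groups object keys by a content hash and lists which copies might be candidates for removal. It is only a sketch under stated assumptions: the `(key, content_hash)` pairs are assumed to come from some bucket listing, the helper names `group_duplicates` and `removal_candidates` are hypothetical, and keeping the shortest key in each group is an arbitrary choice rather than anything Feed Me or Craft prescribes. Any actual removal would presumably still need to go through Craft so that asset records and files stay in sync.

```python
from collections import defaultdict


def group_duplicates(objects):
    """Group keys that share a content hash.

    objects: iterable of (key, content_hash) pairs (assumed input shape).
    Returns {content_hash: [keys]} only for hashes with more than one key.
    """
    by_hash = defaultdict(list)
    for key, content_hash in objects:
        by_hash[content_hash].append(key)
    return {
        h: sorted(keys, key=lambda k: (len(k), k))
        for h, keys in by_hash.items()
        if len(keys) > 1
    }


def removal_candidates(groups):
    """Keep the first (shortest) key per group and list the remaining keys."""
    candidates = []
    for keys in groups.values():
        candidates.extend(keys[1:])
    return sorted(candidates)


if __name__ == "__main__":
    # Hypothetical listing; the hash values are placeholders.
    listing = [
        ("davide-groppi/a.svg", "hash-1"),
        ("davide-groppi/a_1.svg", "hash-1"),
        ("davide-groppi/b.svg", "hash-2"),
    ]
    print(removal_candidates(group_duplicates(listing)))
```

Reviewing the candidate list before deleting anything is advisable, since two distinct assets can legitimately share identical content.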

luizboaretto commented 10 months ago

I would also like to know the progress. I'm about to import a lot (700k) of assets...

angrybrad commented 10 months ago

Resolved in https://github.com/craftcms/feed-me/pull/1360 and will be included in the next release.