thoughtbot / paperclip

Easy file attachment management for ActiveRecord
https://thoughtbot.com

#reprocess! original images lost #1055

Closed ScotterC closed 9 years ago

ScotterC commented 12 years ago

I don't have a specific error narrowed down yet but I thought I should start the discussion.

In the midst of reprocessing several thousand images, I lost about 300-400 of them. I was using a resque job to iterate through my ActiveRecord objects and spinning off delayed_paperclip jobs to reprocess each one.

The odd part is that when they are 'lost', they still exist as a file in the original style, but only as a 0 KB file, and none of the other styles are processed. I was under the impression that reprocess wouldn't touch the original file.

Other pertinent information: this is with the latest version of Paperclip (3.3.0), and the images are stored on S3.

sikachu commented 12 years ago

Can I see your has_attached_file definition, so I can investigate further?

ScotterC commented 12 years ago

Here's the has_attached_file code:

  has_attached_file :attachment,
    :styles => {
      :product   => '240x240>',
      :fw_tiny   => { geometry: '260x', convert_options: "-quality 50 -strip" },
      :fw_small  => { geometry: '320x', convert_options: "-quality 50 -strip" },
      :fw_medium => '395x',
      :fw_large  => '470x'
    },
    :convert_options => { :all => "-quality 75 -strip" },
    :processors => [:thumbnail, :optimize_thumbnail],
    :default_style => :product,
    :default_url => "/assets/product/missing.png",
    :storage => :s3,
    :s3_credentials => "#{Rails.root.to_s}/config/s3.yml",
    :s3_protocol => 'https',
    :s3_permissions => :public_read,
    :bucket => CONFIG[:bucket],
    :s3_host_alias => CONFIG[:cloudfront],
    :url => ':s3_alias_url',
    :key => "value",
    :s3_headers => { 'Cache-Control' => 'max-age=315576000', 'Expires' => 10.years.from_now.httpdate }

After fixing the issue and having to find out exactly what images were lost, I've gathered a bit more info. For about half of the lost images, the S3 bucket had an original image with the original filename and a new 0kb image with a normalized file name. For the other half, it was only the 0kb image with normalized file name.
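
The two cases described above can be told apart mechanically from a bucket listing. The following is a minimal sketch in plain Ruby, not Paperclip code: the `S3Entry` struct, the `classify_lost` helper, and the key layout are all hypothetical, and a real listing would have to be fetched from S3 first.

```ruby
# Hypothetical listing entry; in practice these values would come from an
# S3 bucket listing. The struct and key layout are assumptions.
S3Entry = Struct.new(:key, :size)

# Group entries by directory and report every directory that contains a
# zero-byte object. A directory is :recoverable if a non-empty sibling
# (e.g. the file under its original name) survives, and :lost otherwise.
def classify_lost(entries)
  entries.group_by { |e| File.dirname(e.key) }.each_with_object({}) do |(dir, files), report|
    empty = files.select { |f| f.size.zero? }
    next if empty.empty?

    intact = files.reject { |f| f.size.zero? }
    report[dir] = intact.empty? ? :lost : :recoverable
  end
end

listing = [
  S3Entry.new("attachments/1/original/My Photo.jpg", 48_213),
  S3Entry.new("attachments/1/original/my-photo.jpg", 0),
  S3Entry.new("attachments/2/original/other.jpg", 0),
  S3Entry.new("attachments/3/original/fine.jpg", 10_000)
]

report = classify_lost(listing)
report.each { |dir, status| puts "#{dir}: #{status}" }
```

Directories without any zero-byte object are left out of the report, so only attachments that need attention are listed.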

I managed to reproduce that situation by attempting to copy an S3 object while changing only its key name. This likewise preserved the original and created a 0 KB file.

As I mentioned, I was using delayed_paperclip and resque. A bunch of the delayed_paperclip jobs had dirty exits at one point and, as usual in that situation, I just had the jobs rerun. I believe that somewhere in those dirty exits, the files were being copied from one filename to the other and were lost when the workers exited.

Is there a point in Paperclip's processing code where the original is moved to somewhere it could be deleted? Or, I guess, it could be more of a network communication problem, and S3 just got cut off at an unfortunate moment.

anderslemke commented 11 years ago

I've seen the exact same behaviour. In most cases, though, we've experienced the original simply being deleted.

What I've found is that when doing a reprocess!, Paperclip actually deletes and recreates all files, including the original.

From looking at the logs from our S3 bucket, it seems that in some quite rare cases the DELETE and PUT commands on S3 are not performed in the desired order. So the last thing executed is the DELETE, and the file is lost.

I've proposed a fix for this: https://github.com/thoughtbot/paperclip/pull/1354

Basically, it will preserve the files when reprocessing, hence, not doing the DELETE.
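
The ordering problem described above can be illustrated with a toy simulation. This is a sketch only: `FakeBucket` and `apply` are hypothetical stand-ins written for this example, not Paperclip or S3 code, and the simulation simply applies requests in the order given.

```ruby
# Toy in-memory stand-in for a bucket; not Paperclip or S3 code.
class FakeBucket
  def initialize
    @objects = {}
  end

  def put(key, body)
    @objects[key] = body
  end

  def delete(key)
    @objects.delete(key)
  end

  def exists?(key)
    @objects.key?(key)
  end
end

# Apply a list of [operation, *args] requests in the order given.
def apply(bucket, requests)
  requests.each { |op, *args| bucket.public_send(op, *args) }
  bucket
end

# Build a bucket that already holds the original.
def seeded_bucket(key)
  FakeBucket.new.tap { |b| b.put(key, "image bytes") }
end

key = "original/photo.jpg"
delete_req = [:delete, key]
put_req    = [:put, key, "image bytes"]

intended  = apply(seeded_bucket(key), [delete_req, put_req]) # DELETE, then PUT
reordered = apply(seeded_bucket(key), [put_req, delete_req]) # PUT, then DELETE
put_only  = apply(seeded_bucket(key), [put_req])             # no DELETE at all

puts intended.exists?(key)   # true
puts reordered.exists?(key)  # false
puts put_only.exists?(key)   # true
```

Only the sequence that ends with the DELETE loses the object. Skipping the DELETE and overwriting in place removes the dependency on request ordering.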

ScotterC commented 11 years ago

:+1:

jkloian commented 10 years ago

Having same problem here using Fog storage adapter with Rackspace.

In my case I call reprocess! on only one style (Model.image.reprocess! :style). Two times out of ten, the original image goes missing.

jkloian commented 10 years ago

This: https://github.com/thoughtbot/paperclip/pull/1354

fixed my problem.

maclover7 commented 9 years ago

@jferris @jyurek Please close issue, problem appears to be solved.

zreitano commented 9 years ago

I am currently having this issue, where reprocess! deletes the original file (leaving a 0 KB file behind).

I am using the most current version of paperclip.

Here is my image.rb

class Image < Post

  Paperclip.interpolates :id do |attachment, style|
    attachment.instance.id
  end

  # :basename and :extension are Paperclip interpolations from the attachment; custom interpolations can be written as needed
  has_attached_file :image, path: "/posts/images/:id/:style.:extension",
    styles: {
      square: "1000x1000#",
      instagram_size: "640x640#"
      # ">" keeps the aspect ratio (only shrinks larger images); "#" crops to fill
      # square: "200x200#",
    }

  # only image types
  validates_attachment_content_type :image, :content_type => ["image/jpg", "image/jpeg", "image/png", "image/gif"]
  # validates_attachment_file_name :image, :matches => [/png\Z/, /jpe?g\Z/]
  # validates_attachment :image, :presence => true

  # do_not_validate_attachment_file_type :image

  # Returns the URL of the instagram_size style, or nil if no image is attached.
  def image_url
    photo = self.image
    photo.url(:instagram_size) if photo.present?
  end

end

sashazykov commented 9 years ago

We are having this issue too. Is it possible to reprocess a single style only, without touching the original version at all?

md-farhan-memon commented 8 years ago

Hi! We are on Rails 3.2.22 and Ruby 2.2.4, and use paperclip v4.3.7. This version includes the fix from #1354, but reprocessing a single style still deletes that image on S3.

  has_attached_file :photo,
    :styles => {
      :thumb    => "50x50#",
      :small    => "225x257#",
      :small_m  => "225^",
      :small_mo => "151x173#",
      :large    => "350x400>",
      :large_m  => "350x400>",
      :zoom     => "800x1100"
    },
    :convert_options => {
      :thumb    => "+profile -strip -quality 70 -interlace Plane -units PixelsPerInch -density 1x1",
      :small    => "+profile -strip -quality 70 -interlace Plane -units PixelsPerInch -density 1x1 -background white -flatten +matte -gravity center",
      :small_m  => "+profile -strip -quality 30 -interlace Plane -units PixelsPerInch -density 1x1 -background white -flatten +matte -gravity center",
      :small_mo => "+profile -strip -quality 70 -interlace Plane -units PixelsPerInch -density 1x1 -background white -flatten +matte -gravity center",
      :large    => "+profile -strip -interlace Plane -units PixelsPerInch -density 1x1",
      :large_m  => "+profile -strip -quality 70 -interlace Plane -units PixelsPerInch -density 1x1 -background white -flatten +matte -gravity center",
      :zoom     => "+profile -strip -quality 70 -interlace Plane -units PixelsPerInch -density 1x1"
    }

And I am running this through console: Image.find(SOME_ID).photo.reprocess!(:small_m)

Please provide a solution or fix to this.
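
Until the root cause is pinned down, one defensive option is to keep a copy of the original around while reprocessing and put it back if it comes out missing or empty. The sketch below is plain Ruby under stated assumptions: `with_original_guard`, `read_original`, and `write_original` are hypothetical names and are not part of Paperclip. The caller has to supply callables that read and write the original in whatever storage is used, and a Hash stands in for that storage here.

```ruby
# Run the given block (e.g. a single-style reprocess) and restore the
# original if it is missing or empty afterwards.
# `read_original` and `write_original` are hypothetical callables supplied
# by the caller; they are not part of Paperclip.
def with_original_guard(read_original:, write_original:)
  backup = read_original.call
  raise ArgumentError, "original is already missing or empty" if backup.nil? || backup.empty?

  yield

  current = read_original.call
  if current.nil? || current.empty?
    write_original.call(backup)
    :restored
  else
    :intact
  end
end

# Demonstration with a plain Hash standing in for storage.
storage = { "original" => "image bytes" }

result = with_original_guard(
  read_original:  -> { storage["original"] },
  write_original: ->(data) { storage["original"] = data }
) do
  # Simulate the 0 KB original reported in this thread. In an application
  # this block would wrap the call from the comment above, i.e.
  # Image.find(SOME_ID).photo.reprocess!(:small_m)
  storage["original"] = ""
end

puts result
puts storage["original"]
```

This holds the whole original in memory and does not restore anything if the block raises, so it is a safety net for small files rather than a fix for the underlying behaviour.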