brandonhilkert / sucker_punch

Sucker Punch is a Ruby asynchronous processing library using concurrent-ruby, heavily influenced by Sidekiq and girl_friday.

Job getting terminated after a while #133

Closed: akshaykumar3 closed this issue 8 years ago

akshaykumar3 commented 8 years ago

I am using sucker_punch to send GCM notifications. After each batch of 500 I add a sleep of 120 seconds. When I send notifications to 11,000 people, the job terminates unexpectedly without much information in the logs. Here is the error I am getting: Celluloid::Task::TerminatedError: task was terminated

Can anyone help?

brandonhilkert commented 8 years ago

Happy to take a look at the script. If you increase the sleep to something really long (300 sec.), does it still happen?

akshaykumar3 commented 8 years ago

Here is the script:

class CustomerNotificationSenderJob
  include SuckerPunch::Job

  def perform(notification_id)
    notification = CustomerNotification.where(:id => notification_id).last
    uid_list = []

    if 'Specific'.eql?(notification.notification_type)
      filepath = "/tmp/temp_file.txt"
      filename = notification.file_url.split('/').last
      download_file(filename, filepath)

      # One uid per line in the downloaded file.
      File.foreach(filepath) do |line|
        uid_list << line.strip
      end
      uid_list = uid_list.uniq

      File.delete(filepath)
      send_notification(notification_id, uid_list, notification.device)
    elsif 'Mass'.eql?(notification.notification_type)
      customers = CustomerPushToken.select("u_id, osname")
      uid_list_android = []
      uid_list_ios = []
      customers.each do |customer|
        if 'ios'.eql?(customer.osname)
          uid_list_ios << customer.u_id
        elsif 'android'.eql?(customer.osname)
          uid_list_android << customer.u_id
        end
      end
      send_notification(notification_id, uid_list_android, 'android')
      send_notification(notification_id, uid_list_ios, 'ios')
    end

    notification.status = "SUCCESS"
    notification.save
  rescue Exception => e
    Rails.logger.error "######## Exception: #{e.inspect}"
    Rails.logger.error e.backtrace.join("\n")
  end

  # Sends notifications in batches of at most max_allowed_size, sleeping between batches.
  def send_notification(notification_id, uid_list, client)
    i = 0
    max_allowed_size = 500
    uids = []

    uid_list.each do |uid|
      if i == max_allowed_size
        Rails.logger.info "Sending notification"
        PushService.send_customer_notification(notification_id, uids, client)
        Rails.logger.info(uids)
        sleep(120)
        uids = []
        i = 0
      end
      uids << uid.gsub(/\n/, "")
      i += 1
    end

    if uids.size > 0
      Rails.logger.info "Sending notification"
      PushService.send_customer_notification(notification_id, uids, client)
      Rails.logger.info(uids)
    end
  end

  def download_file(original_filename, destination_file)
    filename = "notifications/#{original_filename}"
    s3 = Aws::S3::Client.new(
      credentials: Aws::Credentials.new(configatron.aws_s3_access_key_id, configatron.aws_s3_secret_access_key),
      region: configatron.aws_s3_region
    )
    # Stream the S3 object to disk in chunks.
    File.open(destination_file, 'wb') do |file|
      s3.get_object(bucket: configatron.aws_s3_bucket, key: filename) do |chunk|
        file.write(chunk)
      end
    end
  end
end
brandonhilkert commented 8 years ago

There's a lot going on in here and it's a little hard to follow. Any time you introduce sleep, you run the risk of guessing the wrong duration, and if you guess wrong and the script terminates, you'll get the error you describe. Functionally, I think SP is doing the right thing. It might just be a matter of reorganizing your script so you don't have to rely on a guess that might be incorrect. Stack Overflow might be a good place to ask for optimizations.
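One way to reorganize along those lines is to enqueue each batch as its own job and stagger the batches with perform_in, rather than sleeping inside a single long-running job. This is only a minimal sketch, not from the thread: it assumes sucker_punch 2.x (which provides perform_async and perform_in) and reuses the PushService call and 500/120-second numbers from the script above.

class CustomerNotificationBatchJob
  include SuckerPunch::Job

  BATCH_SIZE = 500
  DELAY_BETWEEN_BATCHES = 120 # seconds

  # Splits the uid list into batches and schedules one job per batch,
  # each delayed a little more than the last, instead of sleeping in-process.
  def self.enqueue_batches(notification_id, uid_list, client)
    uid_list.each_slice(BATCH_SIZE).with_index do |uids, index|
      perform_in(index * DELAY_BETWEEN_BATCHES, notification_id, uids, client)
    end
  end

  def perform(notification_id, uids, client)
    Rails.logger.info "Sending notification batch of #{uids.size}"
    PushService.send_customer_notification(notification_id, uids, client)
  end
end

Because sucker_punch runs jobs inside the application process, batches scheduled this way are still lost if the process restarts, so for very large sends the delays should stay short or the work should move to a persistent queue.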