sirupsen / airrecord

Ruby wrapper for Airtable, your personal database
MIT License
298 stars 59 forks source link

Support batch operations #66

Open EvanHahn opened 5 years ago

EvanHahn commented 5 years ago

The Airtable API now supports creating, updating, and deleting up to 10 records per request. Is this something Airrecord should support?

The documentation at airtable.com/api shows examples. I was the main developer working on this feature at Airtable, so let me know if you have any questions about this!

(I don't personally need this feature for myself, so feel free to close if you wish!)

sirupsen commented 4 years ago

That's cool. Someone who stumbled upon this ticket should consider implementing that. But I'd suggest going a bit back and forth here on a good API first!

Meekohi commented 4 years ago

Yeah agreed, good idea and could be a non-trivial performance improvement. The project I was working on that uses airrecord is on pause, but if it comes back to life I might take a stab at this.

sirupsen commented 4 years ago

Also just saw that Airtable has a metadata API now. That could be really cool to work on Airrecord integration for. Opens up a lot of possibilities where it's harder to do the wrong thing.

aguynamedben commented 4 years ago

I want this for batched updates and might be able to implement it this weekend. How about this API:

tea1 = Tea.find("someid")
tea1["Name"] = "Feng Gang Organic"
tea1["Village"] = "Feng Gang"

tea2 = Tea.find("someotherid")
tea2["Name"] = "Sweet Tea"
tea2["Village"] = "Georgia, USA"

Tea.update_batch([tea1, tea2])

An error would be throw if the array provided has more than 10 records.

It looks like batches of 10 are supported for create, update, and delete. I can't tell if their update API does an "upsert" if a new record is given, or throws an errror. I propose I will first implement Airrecord::Table#update_batch first, then myself of someone else can add Airrecord::Table#create_batch and Airrecord::Table#delete_batch.

Does that work? The API is somewhat similar to ActiveRecord batch queries, but isn't chainable.

sirupsen commented 4 years ago

That seems sensible to me. Not crazy about the name, but I think this is the simplest.

aguynamedben commented 4 years ago

Any ideas for a better name? I'm not partial the the name either.

sirupsen commented 4 years ago

Nah. Let's not sweat it. The functionality would be the same. Introducing relations is not worth it. This maps to the Airtable naming, so let's do what you suggested :) I suspect it won't be very difficult.

sirupsen commented 4 years ago

Maybe one thing I'd suggest is to just call it update, and then also have a update_or_create, as well as a create which can also take an enum.

aguynamedben commented 4 years ago

After working on this some, I realized that my project would really need Table.batch_upsert to create or update in batches accordingly. For my immediate needs, I've decided to do an after_save callback that makes API calls on a per-record basis.

However, I did get Table.batch_update working, and submitted a proof-of-concept pull request here with some more ideas on how to run with it if anybody wants to take it from there.

cloudsbird commented 2 years ago

Hi, any updates regarding this?

JacksonRiso commented 1 year ago

Here is my janky, but workable code to add a batch upsert functionality to the Airrecord gem. Add this to your config/airrecord.rb file:

Airrecord::Table.class_eval do
  # Upsert up to 10 records
  def self.bulk_record_upsert(field_to_merge_on = nil, record = nil, last = false)
    response = nil
    @queue ||= {}
    @queue["#{base_key}-#{client.escape(table_name)}"] ||= []
    this_queue = @queue["#{base_key}-#{client.escape(table_name)}"]
    this_queue.push({ 'fields' => record }) if record
    # Each loop - every 10th iteration or last!
    if this_queue.length > 0 && (this_queue.length % 10 == 0 || last)
      this_queue.each_slice(10).each do |array|
        # Send API call to Airtable
        path = "https://api.airtable.com/v0/#{base_key}/#{client.escape(table_name)}"
        body = {
          "performUpsert": {
            "fieldsToMergeOn": [field_to_merge_on]
          },
          "records": array.uniq
        }
        headers = {
          "Content-Type": 'application/json',
          "Authorization": "Bearer #{api_key}"
        }
        response = HTTParty.patch(path, headers: headers, body: body.to_json)
        Sentry.capture_message(response.body, level: :error) if response.code != 200
      end
      # Empty the queue
      this_queue = []
      @queue["#{base_key}-#{client.escape(table_name)}"] = []
    end
    response
  end
end

If working in a loop, it will wait until it gets called 10 times, before sending the API call. It has drastically improved my performance. To ensure no stragglers get left behind, I implement it something like this;

(1..13).each do |i|
   Table.bulk_record_upsert("COLUMN NAME", {column: i}) 
end
Table.bulk_record_upsert("COLUMN NAME", nil, true) 

This will call the update API 2x --> once on the 10th record being added, and once after the loop is over for the final 3 stragglers.

Let me know if this makes sense and if anyone has any questions! Of course I would love to see this tested & moved into a pull request.

Cheers :-D

JacksonRiso commented 1 year ago

🤷‍♂️ I just wrote this for myself to use. And I didn’t want to have to re-write that logic each time I used it.

On Sun, Jun 18, 2023 at 7:02 AM Soggi @.***> wrote:

I don't think the "wait for 10 records before firing" logic should be internal to airrecord -- most people won't be expecting that behavior unless they read the docs carefully, and that kind of logic can be left to the user of the library pretty easily. It'll also make the actual function much simpler since it doesn't need to store state in the queue between calls.

— Reply to this email directly, view it on GitHub https://github.com/sirupsen/airrecord/issues/66#issuecomment-1596157849, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABXSDIMSB2F5APNEIEXW3VLXL4C5VANCNFSM4IXU5NVA . You are receiving this because you commented.Message ID: @.***>

-- Jackson Riso Business Process & Automation Consultant Grab a time with me here https://calendly.com/risojackson/30min-1.