keolo / mixpanel_client

Ruby interface to the Mixpanel Data API
MIT License

Allow for control of timeouts #49

Closed csalvato closed 8 years ago

csalvato commented 8 years ago

When running a large data export using the 'export' endpoint, I am happy to wait several minutes as stated in the export docs:

This endpoint uses gzip to compress the transfer; as a result, raw exports should not be processed until the file is received in its entirety. While this process is normally quick and results in a smaller file size, some large exports can take a few minutes to generate. Ensure the timeout set on the receiving client is large enough to account for this process (e.g. larger than 60 seconds).

But the default timeout in mixpanel_client appears to be around 60-70 seconds. When running a large query, after 60-70 seconds I get this error: /Users/csalvato/.rvm/rubies/ruby-1.9.3-p551/lib/ruby/1.9.1/net/protocol.rb:146:in `rescue in rbuf_fill': Timeout::Error (Timeout::Error)

For reference, here is the query:

mixpanel_data = client.request('export', 
                                from_date: "2014-12-17",
                                to_date: "2015-12-28",
                                event: ["Completed Order"]) 

This is about a year's worth of events...so it's a lot. But even with 3 months of events I sometimes hit the timeouts, and we are loading exponentially more events into Mixpanel each month.

Is there any way to increase the timeout period for calls?
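
For context, the error above is raised from Ruby's Net::HTTP layer when a read from the socket takes longer than its read timeout, which defaults to 60 seconds. The following is only a minimal sketch of how that timeout is raised on a plain Net::HTTP connection. The helper name `build_long_timeout_http`, the 300 and 30 second values, and the `example.com` host are illustrative assumptions, and this is not mixpanel_client's own API.

```ruby
require 'net/http'

# Hypothetical helper (not part of mixpanel_client): returns a Net::HTTP
# object configured with longer timeouts. No request is sent here.
def build_long_timeout_http(host, read_timeout: 300, open_timeout: 30)
  http = Net::HTTP.new(host, 443)
  http.use_ssl = true
  http.open_timeout = open_timeout # seconds allowed to open the connection
  http.read_timeout = read_timeout # seconds allowed for each read from the socket
  http
end

# 'example.com' is a placeholder host, used only for illustration.
http = build_long_timeout_http('example.com')
puts "read timeout: #{http.read_timeout}s, open timeout: #{http.open_timeout}s"
```

Whether a library built on Net::HTTP exposes these settings depends on the library, so a client-level option would still be needed to reach them from mixpanel_client.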

channie commented 8 years ago

I am experiencing the same issue and would like to increase the timeout value for export calls as well. @csalvato have you found a way to change the default timeout?

keolo commented 8 years ago

Maybe PR #48 resolves the same issue?

andygeers commented 8 years ago

+1

csalvato commented 8 years ago

@channie My workaround has been to pull data from Mixpanel in 30-day chunks, which always come in under the timeout limit. Here's my somewhat hacky source:

require 'mixpanel_client'
require 'date'

# MIXPANEL_API_KEY and MIXPANEL_API_SECRET must be defined before this runs.
client = Mixpanel::Client.new(
  api_key:    MIXPANEL_API_KEY,
  api_secret: MIXPANEL_API_SECRET
)

from_date = Date.parse('2015-12-17')
end_date = Date.today

adwords_data = []

# Request one window at a time (from_date through from_date + 30) so that
# each export finishes before the client times out.
while from_date <= end_date
  # Clamp the last window so it never runs past end_date.
  to_date = [from_date + 30, end_date].min
  puts "Retrieving #{from_date} to #{to_date}"
  adwords_data.concat(client.request('export',
                                     from_date: from_date,
                                     to_date: to_date,
                                     event: ["Completed Order"],
                                     where: '(properties["latest_ad_search"]) and (properties["latest_ad_utm_source"] == "Google")'))
  # Start the next window the day after this one ends, so no day is requested twice.
  from_date = to_date + 1
end

csalvato commented 8 years ago

@keolo possibly

andygeers commented 8 years ago

PR #48 worked brilliantly for me

keolo commented 8 years ago

I've merged in PR #48. Please let me know if mixpanel_client v4.1.4 does not work for you.