nesquena / rabl

General ruby templating with json, bson, xml, plist and msgpack support
http://blog.codepath.com/2011/06/27/building-a-platform-api-on-rails/
MIT License
3.65k stars 333 forks source link

Performance optimization ... #49

Closed pkieltyka closed 12 years ago

pkieltyka commented 13 years ago

Hey guys,

Has anyone else noticed that some rabl views can take very long to render? I have API endpoints with 1800ms time in the view and just 226ms with ActiveRecord. I'm using Rails 3.1.0.rc4 by the way, but I also tried this with Rails 3.0.9 with similar results.

Peter

nesquena commented 13 years ago

Wow that seems unusually slow. I've benchmarked most of my views against the to_json equivalent and found them to be essentially comparable. 1800ms vs 226ms means something else is probably going on there. I would be interested to see if you used new relic developer (locally) to benchmark or perftools if it would actually show that much time being spent in RABL. Based on my experiences with it so far, seems a bit unlikely (or at least it must be tied to some specific case).

pkieltyka commented 13 years ago

I will investigate and report back. Thx for the quick response.

nesquena commented 13 years ago

Yeah thanks let me know what you find. As I said, I have done a bunch of internal benchmarking and I generally find a RABL view and a to_json output (when the output matches exactly) to take roughly the same time to complete as part of a request. The problem is almost always at the database level (i.e too many queries, slow query, N+1, etc) once you start adding child nodes and/or associations.

nesquena commented 12 years ago

Just to update, the source of slowness usually stems from deeply nested child or using many partials. We have plans to improve performance and have already begun work in 0.5.5 prerelease. Closing for now.

calvinl commented 12 years ago

Is this still an issue? I've got 4-5 levels deep of nesting with a very efficient SQL query, and here's my benchmark:

Completed 200 OK in 1475ms (Views: 1427.4ms | ActiveRecord: 4.5ms)

That's pretty slow. Any way I can speed that up?

nesquena commented 12 years ago

Can you try it with 'yajl-ruby' in your Gemfile and see if it makes any difference using a faster json seriaiizer

calvinl commented 12 years ago

Just tried that. It seems to render in roughly the same amount of time. How do I know for sure it's using yajl-ruby?

This is a fairly large JSON dataset though (~240kb), does it still sound like it's too slow?

calvinl commented 12 years ago

Just tried this with to_json and here's my benchmarks with the exact same dataset, it seem like this performance issue is still present?

to_json, with /yajl: Completed 200 OK in 516ms (Views: 219.1ms | ActiveRecord: 13.1ms)

to_json, without yajl: Completed 200 OK in 320ms (Views: 174.2ms | ActiveRecord: 4.7ms)

With rabl: Completed 200 OK in 1422ms (Views: 1172.7ms | ActiveRecord: 12.5ms)

With rabl/yajl: Completed 200 OK in 1431ms (Views: 1123.0ms | ActiveRecord: 12.7ms)

calvinl commented 12 years ago

By the way, I'm running version 0.6.14.

nesquena commented 12 years ago

It is and it isn't. A lot of improvements have been made but it does look like in this case (with a large dataset and likely many nested child blocks) rabl is about 3x as slow as to_json in your case. I have done a lot of benchmarks and found depending on the case rabl is about as fast as to_json and in others can be significantly slower.

Thankfully we have great caching support to make responses much faster since they are automagically cached with minimal configuration. Also as I said in cases where there is a less complex response and less data, you will find rabl is only marginally slower.

That being said, there is obviously a long way to go in performance optimization. rabl-rails is once example at an attempt to make rabl much faster while trading some flexibility. I also have some plans to make rabl much faster after having done some performance tracing recently. This is an ongoing process and hopefully a community effort to make rabl as fast as possible. I sincerely hope to continue closing the gap with to_json in the near future.

calvinl commented 12 years ago

Thanks Nathan,

I'm going to give caching a shot. I love the flexibility that rabl offers, and I sincerely thank you for all the work you've put into it!

agibralter commented 11 years ago

I'm finding that newrelic is reporting that on some slow requests, one of my rabl templates is sometimes taking 13 seconds to render. Looks something like this:

# main.json.rabl
object(false)

node(:is_xdomain) do
  host_page_url =~ %r{^https?://#{Regexp.escape(Settings.host)}/} ?
    false : true
end

node(:lang)            { current_post.lang }
node(:name)            { current_post.name }
node(:is_auto_approve) { current_post.auto_approve? }
node(:fb_app_name)     { Settings.facebook_namespace }
node(:fb_app_id)       { Settings.facebook_app_id }
node(:post_id)         { current_post.id }
node(:root_url)        { root_url }

node(:copy) do
  {
    labels:         partial('labels', object: current_post),
    info_login:     link_to(t(:info_login, t_opts), login_path_with_back),
    info_signup:    link_to(t(:info_signup, t_opts), signup_path_with_back),
    info_profile:   link_to(t(:info_profile, t_opts), profile_path),
    info_logout:    link_to(t(:info_logout, t_opts), logout_path),
    fbshare_title:  t(:fbshare_title, t_opts),
    fbshare_button: t(:fbshare_button, t_opts),

    account_required: t(:account_required, t_opts({
      login:  link_to(t(:message_login, t_opts), login_path_with_back),
      signup: link_to(t(:message_signup, t_opts), signup_path_with_back)
    }))
  }
end

node(:pies) do
  pie_hash = {"no-data" => image_tag("pies/pie60-ND.png")}
  (0..100).each { |i| pie_hash[i] = image_tag("pies/pie60-#{i}.png") }
  pie_hash
end

# labels.rabl
%w(foo bar baz).each do |label|
  node(label) do |obj|
    t(label, scope:  [posts, :labels], locale: obj.lang)
  end
end

I know I can use caching extensively here, but any thoughts as to why this might be happening?

nesquena commented 11 years ago

Can you try the request in development and verify that the response is that slow? Based on the template there, I would expect it to render in under a second. If you can confirm this in development, could you try removing nodes piece by piece until the speed goes back to a normal amount of time? I am just as curious as you as to why that would take 13 seconds.

databyte commented 11 years ago

I used to throw in puts statements into RABL at the builder loop to find slow calls. It's usually not RABL itself but some weird n+1 or external call. I still have Issue #322 open to investigate issues with extends.

agibralter commented 11 years ago

In development it's fine... and most of the time in production it's fine. But for the slowest requests, newrelic is telling me that rabl view (and not others) is taking a long time... but it's not breaking down the request any further than main.json.rabl.

I basically have a Haml view that looks like this:

!!! 5
%html
  %head
    %meta{charset: 'UTF-8'}

    = csrf_meta_tag

    = stylesheet_link_tag(:application_widget, media: 'all')
    = google_fonts_droid

    = render('/layouts/google_analytics')

    = javascript_include_tag(:jquery)
    = javascript_include_tag(:application_widget)

  -# See U::WidgetsHelper#body_data
  %body{data: body_data}

and a helper that has this:

def body_data
  # For some reason, passing `:formats` does not seem to work consistently...
  old_formats = formats
  self.formats = [:json]
  result = {
    widget:      render(template: 'widgets/main'),
    stats:       render(template: 'widgets/stats'),
    bootstrap:   render(template: 'widgets/bootstrap'),
    uncacheable: render(template: 'widgets/uncacheable')
  }
  self.formats = old_formats
  result
end

It's always main.json.rabl that takes 10+ seconds. And the others have SQL calls (e.g. bootstrap.json.rabl) that are broken down by newrelic and show sub 100 ms times.

nesquena commented 11 years ago

Do you have garbage collection monitored on new relic? https://newrelic.com/docs/ruby/ruby-gc-instrumentation

databyte commented 11 years ago

I've found that NewRelic doesn't break down view code well. RABL is a bit of a black box to it. So if you have something like:

attribute :slow_ass_call

# and

def slow_ass_call
  sleep 5
end

It will say your view is 5 secs slower even though it's really the model's fault. It doesn't show the root cause, just the caller (e.g. RABL)

agibralter commented 11 years ago

@nesquena ah, interesting, I don't think I do -- I'll do that for the next deploy. Does it impact performance?

@databyte true -- I'll see if I can add some custom instrumentation to the methods called by main.json.rabl. I doubt it's rabl that's causing this strangeness.

Thank you for the help!

databyte commented 11 years ago

Yeah, I have "add instrumentation" via ActiveSupport on my list of things to do.

nesquena commented 11 years ago

GC monitoring I have enabled in many of my apps on newrelic, haven't noticed any notable performance degradations. I recommend adding that in regardless of this issue, it can be quite useful to understand. As an aside, I would also recommend enabling request queue measuring and middleware measuring as well https://newrelic.com/docs/features/tracking-front-end-time

agibralter commented 11 years ago

Ok so I'm actually realizing that this is happening in development too. It only happens on the first request that my app responds that includes the rabl renders. The first rabl render takes about 9 seconds... subsequent requests are all fast. And it seems to be a call to t() in the rabl template... weird thing is that it's inconsistent about which one it hangs on: I've commented in and out different t() calls and the 9 seconds extra load time is sporadic. Seeming like this may be in fact some GCing that's going on right around then in the request cycle.

nesquena commented 11 years ago

9 seconds is too long for a GC to account for. If you remove the 't' calls but keep as much of the rest as possible can you confirm the 9 second delay disappears? This is kind of an interesting issue I would like to understand better.

agibralter commented 11 years ago

Yeah even commenting out the t calls I still get long load time. I've been fiddling with the GC settings (I'm on Ruby 1.9.3). By default, I see before the rabl renders on the first request:

Total time in GC: 5.031407999999995
GC 312 invokes.

After rabl renders:

Total time in GC: 11.179916000000002
GC 416 invokes.

Then I did some tweaking at set:

RUBY_HEAP_FREE_MIN=4960           # 4,960 (11x 4,096)
RUBY_HEAP_MIN_SLOTS=5000000       # 5,000,000 (500x 10,000)
RUBY_HEAP_SLOTS_INCREMENT=5000000 # 5,000,000 (500x 10,000)
RUBY_HEAP_SLOTS_GROWTH_FACTOR=1   # 1
RUBY_GC_MALLOC_LIMIT=800000000    # 800,000,000 (100x 8,000,000)

And found before rabl renders (on first request):

0.7419139999999649
GC 14 invokes.

And after:

1.4072359999999433
GC 19 invokes.

Subsequent requests show very little time in GC.

nesquena commented 11 years ago

Interesting stuff, thanks for sharing. So by tweaking GC settings it seems like the amount of GC time went from 6 seconds (5 -> 11) to like half a second on first request. Impressive difference.

agibralter commented 11 years ago

Yeah, so I think what was happening was out app's traffic subsided, and the busy unicorn workers got sleepy and died off... Our average time was thus being hurt by the fact that so many "boots" we required relative to the total amount of requests serviced and I started getting negative newrelic alerts.

I'm going to try out these tweaked settings with something like this: https://gist.github.com/1627021 and see what happens... who knows how it will affect the rest of the app's performance though :)

Will keep you posted -- thanks again for the help

nesquena commented 11 years ago

Makes sense, I can see that happening and the initial time is the boot time plus the initial GC allocation. The Ruby GC is not by default setup with enough memory for most ruby/rails apps. That's why GC tuning can have a very significant positive impact on performance. Let me know how tweaking this in production goes.