Closed redox closed 8 years ago
I've got exactly same problem.
Here is my vanilla rails app, with 100 .coffee
files. It renders at average of 900ms in development environment. Is it supposed to be that long?
https://github.com/mib32/slow-sprockets-example
Exactly the same issue, upgrading from sprockets 2.12 (70ms) to 3.4 (900ms).
I finally got a spare moment to try and repro this. I used the example app from @mib32. I ran with sprockets 2 and then with sprockets 3. The results will SHOCK you (read in news reporter voice).
$ bundle exec rake assets:clobber; time env RAILS_ENV=production bundle exec rake assets:precompile
I, [2015-11-19T09:42:01.013358 #4944] INFO -- : Removed /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets
rm -rf /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/tmp/cache/assets
I, [2015-11-19T09:42:18.361928 #4946] INFO -- : Writing /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets/application-91938166bf4ab99d4165669cd7fb7fbb.js
I, [2015-11-19T09:42:18.427565 #4946] INFO -- : Writing /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets/application-7fa3632c92a28b60e30f9b6b6bf8faf8.css
real 0m17.444s
user 0m13.438s
sys 0m2.944s
$ bundle exec rake assets:clobber; time env RAILS_ENV=production bundle exec rake assets:precompile
I, [2015-11-19T09:43:06.891215 #5245] INFO -- : Removed /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets
rm -rf /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/tmp/cache/assets
I, [2015-11-19T09:43:12.434402 #5248] INFO -- : Writing /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets/application-d879645f2bbe4579d56062dc22ac18b86d5f5e10759eedfada4789a09614cfe4.js
I, [2015-11-19T09:43:12.492377 #5248] INFO -- : Writing /Users/richardschneeman/Documents/projects/tmp/slow-sprockets-example/public/assets/application-0723cb9a2dd5a514d954f70e0fe0b89f6f9f1ae3a375c182f43b5f2b57e9c869.css
real 0m5.623s
user 0m4.654s
sys 0m0.768s
So sprockets 3 is about 3x faster than sprockets 2 here with no cache. That's the opposite of what's reported. I'll give the code provided by @redox a shot.
I'm seeing the same thing with the other code from @redox sprockets 3 is faster than 2 (but not by quite the same difference). How did you instrument your app with ruby-prof?
@schneems Hello Richard, thank you for taking your time on that thing.
I see that you are measuring time that rake assets:precompile
takes. But i personally report about regression of Page load time.
Precompilation and page rendering are not anyhow connected, right? Please correct me if I am getting something wrong.
We might be seeing this - noticed that on first load changing any file takes forever to load. Haven't had a chance to look into it yet.
The same machinery that powers assets:precompile
is running behind the scenes of the middleware. We still have to compile your asset, we just do it on the fly. I was worried we had a regression for the whole thing. Maybe the regression is somewhere in the middleware that serves the request.
I'm less familiar with profiling in a live request and hoping I could repro in a shell script. Tips on debugging would be nice. Like how you got that ruby-prof graph?
@schneems The graph in the top is not mine, but i had the same results, when i profiled literally this way in application.html.haml
:
- RubyProf.start
= javascript_include_tag ...
= stylesheet_include_tag ...
- RubyProf.stop
When i went through a profile, i saw that bottleneck is in the part of code that figures out the filename of a n asset and gives it to renderer. But i couldn't investigate further due to lack of ruby knowledge.
I would like to point out again that it is only in development.
@eileencodes Yep. Especially if you use @import
s. But i think that this is another issue.
Could you perhaps share your Ruby and OS version numbers?
I might be possible certain Ruby versions on certain platforms in combination with a large number of gems (i.e. a large load path) result in such a performance regression. (For reference: https://bugs.ruby-lang.org/issues/10015)
ruby 2.2.1p85 (2015-02-26 revision 49769) [x86_64-darwin14] on OS X El Capitan (10.11.1)
@jorisvh If I understand correctly you should be affected by that Ruby problem, but it still would not account for that drastic difference between 70ms and 900ms.
Are you by any chance comparing debug and non-debug mode? Because that way I get similar numbers.
This is what I measured at my workstation (Ruby 2.2.3, Windows 10) using https://github.com/mib32/slow-sprockets-example
mode | sprockets 2.12.4 | sprockets 3.3.4 |
---|---|---|
debug enabled | 500 ms | 1100ms |
debug disabled | 50 ms | 80 ms |
initial compilation | 58 s | 8 s (!) |
@mib32 Are you getting about 70 ms with sprockets 2 in debug mode, too? That's where I see render times of 500 ms... :snail: (Might be a Windows issue, though)
Just to be sure: I'm talking about asset debug mode, i.e. config.assets.debug
, not config.cache_classes
or config.eager_load
.
@daniel-rikowski Yeah... You seems to be right, i am unable to see any significant regression no more...
Here's a flamegraph visualization from my app: (rawgit, gist)
The dark blue section taking up 60% of the ~500 ms page load time is Sprockets, despite the fact that multiple requests have already happened, so Sprockets should have all the info it needs to serve the request.
This part of the app is inside Active Admin, which has 11 CoffeeScript files and 41 SCSS files. Our customizations on top of that add 6 CoffeeScript files and 1 SCSS file.
These are the asset settings I have configured for development:
config.assets.debug = true
config.assets.digest = true
config.assets.raise_runtime_errors = true
As expected I'm not seeing this slowdown in production, because there the files are already precompiled.
I can reproduce with the test app that Active Admin generates, as well. I can give steps to set it up if needed.
@seanlinsley Could you please answer couple of questions:
We're seeing this slowdown as well on a large app that was just upgraded from Rails 3.2 to 4.2 with config.assets.debug = true
. Oddly the CSS compilation doesn't seem to affect performance much because we're using SASS and @import statements to create one application.css
file.
The JS file calls murder the speed though. It seems like each 304 return from the server is taking a really long time which adds up to a long page load.
For now we've sped things up a bit by using our own modified version of this: https://github.com/discourse/discourse/blob/master/lib/middleware/turbo_dev.rb The solution feels real sketchy though.
@mib32 did you manage to reproduce something?
I did a big sprockets upgrade and it didn't break anything. I'm very happy about this. :smile:
Dev environment details: I'm experiencing a page loading performance drop after upgrading from sprockets 2.8 to 3.6 (after reading the Speeding up Sprockets blog post). There were a few dependencies linked into it so I ended up upgrading much more than just sprockets.
Notably where I was specifically looking for improvement is with my layout/application
template, as it was the slowest resource.
Resource | Before upgrade | After upgrade |
---|---|---|
layout/application | 9ms | 48ms |
JavaScript | 160ms | 180ms |
CSS | 70ms | 100ms |
The only other notable changes I did for the assets with the upgrade is change the file extensions from .js.coffee
to .coffee
and .css.scss
to .scss
. I have not created any manifests to help with the assets (after reading the documentation I haven't grasped the usefulness of this or the howto of it). Perhaps the manifest is my missing piece for performance?
I've noticed my flame graph showed more time being spent in sprockets as well.
I realize that this may be because of some side effect from the other gems. I'll do some more testing to determine this. And I'll get back to you.
Alright I did some digging and here's my derailed stackprof of slow methods
Booting: development
Endpoint: "/"
user system total real
100 requests 26.830000 1.780000 28.610000 ( 28.866952)
Running `stackprof tmp/2016-06-09T00:42:10-04:00-stackprof-cpu-myapp.dump`. Execute `stackprof --help` for more info
==================================
Mode: cpu(1000)
Samples: 7184 (0.03% miss rate)
GC: 1013 (14.10%)
==================================
TOTAL (pct) SAMPLES (pct) FRAME
1894 (26.4%) 1894 (26.4%) Pathname#chop_basename
1104 (15.4%) 937 (13.0%) Hike::Index#entries
1314 (18.3%) 427 (5.9%) BetterErrors::ExceptionExtension#set_backtrace
391 (5.4%) 386 (5.4%) Sprockets::Mime#mime_types
1466 (20.4%) 305 (4.2%) Pathname#plus
1384 (19.3%) 272 (3.8%) BindingOfCaller::BindingExtensions#callers
402 (5.6%) 205 (2.9%) Hike::Index#find_aliases_for
373 (5.2%) 175 (2.4%) Hike::Index#sort_matches
1628 (22.7%) 162 (2.3%) Pathname#+
234 (3.3%) 117 (1.6%) ActionView::PathResolver#find_template_paths
126 (1.8%) 101 (1.4%) Hike::Index#build_pattern_for
206 (2.9%) 98 (1.4%) Hike::Index#pattern_for
1696 (23.6%) 93 (1.3%) Hike::Index#match
65 (0.9%) 65 (0.9%) Hike::Trail#stat
118 (1.6%) 63 (0.9%) Hike::Index#initialize
2454 (34.2%) 62 (0.9%) Pathname#join
59 (0.8%) 59 (0.8%) Time#compare_with_coercion
57 (0.8%) 52 (0.7%) ActiveSupport::FileUpdateChecker#watched
5345 (74.4%) 52 (0.7%) Sprockets::Rails::Helper#check_errors_for
760 (10.6%) 47 (0.7%) Pathname#relative?
45 (0.6%) 30 (0.4%) Sprockets::Engines#deep_copy_hash
131 (1.8%) 25 (0.3%) ActiveSupport::FileUpdateChecker#max_mtime
88 (1.2%) 21 (0.3%) Sprockets::Asset#dependency_fresh?
39 (0.5%) 18 (0.3%) ActiveSupport::Inflector#camelize
18 (0.3%) 18 (0.3%) String#blank?
18 (0.3%) 18 (0.3%) ActionView::Helpers::AssetUrlHelper#compute_asset_extname
17 (0.2%) 17 (0.2%) ActiveSupport::SafeBuffer#initialize
82 (1.1%) 16 (0.2%) ActiveRecord::Migrator.migrations
108 (1.5%) 14 (0.2%) ActionView::Helpers::AssetUrlHelper#asset_path
14 (0.2%) 14 (0.2%) ThreadSafe::NonConcurrentCacheBackend#[]
Booting: development
Endpoint: "/"
user system total real
100 requests 34.960000 2.740000 37.700000 ( 37.947141)
Running `stackprof tmp/2016-06-09T00:05:11-04:00-stackprof-cpu-myapp.dump`. Execute `stackprof --help` for more info
==================================
Mode: cpu(1000)
Samples: 9465 (0.00% miss rate)
GC: 1432 (15.13%)
==================================
TOTAL (pct) SAMPLES (pct) FRAME
1261 (13.3%) 1003 (10.6%) Sprockets::PathUtils#entries
702 (7.4%) 702 (7.4%) URI::RFC3986_Parser#split
611 (6.5%) 611 (6.5%) Sprockets::PathUtils#path_extnames
600 (6.3%) 600 (6.3%) Sprockets::PathUtils#stat
669 (7.1%) 399 (4.2%) Sprockets::URITar#initialize
341 (3.6%) 341 (3.6%) URI::RFC2396_Parser#escape
1164 (12.3%) 283 (3.0%) Sprockets::PathUtils#match_path_extname
270 (2.9%) 270 (2.9%) Sprockets::Paths#root
1419 (15.0%) 235 (2.5%) Set#initialize
225 (2.4%) 225 (2.4%) Set#add
187 (2.0%) 187 (2.0%) URI::RFC2396_Parser#unescape
184 (1.9%) 167 (1.8%) Sprockets::DigestUtils#digest
151 (1.6%) 151 (1.6%) #<Module:0x0000000198d748>.load_with_autoloading
453 (4.8%) 148 (1.6%) Set#merge
950 (10.0%) 147 (1.6%) Sprockets::URIUtils#split_file_uri
1163 (12.3%) 140 (1.5%) Sprockets::Resolve#parse_path_extnames
7286 (77.0%) 118 (1.2%) Sprockets::CachedEnvironment#initialize
116 (1.2%) 116 (1.2%) Sprockets::PathUtils#split_subpath
439 (4.6%) 114 (1.2%) Sprockets::URIUtils#join_file_uri
218 (2.3%) 109 (1.2%) ActionView::PathResolver#find_template_paths
153 (1.6%) 104 (1.1%) Sprockets::HTTPUtils#find_q_matches
129 (1.4%) 100 (1.1%) Sprockets::Mime#extname_map
188 (2.0%) 98 (1.0%) Sprockets::URITar#expand
2024 (21.4%) 95 (1.0%) Sprockets::Dependencies#resolve_dependency
89 (0.9%) 89 (0.9%) Sprockets::PathUtils#absolute_path?
719 (7.6%) 59 (0.6%) Sprockets::CachedEnvironment#stat
68 (0.7%) 57 (0.6%) ActiveSupport::FileUpdateChecker#watched
52 (0.5%) 52 (0.5%) Time#compare_with_coercion
195 (2.1%) 44 (0.5%) Sprockets::EncodingUtils#unmarshaled_deflated
39 (0.4%) 39 (0.4%) block (2 levels) in <class:Numeric>
So my app is faster with the older sprockets: (from flame graph info of page load)
sprockets-rails-2.3.3 (431 samples - 57.09%)
sprockets-2.12.4 (417 samples - 55.23%)
sprockets-rails-2.3.3 (610 samples - 63.34%)
sprockets-3.6.0 (601 samples - 62.41%)
It looks like way more sprocket method calls are being done and the app is spending more time in it.
This is a really long thread. I'm working on performance again right now. Did anyone include any slow example apps on this thread?
This is a really long thread. I'm working on performance again right now. Did anyone include any slow example apps on this thread?
You can use https://github.com/redox/perf_regression_sprockets -> I tried to make it pretty straight forward. Let us know how that goes @schneems 👍
I've just run into this issue too on a newly added, asset heavy page on our site.
Here's my current reproduction: https://github.com/nfm/sprockets-3.x-performance-regressions
The meat of it is this super simple rake task:
namespace :sprockets do
task benchmark: :environment do
asset_count = 1500
puts "Creating #{asset_count} empty assets..."
asset_count.times { |i| `touch #{Rails.root}/app/assets/images/#{i}.png` }
puts "Done"
puts "Precompiling assets..."
Rake.application.invoke_task "assets:precompile"
puts "Done"
puts "Running benchmark..."
start = Time.now
count = 1500
count.times { ActionController::Base.helpers.asset_path('1.png') }
puts "#{count} lookups of the same asset took #{(Time.now - start) / 1.second}"
start = Time.now
count = 1500
count.times { |i| ActionController::Base.helpers.asset_path("#{i}.png") }
puts "#{count} lookups of #{count} different assets took #{(Time.now - start) / 1.second}"
end
end
Otherwise it's just a newly initialized Rails 4.2.6 app, with two branches: one for Sprockets 2.x and one for Sprockets 3.x.
For me, on Sprockets 3.x branch:
1500 lookups of the same asset took 6.038463876
1500 lookups of 1500 different assets took 6.059824304
And on Sprockets 2.x branch:
1500 lookups of the same asset took 0.069654396
1500 lookups of 1500 different assets took 0.064596837
It seems to be related to the number of assets that Sprockets is managing - nothing to do with their content, or whether lots of them are looked up or not. I would have expect to see O(1) lookup times for any number of assets in the repo, but for some reason 3.x is two orders of magnitude slower than 2.x.
@redox Not to be too over zealous but I think I already fixed the performance regression in Sprockets 3. When I use Sprockets 3.2.0 with your app I see a page load take about 400ms. (using rack-mini-profiler) When I use Sprockets 3.6.2 it takes about 100ms. Can you try upgrading and let me know how it looks.
@nfm i haven't looked at your app yet, it's already using Sprockets 3.6.2. I'll need to dig in more.
We noticed a large performance improvement upgrading to sprockets-rails
3.x from 2.x. It's not perfect but way better than before. We did also go to sprockets
3.6 from 3.5.2 but I believe it was sprockets-rails
that fixed the issue
@nfm I found one place where it was spending a bunch of time, thanks for the example app. It's not back to 2.x speeds but I was able to get a 400% perf improvement with https://github.com/rails/sprockets/pull/331
There is some logic that was added to 3.x that will allow for multiple assets to be matched. It's expensive in large directories. I don't know why it was added and I don't feel comfortable tearing it out without more context. So after that patch gets green and is merged to master I can backport to 3.x and cut a release. It is still slower than 2.x but at least it's less slow.
Turns out it's perhaps faster, but not by 400% made a mistake while benchmarking.
@schneems thanks for taking a look :)
Interesting thought about things being slow if there are many assets in one directory. I've updated my repo to add another benchmark where each asset is in its own directory. Probably the most realistic use-case is having a few directories with a reasonable number of files (rather than one directory with many files, or many directories with one file), but I think these two benchmarks illustrate the problem.
Here's my latest benchmarks:
Sprockets 2.x:
# ./bin/rake sprockets:benchmark
1500 lookups of the same asset took 0.086173673
1500 lookups of 1500 different assets took 7.336354235
# ./bin/rake sprockets:benchmark_dirs
1500 lookups of the same asset took 0.043988341
1500 lookups of 1500 different assets took 1.415721443
# ./bin/rake sprockets:benchmark PRECOMPILE=1
1500 lookups of the same asset took 0.109657337
1500 lookups of 1500 different assets took 0.10572148
# ./bin/rake sprockets:benchmark_dirs PRECOMPILE=1
1500 lookups of the same asset took 0.094545266
1500 lookups of 1500 different assets took 0.087344999
Sprockets 3.x:
# ./bin/rake sprockets:benchmark
1500 lookups of the same asset took 7.827039138
1500 lookups of 1500 different assets took 6.186335159
# ./bin/rake sprockets:benchmark_dirs
1500 lookups of the same asset took 1.993022409
1500 lookups of 1500 different assets took 0.587485463
# ./bin/rake sprockets:benchmark PRECOMPILE=1
1500 lookups of the same asset took 6.232709039
1500 lookups of 1500 different assets took 6.381696028
# ./bin/rake sprockets:benchmark_dirs PRECOMPILE=1
1500 lookups of the same asset took 0.729864224
1500 lookups of 1500 different assets took 0.631234396
So it does look like the directory structure makes a significant difference without precompiling, which intuitively feels right. But the interesting discrepancy is the difference the directory structure makes after precompiling in sprockets-2.x vs sprockets-3.x. The performance should be the same for both directory structures, and it is in sprockets-2.x, but in sprockets-3.x many files in one directory is much slower. Something weird is going on!
Ok, so it looks like most of the weirdness is me misunderstanding Sprockets!
Precompiling the assets is not enough to have the benchmark behave like Sprockets will in production. The benchmark was running in the development
environment where config.assets.debug
is set to true, and as a result assets were being looked up by Sprockets::Rails::HelperAssetResolvers::Environment
instead of Sprockets::Rails::HelperAssetResolvers::Manifest
- the precompiled assets seem to be ignored.
Our issue in production turned out to be that config.assets.compile
was accidentally set to true
. This doesn't default to false
in production; you do need to explicitly set it to false in config/environments/production.rb
(Rails generates this config for you, but we had inadvertently removed it thinking it was redundant).
Our manifest was being used to look up assets, but some asset paths were being provided without file extensions (eg. image_path('some-image')
instead of image_path('some-image.png')
), which was causing them to not be found in the manifest, and they were falling back to being compiled at request time.
If you're having performance issues, my take-away is to check Rails.application.config.assets
in production, and confirm the following settings:
compile
should be set to false
. This isn't done by default - it has to be specified in your config/environments/production.rb
.debug
should be set to false
. This is done by default, so it will only be true
if you've added it to your config/environments/production.rb
, which is pretty unlikely.digest
should be set to true
. This is done by default, so it will only be false
if you've added it to your config/environments/production.rb
, which is pretty unlikely.resolve_with
should be set to [:manifest]
. If compile
, debug
, and digest
are configured correctly, sprockets-rails
will configure this correctly for you.@mfm yes, keeping compile=true
is a big hit on performance. I added a note to an issue https://github.com/rails/sprockets/issues/327#issuecomment-229354104 that could have prevented your case from happening, if we raised when you're looking up an asset without an extension. I was able to successfully use your test script to make quite a few improvements in master and one on the 3.x branch. Thank you very much for taking the time to write the script and to share it.
On the script: i noticed the manifest behavior of your script with sprockets 2.x and removed it in my local tests (basically if you have a manifest file sprockets 2.x will use that skipping the asset lookup logic which is why it was so fast).Your script did help me find a performance regression and ultimate fix I have a PR #336 that will drastically improve performance.
That being said there is one core difference in behavior between 3.x and 2.x. Sprockets 2.x caches individual asset lookups so when you do the second lookup you hit a cache. Sprockets 3.x does not do this but rather it caches directory entries instead. I don't know why that decision was made. It is true that even with the #336 patch applied that it is still not as fast as 2.x in repeatedly looking up the same asset, however in the second "random" asset lookup test it is actually faster than 2.x. I want to play with the format that we use to store the entries, as a hash lookup could prove to be much faster than looping through all the entries. Until I get the time to do that you should see improved speed in development with #336. When tests are green I plan on merging and cutting a release.
I cut a release 3.6.3 please try it out. Going to close this thread for now. Thank you all for providing feedback and example scripts.
Thanks @schneems, that all makes sense. Nice work with the performance fix! Glad to be able to help out :)
I just reran rake sprockets:benchmark
3 times per branch
with latest sprockets
(see versions in headings below)
1500 lookups of the same asset took 0.238217 1500 lookups of 1500 different assets took 9.4763
1500 lookups of the same asset took 0.254395 1500 lookups of 1500 different assets took 9.344158
1500 lookups of the same asset took 0.25907 1500 lookups of 1500 different assets took 9.132838
1500 lookups of the same asset took 9.655606 1500 lookups of 1500 different assets took 2.507665
1500 lookups of the same asset took 5.900586 1500 lookups of 1500 different assets took 3.093761
1500 lookups of the same asset took 6.441615 1500 lookups of 1500 different assets took 2.9105
I hope "lookups of the same asset" can take less time :(
So here's the deal. Sprockets 3 is MUCH faster at looking up an individual non-cached asset because of the way caching changed between 2 and 3. In Sprockets 2 the individual assets are cached, which is not all that useful unless you're using the same asset a ton. In that case it is much much faster in sprockets 3. However if you need different assets, it is really slow. In Sprockets 3 we cache entries in directories instead of the individual assets. This is faster for multi asset lookups. Basically a non-cached asset lookup on Sprockets 3 is faster than on Sprockets 2.
If we wanted that speed back for the same asset we would have to add another layer of caching back to Sprockets 3 which would actually decrease the performance of looking up a non-duplicate asset call as we're having to add a cache check that will miss frequently. Invalidating the individual asset is also a concern.
I'm not opposed to making the same asset lookup time better, but we can't do it at the expense of a non-duplicate asset lookup.
We have about 300 CSS & JS files in our application. Our html head is contained in a partial. In development mode this partial renders in 60ms with sprockets 2. With sprockets 3 it takes 100ms with debug: false and 700ms with debug: true. With debug I'm referring to asset debugging.
Since you can't use sprockets 2 with Rails 5 it seems our current options are either to use debug: false or move over to the webpacker gem.
Hi guys,
I'm following rails/sprockets-rails#225 and I would like to share with you a performance regression I'm experiencing after moving from sprocket-rails
2.1.3
to3.0
ormaster
.It turns out that the issue is not in
sprocket-rails
but rather in the underlyingsprockets
since most of the time is spent resolving asset paths (cc @eileencodes).Here is what I'm seeing on my personal app (ruby-prof output):
And here is a small project reproducing the issue: redox/perf_regression_sprockets. In that application a page that took 20ms to be rendered using sprockets-rails
2.1.3
now takes 60ms with the latestmaster
branch.Any chance it could be related to the "compat" resolving layer? Let me know if I can help!