AssetSync / asset_sync

Synchronises Assets between Rails and S3

Gzip files not being used from cloudfront #153

Closed radanskoric closed 11 years ago

radanskoric commented 11 years ago

Using the asset_sync gem with Cloudfront the way it is described in the README means that the gzipped versions of the assets will never be used. Rails generates normal asset URLs (ending in .js, .css, ...), the browser requests them with Accept-Encoding: gzip, ... and Cloudfront just serves the uncompressed version of the file.

It can be seen when you monitor the traffic on your machine and it is explained in documentation here: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/ServingCompressedFiles.html
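One way to check this without a traffic monitor is to request an asset with an explicit Accept-Encoding header and look at the Content-Encoding of the response. The following is only a sketch: the helper names are made up for illustration, and the URL in the usage note is a placeholder.

```ruby
require "net/http"
require "uri"

# Build a GET request that explicitly asks for gzip. Setting the header by
# hand also stops Net::HTTP from transparently decompressing the body, so the
# Content-Encoding response header stays visible.
def gzip_probe_request(url)
  request = Net::HTTP::Get.new(URI(url))
  request["Accept-Encoding"] = "gzip"
  request
end

# True when the response (anything that answers to #[]) reports a gzip body.
def served_gzipped?(response)
  response["Content-Encoding"].to_s.downcase.include?("gzip")
end

# Perform the request against the CDN (or directly against the origin).
def fetch_with_gzip(url)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(gzip_probe_request(url))
  end
end
```

Calling `served_gzipped?(fetch_with_gzip("https://example-distribution.example/assets/application.css"))` with your own asset URL should return true only if the compressed file is actually being served.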

For it to use the gzipped versions, the Rails app needs to be modified to generate asset URLs with the .gz suffix when the client supports gzipped assets, and asset_sync also needs to set the Content-Type in the S3 metadata of the gzipped version to match the original file type (application/javascript, text/css, etc.) instead of the gzip file type.

Am I missing something? Can somebody confirm that they have gzipped assets being served from Cloudfront with the app set up as described in the asset_sync README?

It seems to me this is a big deal since in most cases downloading a gzipped version from origin will be faster than downloading ungzipped version from cdn edge location.

radanskoric commented 11 years ago

I have definitely confirmed that the gzipped files are not being served.

Using gzip_compression = true would probably help; however, it is not ideal, since you are then assuming that all clients will accept gzip encoding.

The solution does exist. I'm now using cloudfront distribution with custom origin and taking advantage of the behavior described here: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/ServingCompressedFiles.html

Basically you first make sure your rails app is correctly serving gzipped assets. I used code from this gist: https://gist.github.com/guyboltonking/2152663

Then you just point the cloudfront distribution to your app and use it as asset host.

That solution doesn't require asset_sync to work. It will not cause problems if you want to use it for other reasons, but it is not necessary.

nateberkopec commented 11 years ago

I'm having a similar problem, but I think your solution is the wrong way around.

So, I have gzip_compression = true set. However, it looks like asset_sync is still uploading both versions of my files to S3 - let's say, styles.css and styles.css.gz. A client request for styles.css to Cloudfront or S3 with Accept-Encoding: gzip causes Cloudfront to send back the non-gzipped styles.css, rather than the gzipped styles.css.gz. I think this is actually the correct behavior.

Why isn't asset_sync automatically overwriting styles.css with styles.css.gz? It also seems from the README that this is what the default behavior is.

If it matters, I'm using turbo-sprockets.

nateberkopec commented 11 years ago

A-ha! I've figured it out! (at least for me).

So, Cloudfront just forwards the Accept-Encoding header on to the origin. So, I checked to see what happened when I requested styles.css from S3... and the uncompressed version was returned! A-ha! So S3 doesn't do what some other CDNs do, which is to automatically serve the gzipped version even though the .css version was requested. Asset_sync's default behavior (to circumvent this, I suppose) is to rename the .css.gz version to .css, solving the problem.

However, my asset_sync install definitely wasn't doing that - it was uploading both versions, every time. What the hell? Then I noticed a funny line in my output:

AssetSync: using default configuration from built-in initializer

https://github.com/rumblelabs/asset_sync/blob/master/lib/asset_sync/engine.rb#L14

Oh! AssetSync wasn't recognizing my config anymore! I had moved the file from the standard location ('config/initializers/asset_sync.rb') into a subfolder, and asset_sync didn't detect it and was ignoring my gzip_compression config flag. Setting ASSET_SYNC_GZIP_COMPRESSION in my environment worked like a charm, and everything's working properly.
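For reference, a minimal initializer at the standard location could look like the sketch below. It assumes the rest of the configuration (provider, bucket, credentials) is supplied elsewhere, and reading the flag from ASSET_SYNC_GZIP_COMPRESSION by hand is only an illustration, not something the gem requires.

```ruby
# config/initializers/asset_sync.rb
# Guarded so that the file is a no-op when the gem is not loaded.
if defined?(AssetSync)
  AssetSync.configure do |config|
    # Replace each asset with its gzipped equivalent on upload.
    # Assumption: the flag is toggled through the environment variable here.
    config.gzip_compression = ENV["ASSET_SYNC_GZIP_COMPRESSION"] == "true"
  end
end
```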

So, to answer OP:

Also, is there any way to do a more intelligent subfolder search for the asset_sync config? I can't be the only guy who organizes our initializers into subfolders.

radanskoric commented 11 years ago

Yes, the problem lies on S3, S3 does nothing regarding the Accept-Encoding headers. Notice that in my comments I mentioned that setting gzip_compression to true would solve the problem. However, you are then making the assumption that all of your clients will be connecting with Accept-Encoding: gzip headers. This might be true, but it is up to you to decide if that tradeoff is acceptable. I decided it isn't.

That is why I decided to point Cloudfront to Heroku as its origin. Now, Rails on Heroku out of the box does not support Accept-Encoding: gzip because there is no nginx or Apache in front of the Rails server. The gist I linked to in my solution fixes that.

What you might be thinking of when you say that my solution is backwards is that a file named something.css should be served and not something.css.gz. That is indeed correct: the name of the file served should be the name requested, and its Content-Type header should be what was requested. However, the actual content and the Content-Encoding header may be gzip if that was requested by the client. If you look at the gist code containing the middleware that adds the desired behaviour, you will see that is exactly what it is doing: inspecting request headers and returning the .gz file but under its plain name.

I have asset serving working correctly with the approach I described, but my concern is that if someone follows the instructions outlined in the asset_sync README to the letter and everything works as expected, they will end up with a setup where gzipped assets are not being used, and that is a little tricky to notice if you are not explicitly looking for it.
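To make that behaviour concrete, here is a rough sketch of such a middleware. It is not the code from the gist: the class name, constructor arguments and MIME table are invented for illustration, and the Accept-Encoding check is deliberately simplified (it ignores q-values).

```ruby
# Hypothetical Rack-style middleware: serves foo.css.gz under the name
# foo.css when the client accepts gzip, and the plain file otherwise.
class GzipStaticSketch
  MIME_TYPES = {
    ".css" => "text/css",
    ".js"  => "application/javascript"
  }.freeze

  def initialize(app, root, prefix = "/assets")
    @app = app
    @root = File.expand_path(root)
    @prefix = prefix
  end

  def call(env)
    path = env["PATH_INFO"].to_s
    type = MIME_TYPES[File.extname(path)]
    # Anything outside the asset prefix or of an unknown type goes downstream.
    unless type && path.start_with?(@prefix + "/") && !path.include?("..")
      return @app.call(env)
    end

    plain = File.join(@root, path)
    gzipped = plain + ".gz"
    # Content-Type always describes the requested file, never "gzip".
    headers = { "Content-Type" => type, "Vary" => "Accept-Encoding" }

    if accepts_gzip?(env) && File.file?(gzipped)
      body = File.binread(gzipped)
      headers["Content-Encoding"] = "gzip"
    elsif File.file?(plain)
      body = File.binread(plain)
    else
      return @app.call(env)
    end

    headers["Content-Length"] = body.bytesize.to_s
    [200, headers, [body]]
  end

  private

  # Simplification: treats any mention of gzip as acceptance.
  def accepts_gzip?(env)
    env["HTTP_ACCEPT_ENCODING"].to_s.match?(/\bgzip\b/i)
  end
end
```

A client that sends no Accept-Encoding header gets the plain file, which is the trade-off discussed above.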

nateberkopec commented 11 years ago

@radanskoric http://stackoverflow.com/questions/575290/which-browsers-claim-to-support-http-compression-but-are-actually-flaky What browsers are you supporting that don't support gzip? That's gotta suck, even IE6 SP1 supports it.

In any case, I don't see an issue with asset_sync's behavior in this case, unless you have a PR to change the docs.

radanskoric commented 11 years ago

I'm not supporting browsers that don't support gzip. I should be fine, but a lot of users will be behind campus proxies and firewalls and I'm just not sure how they might handle the http headers and if they might change the Accept-Encoding header, so I'd like to be on the safe side and honor it by returning the plain version if gzip is not requested.

No, there's no issue with asset_sync, it works as advertised, it's just that S3 is not the best choice for Cloudfront origin server in this case. I can set up a PR to change the docs, but it'll have to say something along the lines of: "Without gzip_compression flag enabled you will be always serving the uncompressed version of the asset, but if all you want to do is serve assets from cloudfront, it's easier not to use this gem, but do it like this ...."

That is why I opened this issue, to first check if I'm missing something.

brunogh commented 11 years ago

I am trying to do the solution described by @radanskoric with this gem https://github.com/romanbsd/heroku-deflater (uses that Gist), but something weird is happening.

When the cache is enabled (through config.cache_store = :dalli_store), it seems that Heroku always responds without gzip, and then the CDN uses that. If I remove that cache, it works; however, I can't do that because it is used by the app.

I thought that setting "config.serve_static_assets" to false might work, but read that Heroku overrides that config to true.

Did you face any problem like that?

radanskoric commented 11 years ago

@brunogh Yes, I had the exact same problem. Basically you need to make sure that the middleware serving the assets is before the cache in the middleware stack. That way the Rack::Cache middleware will never get hit for static assets.

Since I'm setting up the middleware manually, this is what I'm doing in my environment/production.rb:

  # Serve pre-gzipped static assets
  config.middleware.insert_before(
    "Rack::Cache", Middleware::CompressedStaticAssets,
    paths["public"].first, config.assets.prefix, { 'Cache-Control' => "public, max-age=31536000" })

The middleware is made from the Gist I mentioned in OP. Hope that helps.

resdigitais commented 11 years ago

Awesome! Thanks @radanskoric!

In the case I am using heroku-deflater, which has this init (https://github.com/romanbsd/heroku-deflater/blob/master/lib/heroku-deflater/railtie.rb), should I put "Rack::Cache" after ActionDispatch::Static? That HerokuDeflater::ServeZippedAssets is the same as your Middleware::CompressedStaticAssets.

radanskoric commented 11 years ago

@resdigitais Yes, I think that should be safe to do. If you have a CDN set up for asset files, none of the middleware listed in that heroku-deflater initializer needs Rack::Cache, and all of it actually needs to be in front of it.

Just to be on the safe side, I suggest you manually inspect the listing of your middleware stack for the production environment to make sure none of the middleware concerning dynamic requests has ended up in front of Rack::Cache.

brunogh commented 11 years ago

Cheers! I gave up on the gem, actually because I was having problems with the Rack version, and did the same as you described:

  use Middleware::CompressedStaticAssets
  use Rack::Cache
  use ActionDispatch::Static

davidjrice commented 11 years ago

Closing, because this is a complete misunderstanding of the gzip setting in asset sync. Asset sync allows for this.

Because S3 cannot vary its response based on the Accept-Encoding header, we need to replace the .css file with the contents of the .css.gz file, which makes this absolutely transparent to the Rails application. Just because a file extension is .css does not mean it is not gzipped. Check your headers.

tzoro commented 10 years ago

What is currently the best way to serve gzipped versions of assets from S3 with Rails 4 on Heroku? The default installation and usage does not serve gzip files from the S3 bucket. I tried heroku-deflater but it doesn't work as expected. Thanks!

PikachuEXE commented 10 years ago

@tzoro I guess heroku-deflater is useful if you serve assets from your app, but it has no effect if you use S3. From my understanding, asset_sync (when gzip_compression is true) detects whether there is a file with the same name (but with a .gz suffix) and uploads that file to the original path while setting Content-Encoding to gzip, so that the browser decompresses it before using it.
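As a rough illustration of that selection logic (this is not asset_sync's actual implementation; the helper name and the shape of the returned hash are made up):

```ruby
# Hypothetical helper: maps each destination path to the local file that
# should be uploaded there and the Content-Encoding to set on it.
def upload_plan(local_files)
  available = local_files.to_a
  plan = {}

  available.each do |path|
    # A .gz file whose plain counterpart exists is not uploaded separately.
    next if path.end_with?(".gz") && available.include?(path.chomp(".gz"))

    gz = path + ".gz"
    if available.include?(gz)
      # Upload the compressed bytes under the plain name.
      plan[path] = { source: gz, content_encoding: "gzip" }
    else
      plan[path] = { source: path, content_encoding: nil }
    end
  end

  plan
end
```

With this plan, a bucket never holds both styles.css and styles.css.gz: only styles.css exists, and its bytes are the compressed ones.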

You can see readme for config

Edit: Crap I press Enter before I finished

betocols commented 9 years ago

Hello @radanskoric

I've been trying to implement your solution for serving gzipped files via Cloudfront. Nonetheless, I keep getting this error again and again: "NameError: uninitialized constant Middleware" when trying to push to the Heroku repository.

I've put the file "compressed_static_assets.rb" in config/initializers folder and in the config/middleware folder, but it's always the same result.

Sorry for bringing this thread up after it's been closed for such a long time.

radanskoric commented 9 years ago

@betocols I don't think config is on the look up path for autoloading. It would probably work if you put it into lib/middleware, but in order to be completely sure I suggest you explicitly require the file at the top of your production.rb.