markevans / dragonfly-s3_data_store

S3 data store for the Dragonfly ruby gem
MIT License

Use fog-aws instead of fog proper. #16

Closed gaffneyc closed 9 years ago

gaffneyc commented 9 years ago

Fog is in the process of splitting all of its providers out into separate gems. Since dragonfly-s3_data_store targets only AWS, we can use the fog-aws gem directly. This has the benefit of not requiring applications to pull in all of fog (which is something like 20 gems at the moment).

This should help reduce the memory overhead required for loading fog.
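As a rough illustration of the dependency change described above, a gemspec could declare fog-aws rather than the full fog gem. This is a minimal sketch, not the gemspec from this PR: the gem name, version, summary and author below are hypothetical placeholders.

```ruby
require "rubygems"

# Hypothetical gemspec fragment: every field except the dependency name
# is a placeholder invented for this sketch.
spec = Gem::Specification.new do |s|
  s.name    = "example-s3-store"
  s.version = "0.0.1"
  s.summary = "Placeholder summary"
  s.authors = ["example"]

  # Depend on the AWS provider gem only, instead of the full fog gem.
  s.add_runtime_dependency "fog-aws"
end

# Names of the declared dependencies.
dependency_names = spec.dependencies.map(&:name)
puts dependency_names.inspect
```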

banyan commented 9 years ago

:+1:

fcheung commented 9 years ago

Sounds good to me, although it would be a good idea to warn about this in the changelog. If users are not aware of this change and already have an older version of fog in their Gemfile (either directly or as a dependency of something else), they'd end up with the AWS part of fog loaded twice, which could conceivably do odd things.
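One way an application could notice the situation described above is to check whether the full fog gem is activated next to fog-aws. The helper below is a hypothetical sketch (`full_fog_version` is not part of Dragonfly or fog); it only reports the activated version and leaves the decision about whether that version is old enough to clash to the reader.

```ruby
require "rubygems"

# Hypothetical helper: returns the version string of the full "fog" gem
# if it is activated in this process, or nil if it is not.
# Gem.loaded_specs maps gem names to their activated specifications.
def full_fog_version(loaded_specs = Gem.loaded_specs)
  spec = loaded_specs["fog"]
  spec && spec.version.to_s
end

version = full_fog_version
if version
  warn "fog #{version} is loaded alongside fog-aws; check that it does not bundle its own AWS code"
end
```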

speedmax commented 9 years ago

+1

We can either require fog/aws, or explicitly require only the service we need with `require "fog/aws/storage"`. This has been possible since this PR: https://github.com/fog/fog/pull/1712

This is the benchmark @cainlevy has shared.

require fog:
2.1869

require fog/core:
0.1823

require fog/aws:
0.6524

require fog/aws/storage:
0.2302
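Figures like the ones above can be gathered by timing each `require` call. The sketch below is an assumption about how such a measurement might be taken, not the script that produced these numbers; `time_require` is a hypothetical helper. It returns nil when a library is not installed.

```ruby
# Hypothetical helper: seconds taken by `require feature`,
# or nil if the feature cannot be found.
def time_require(feature)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  require feature
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
rescue LoadError
  nil
end

%w[fog fog/core fog/aws fog/aws/storage].each do |feature|
  elapsed = time_require(feature)
  if elapsed
    puts format("require %s:\n%.4f", feature, elapsed)
  else
    puts "require #{feature}: not installed"
  end
end
```

Within a single process, later requires look cheaper because earlier ones have already loaded shared files, so each measurement is best taken in a fresh Ruby process.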

Oh my poor memory!

dragonfly/s3_data_store: 19.3555 mb
  fog: 19.2852 mb
    fog/joyent: 6.6797 mb
      fog/joyent/compute: 6.6289 mb
        net/ssh: 6.2969 mb
          net/ssh/transport/session: 3.3906 mb
            net/ssh/transport/algorithms: 2.3555 mb
              net/ssh/transport/kex: 0.7031 mb
              net/ssh/transport/hmac: 0.5586 mb
              net/ssh/buffer: 0.3945 mb
              net/ssh/transport/cipher_factory: 0.3047 mb
            net/ssh/transport/packet_stream: 0.5859 mb
          net/ssh/authentication/session: 1.4102 mb
            net/ssh/authentication/key_manager: 0.6367 mb
          net/ssh/connection/session: 1.0547 mb
            net/ssh/connection/channel: 0.3203 mb
    fog/rackspace: 2.1914 mb
      fog/rackspace/auto_scale: 0.4297 mb
    fog/hp: 1.9258 mb
      fog/hp/block_storage: 0.5195 mb
        fog/hp/core: 0.3711 mb
      fog/hp/storage: 0.3359 mb
    fog/google: 1.1602 mb
      fog/google/compute: 0.7656 mb
    fog/openstack: 1.1016 mb
      fog/openstack/compute: 0.332 mb
    fog/softlayer: 0.7422 mb
    fog/internet_archive: 0.6328 mb
      fog/internet_archive/storage: 0.5898 mb
    fog/ecloud: 0.5977 mb
      fog/ecloud/compute: 0.5977 mb
    fog/cloudstack: 0.3984 mb
      fog/cloudstack/compute: 0.3984 mb
    fog/ibm: 0.3828 mb
    fog/libvirt: 0.3594 mb
      fog/libvirt/compute: 0.3516 mb
    fog/xenserver: 0.3125 mb
      fog/xenserver/compute: 0.3125 mb
fog/aws: 10.7031 mb
  fog/aws/core: 8.4258 mb
    fog/xml: 4.8281 mb
      nokogiri: 4.7969 mb
        nokogiri/xml: 2.3711 mb
          nokogiri/xml/node: 0.4766 mb
        nokogiri/html: 1.0313 mb
          nokogiri/html/element_description_defaults: 0.6328 mb
        nokogiri/css: 0.8047 mb
          nokogiri/css/parser: 0.4219 mb
    fog/core: 3.2188 mb
      fog/core/collection: 0.4648 mb
  fog/aws/compute: 0.3477 mb
  fog/aws/elb: 0.3008 mb

MSchmidt commented 9 years ago

Why not use aws-sdk directly then? Any memory benchmark for this?

speedmax commented 9 years ago

Memory consumption dropped significantly after switching to a branch from collectiveidea. See https://github.com/markevans/dragonfly-s3_data_store/pull/16

Here is the Gemfile entry:

gem 'dragonfly-s3_data_store', require: nil, github: "collectiveidea/dragonfly-s3_data_store", branch: "use-fog-aws"

Memory benchmark on require, using derailed_benchmarks:

dragonfly/s3_data_store: 1.1953 mb
  fog/aws: 1.1953 mb
    fog/aws/core: 1.1953 mb
      fog/xml: 1.1719 mb
        nokogiri: 1.168 mb
          nokogiri/xml: 0.6797 mb

MSchmidt commented 9 years ago

I've created a fork which requires fog/aws/storage directly, as you suggested. It does not improve the memory footprint in my benchmarks, I presume because in production only the classes that are actually used get required. Regardless, I like to require only the smallest necessary file, and this seems to be the one.

https://github.com/MSchmidt/dragonfly-s3_data_store/tree/use-fog-aws (fork also includes support for eu-central-1 AWS region)

speedmax commented 9 years ago

@MSchmidt You should check out this post on benchmarking memory consumption after Rails boot: http://www.schneems.com/2014/11/07/i-ram-what-i-ram.html

It might be hard to notice in production when you have a lot of gems loaded or multiple web workers.

The derailed_benchmarks and bumbler gems, plus this one-liner, helped me track down slow requires and excessive memory usage from gems on load.

time bin/rails r 'p $LOADED_FEATURES.length'
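The same `$LOADED_FEATURES` idea can be narrowed to a single library by comparing the list before and after one `require`. This is a small sketch; `features_added_by` is a hypothetical helper, and `json` merely stands in for whichever library is being inspected.

```ruby
# Hypothetical helper: returns the files that a single `require`
# adds to $LOADED_FEATURES.
def features_added_by(feature)
  before = $LOADED_FEATURES.dup
  require feature
  $LOADED_FEATURES - before
end

added = features_added_by("json")
puts "json pulled in #{added.length} files"

# A second call reports nothing new, because the files are already loaded.
added_again = features_added_by("json")
puts "second require pulled in #{added_again.length} files"
```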

I look forward to seeing how much impact switching to fog/aws has on your Rails boot.

markevans commented 9 years ago

Thanks for this. It's merged in, and I've stuck with require 'fog/aws' rather than require 'fog/aws/storage', as the latter seemed to raise errors and I'm not sure how fully supported it will be as fog evolves. As for using the aws-sdk gem directly, I'd be up for this if it was better in any way, though I don't have time to do it myself. If anyone wanted to do a PR for that and it was demonstrably better (quicker, more robust, etc.), then I'd definitely consider pulling that in. Thanks!
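For anyone who still wants to try the narrower require while keeping the broader one as a safety net, a fallback along these lines is one option. This is a sketch and not what was merged: `require_first_available` is a hypothetical helper, and it only covers the case where the narrower file cannot be found (LoadError), not other errors raised while loading or using it.

```ruby
# Hypothetical helper: requires the first feature that can be found
# and returns its name, or nil if none of them are available.
def require_first_available(*features)
  features.each do |feature|
    begin
      require feature
      return feature
    rescue LoadError
      next
    end
  end
  nil
end

loaded = require_first_available("fog/aws/storage", "fog/aws")
puts(loaded ? "loaded #{loaded}" : "fog-aws is not installed")
```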

nashbridges commented 9 years ago

:+1: