ManageIQ / more_core_extensions

MoreCoreExtensions are a set of core extensions beyond those provided by ActiveSupport.
MIT License
5 stars 23 forks source link

Reduce memory usage of `.store_path` #54

Closed NickLaMuro closed 7 years ago

NickLaMuro commented 7 years ago

Background

By using recursion to loop through the keys for store path, we are creating new objects that are not needed (for the end result of the method) every time with go further down into the stack, because we are required to do a args[1..-1].push for each recursive call, which will create a new array from the subset of the args, and pass it into the next call of .store_path.

By not using recursion, we can completely remove any object allocations that aren't necessary (besides the ones that are needed for the method's spec), and still maintain the same functionality. The added benefit is that this also now not prone to stack level too deep errors because it no longer requires recursion to function.

Benchmarks

Using the changes from https://github.com/ManageIQ/manageiq/pull/15757, and some of the data found after the introduction of @chrisarcand 's PR, https://github.com/ManageIQ/manageiq/pull/15710 , which uses store_path as part of changes that weren't present prior, we can significantly reduce the number of objects allocated in that introduced because of that change (and probably reduce other cases where .store_path is used in the application as well:

before more_core_extensions patch                    after more_core_extensions patch

Total allocated: 68871116 bytes (854336 objects)   | Total allocated: 64535996 bytes (745958 objects)
Total retained:  5688721 bytes (48886 objects)     | Total retained:  5688721 bytes (48886 objects)
                                                   | 
allocated memory by gem                            | allocated memory by gem
-----------------------------------                | -----------------------------------
  17478431  activerecord-5.0.5                     |   17478431  activerecord-5.0.5
  16545419  activesupport-5.0.5                    |   16639019  activesupport-5.0.5
   8340370  more_core_extensions-3.3.0  <<<<<<     |    7629944  ruby-2.3.3/lib
   7629944  ruby-2.3.3/lib                         |    5779348  manageiq-providers-vmware-b45ffbbaefb3
   5779348  manageiq-providers-vmware-b45ffbbaefb3 |    4159069  manageiq/app
   4159069  manageiq/app                           |    3911650  more_core_extensions/lib  <<<<<<
   2428360  activemodel-5.0.5                      |    2428360  activemodel-5.0.5
   2221043  manageiq/lib                           |    2221043  manageiq/lib
   1894087  pending                                |    1894087  pending
   1557828  arel-7.1.4                             |    1557828  arel-7.1.4
    447304  american_date-1.1.1                    |     447304  american_date-1.1.1
    282320  other                                  |     282320  other
     73241  fast_gettext-1.2.0                     |      73241  fast_gettext-1.2.0
     34040  tzinfo-1.2.3                           |      34040  tzinfo-1.2.3
       272  default_value_for-3.0.2                |        272  default_value_for-3.0.2
        40  config-1.3.0                           |         40  config-1.3.0

.fetch_path has a similar issue (I might address that in another PR), which is why there is still a decent amount still being used by it, but this significantly reduces what was being allocated by the gem overall (in this benchmark).

Links

Fryguy commented 7 years ago

Conceptually, this looks good to me. One thing that concerns me is that I don't think we have any specs for when Hashes and Arrays are intermingled, for example, you can do this:

h = {:a => [{:a1 => nil}, {:a2 => nil}]}
h.store_path(:a, 0, :a1, "blah")
# => {:a=>[{:a1=>"blah"}, {:a2=>nil}]}

Can you verify if we have specs covering that and if not, can you add them?

Fryguy commented 7 years ago

cc @bdunne

NickLaMuro commented 7 years ago

Can you verify if we have specs covering that and if not, can you add them?

Pretty sure we do, as I was using the existing specs to drive these changes, and it definitely took me a bit to get there, but I will confirm since I am only basing it on what I remember while working on this until midnight last night 😅

NickLaMuro commented 7 years ago

@Fryguy Looks like you are correct in your assumption. I will add the test as I agree it is very much needed.

That said, curious if you think this should live in a separate test file just for shared/nested.rb, as it doesn't really fall under either a test for Array or Hash. Thoughts?

Fryguy commented 7 years ago

I was actually surprised to find that the existing tests were not already in a spec/shared/nested_spec.rb. Perhaps they should all really live there, but I'm not sure. I'd also be fine with just adding the "cross concerns" specs to a spec/shared/nested_spec.rb

NickLaMuro commented 7 years ago

@Fryguy Thanks for the suggestion, currently working on the specs for this now.


That said, ran into something that probably is an edge case that hasn't been considered, and isn't really tested yet, and I am curious of how we want to handle it moving forward.

Part of .store_paths's functionality is that it will assume the current class is the object we want to use, and if that type of object for the intermediate key isn't present, or more specifically, doesn't respond to :store_path, then it will create a new instance of that object based on the self.class.new. So the following would happen in master:

irb> [{"a" => [1,2,3]}].store_path(0, "b", 0, 4)
#=> [{"a" => [1,2,3], "b" => {0 => 4}]

Because it is implemented with recursion, so the class will change as it goes down the stack. But with my current implementation, self.class.new is still used, but it will always retain the top level object, so the result would change:

irb> [{"a" => [1,2,3]}].store_path(0, "b", 0, 4)
#=> [{"a" => [1,2,3], "b" => [4]}]

I can change this behavior in this PR to match what currently in master, but those aren't tested, and I am honestly not sure which is the true/correct intent.

Fryguy commented 7 years ago

That's so funny...I was just about to post here about the class.new call, and that it doesn't use the type of the object at that nesting.

Fryguy commented 7 years ago

I was aware that it used the type at the nesting. If I recall, part of the reason is we had things like Arrays of Hashes, and HWIA stored inside of regular Hashes.

NickLaMuro commented 7 years ago

I can change this behavior in this PR to match what currently in master

Not sure if this was a suggestion to do this, but regardless, I am going to make the call say we should stick with matching the implementation that currently exists, and reason about whether or not that implementation is a bug in a separate issue/PR.

Should have been what I suggested in the first place since the intent of this PR is strictly improving performance, so sorry for the extra noise by trying to reason about that here.

NickLaMuro commented 7 years ago

Specs added in https://github.com/ManageIQ/more_core_extensions/pull/56

miq-bot commented 7 years ago

Checked commit https://github.com/NickLaMuro/more_core_extensions/commit/8cc6a2a36fc6f2c4f97ebd13740becf9a1749699 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0 1 file checked, 0 offenses detected Everything looks fine. :star:

NickLaMuro commented 7 years ago

🎉