voxpupuli / json-schema

Ruby JSON Schema Validator
MIT License
1.52k stars 242 forks source link

JSON::Util::URI.parse memory leak #329

Closed skippy closed 7 years ago

skippy commented 8 years ago

hey folks,

I benchmarked my app, which does JSON Schema validation in a tight loop (10ks of validations per minute per thread, 10m of validations per run), and I noticed the following:

Using the (damn wonderful) stackprof gem, almost all the Addressable URI objects are created by this helper: https://github.com/ruby-json-schema/json-schema/blob/master/lib/json-schema/util/uri.rb#L26. 500m+ Addressable URI objects are created per run, as my json-schemas use $refs quite a bit.

But what is odd is none of these objects are being GC'ed. At a cursory glance of the code, it seems that they should be as #dup is used. However, if I remove the class-level cache and the use of #dup, like so:

module JSON
  module Util
    module URI

      def self.parse(uri)
        # do NOT parse a Addressable::URI; behind the scenes it creates a dup
        # which leaves a reference to the parent object (as it is not duping the 
        # string objects), which causes a memory explosion
        # 
        return uri if uri.is_a?(Addressable::URI)
        Addressable::URI.parse(uri)
      rescue Addressable::URI::InvalidURIError => e
        raise JSON::Schema::UriError.new(e.message)
      end

    end
  end
end

the memory leak is gone. My process went from growing to ~ 1.5GB per run to staying at a consistent ~ 250MB.

I suspect the issue is the @parse_cache combined with the dup. The #dup creates a new object, and within Addressable::URI it calls #dup on the strings. However my understanding is that this is a copy-on-write process (see http://www.justskins.com/forums/memory-behavior-of-string-8195.html), which means that a reference will be kept around to the shared parent object until it is modified. All it would take is one string reference from the parent object (which is never deleted as it is stored within @parse_cache) to stick around to prevent any of the dupped object from being GC'ed. I haven't dived into this a ton or into the C code (outside of benchmarking and patching) but this is what I suspect is going on.

The above patch is marginally slower as it has to parse the strings with regexs (which is slow, plus the regexs used by Addressable::URI are not class variables but created every time, which is slower and less efficient). BUT I think the slight speed hit (which I only benchmarked within my app, so while it wasn't in isolation, within my app it caused less than a 0.01% decrease in speed) is more than acceptable in exchange for plugging a memory leak.

Thoughts?

RST-J commented 8 years ago

Nice work, thanks a lot! I agree with your comment wrt the performance vs. memory tradeoff. @iainbeeston Did you also work on that caching stuff? I know some contributer started it and we changed something in the meantime. What I want to say is, it would be good if someone who knows more about that code would create a PR (or if you'd be up to, you can also open a PR @skippy).

iainbeeston commented 8 years ago

Yes, this is an issue. For performance reasons we cache the result of parsing a string to an addressable URI. The problem is, we never empty out that cache, and it's global, so it never gets GC'd (this is separate to the schema cache, which can be cleared).

Ideally what I'd like to do is centralise all the caches, and make sure that when you set the clear_cache: true all of the caches (including the URI parsing cache) get cleared.

iainbeeston commented 8 years ago

Actually, scrap that, what I'd really like to do is remove caching entirely(!!!) but that's a pipe dream right now

skippy commented 8 years ago

@iainbeeston In my case the cache was just a couple of dozen keys; the problem seemed to be the use of Addressable::URI#dup, which caused each of those keys to hold onto millions of objects. While clearing out the cache after every parse would be nice, it looks like caching here is of minimal benefit from a CPU perspective.

iainbeeston commented 7 years ago

I've made a potential fix for this in #361 - it's not a long-term solution but it solves the immediate issue.

iainbeeston commented 7 years ago

I'm closing this now #361 has been merged. Should be available in the next release

skippy commented 7 years ago

thanks!