ohler55 / oj

Optimized JSON
http://www.ohler.com/oj
MIT License

Memory leak introduced in 3.12.0 #696

Closed by normelton 3 years ago

normelton commented 3 years ago

We've been chasing a memory leak in Oj. It seems that our project runs fine with 3.11.8, but upgrading to 3.12.0 (or anything thereafter) results in a slow memory leak. After about 24 hours, all the memory is used up and our process crashes.

Most frustrating, I can't seem to reproduce this in a test case. Equally frustrating, other projects that use Oj in the same fashion don't have a problem.

I know we have nothing really to go on. Is there a recommended path for diagnosing a memory leak in production?
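
One generic way to approach the question above, independent of Oj, is to log a few memory counters from inside the process at intervals and compare them. The sketch below uses only the Ruby standard library; `memory_snapshot` and `snapshot_delta` are hypothetical helper names, and the RSS reading assumes a Linux `/proc` filesystem (it is `nil` elsewhere).

```ruby
require "objspace"

# Hypothetical helper: collect a snapshot of memory-related counters.
def memory_snapshot
  rss_kb = begin
    # /proc is Linux-only; on other platforms the read fails and we return nil.
    File.read("/proc/self/status")[/^VmRSS:\s+(\d+)\s+kB/, 1]&.to_i
  rescue SystemCallError
    nil
  end

  {
    :time => Time.now,
    :rss_kb => rss_kb,
    :heap_live_slots => GC.stat(:heap_live_slots),
    :string_count => ObjectSpace.count_objects[:T_STRING],
    :ruby_memsize => ObjectSpace.memsize_of_all
  }
end

# Compare two snapshots and return the change in each integer counter.
def snapshot_delta(before, after)
  after.each_with_object({}) do |(key, value), delta|
    next unless value.is_a?(Integer) && before[key].is_a?(Integer)
    delta[key] = value - before[key]
  end
end

before = memory_snapshot
# ... let the application run for a while ...
after = memory_snapshot
puts snapshot_delta(before, after).inspect
```

If the process RSS keeps climbing while the Ruby-level counters stay roughly flat, that hints the growth may be in native allocations (for example inside a C extension) rather than in Ruby objects, though this is a heuristic and not a proof.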

normelton commented 3 years ago

If it helps ...

Oj.default_options = {:mode => :rails }

ohler55 commented 3 years ago

A caching option was introduced in 3.12. It gives a nice speedup when the input hash keys repeat, but if the hash keys continue to change then memory will be used for each new key. Caching can be turned off with the :cache_keys and :cache_strings options.
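
To illustrate why changing keys cost memory, here is a toy model of an unbounded key cache in plain Ruby. It is not Oj's implementation (which is written in C); `KeyCache` is a hypothetical name used only for this sketch.

```ruby
# Toy model of an unbounded key cache. NOT Oj's implementation;
# KeyCache is a hypothetical name for illustration only.
class KeyCache
  def initialize
    @entries = {}
  end

  # Return the cached frozen copy of key, storing it on first sight.
  def intern(key)
    @entries[key] ||= key.dup.freeze
  end

  def size
    @entries.size
  end
end

repeating = KeyCache.new
10_000.times { repeating.intern("id") }          # same key every time

changing = KeyCache.new
10_000.times { |i| changing.intern("id-#{i}") }  # a new key every time

puts repeating.size  # 1
puts changing.size   # 10000
```

With repeating keys the cache holds one entry no matter how much input is parsed; with ever-changing keys it holds one entry per distinct key and never shrinks.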

I am in the process of updating the caching to expire cached keys and strings if they are not used for a few GC cycles. That will most likely solve the issue you are seeing while still being able to get the performance boost from caching.

Let me know if that takes care of the issue. If not I'll look more and if need be create a debug version for you.
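
The expiry idea can be sketched in plain Ruby as follows. This is a toy model under stated assumptions, not the code in Oj: `ExpiringCache`, `max_idle`, and `sweep` are hypothetical names, and an explicit `sweep` call stands in for "one GC cycle".

```ruby
# Toy model of a cache that drops entries left unused for several sweeps.
# ExpiringCache, max_idle and sweep are hypothetical names; Oj's real
# cache is implemented in C and may work differently.
class ExpiringCache
  def initialize(max_idle)
    @max_idle = max_idle
    @generation = 0
    @entries = {} # key => [cached value, generation of last use]
  end

  # Return the cached copy of key and mark it as used in this generation.
  def intern(key)
    entry = (@entries[key] ||= [key.dup.freeze, @generation])
    entry[1] = @generation
    entry[0]
  end

  # Stand-in for one GC cycle: advance the clock and evict idle entries.
  def sweep
    @generation += 1
    @entries.delete_if do |_key, (_value, last_used)|
      @generation - last_used > @max_idle
    end
  end

  def size
    @entries.size
  end
end

cache = ExpiringCache.new(2)
cache.intern("hot")
cache.intern("cold")
4.times do
  cache.intern("hot") # "hot" is touched before every sweep
  cache.sweep         # "cold" is never touched again
end
puts cache.size # 1
```

Keys that keep appearing stay cached, so the speedup is preserved, while keys seen only once are eventually evicted and stop accumulating.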

normelton commented 3 years ago

Ahh thanks, I'm disabling the cache now and will let it run. Should know for sure by tomorrow morning.

Thanks for the quick response!

normelton commented 3 years ago

Okay, I've got:

Oj.default_options = {:mode => :rails, :cache_keys => false, :cache_strings => false }

But I still see memory usage growing. With 3.11, system RAM usage hovers around 1GB; it's currently at 2GB and trending upward.

ohler55 commented 3 years ago

:cache_strings should be a number. Set it to -1 for no caching.

Did the change help some?
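
Since a wrong value type is easy to miss, one way to check what was actually applied is to read the defaults back after setting them. This is a minimal sketch that assumes the oj gem is installed; the option names are the ones given in this thread, and the keys reported by the installed version may be spelled differently.

```ruby
require "oj"

# Option names and values as given in this thread.
Oj.default_options = {:mode => :rails, :cache_keys => false, :cache_strings => -1 }

# Read the effective defaults back and print only the cache-related entries,
# so any option that did not take effect is visible.
effective = Oj.default_options
cache_related = effective.select { |key, _value| key.to_s.start_with?("cache") }
puts cache_related.inspect
```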

ohler55 commented 3 years ago

If you are interested I have a branch that includes the expiring cache mentioned earlier. If you have a place to try it that would be great.

normelton commented 3 years ago

Well ...

Oj.default_options = {:mode => :rails, :cache_keys => false, :cache_strings => -1 }

... and memory is still growing. This is on 3.12.0. I'll try the expire-cache branch and let you know what we find.

Thanks!

ohler55 commented 3 years ago

Thank you. Is the memory growing less or the same? If the same we might be looking in the wrong place.

normelton commented 3 years ago

Seems to be a similar trajectory: https://imgur.com/a/Hh7QdPg. The relatively calm period between 8/20 and 8/23 was when we were testing older versions.

ohler55 commented 3 years ago

Looks like it must be something other than the cache. I'll try to put together a test to recreate. Can you characterize the data? Similar JSON or completely different each time?

ohler55 commented 3 years ago

I had assumed the issue was on parsing or loading but does the app also write or dump to JSON?

ohler55 commented 3 years ago

I created a test that does show a leak and I patched that up in the expire-cache branch. There still seems to be a trickle from somewhere that I am tracking down, but if you want to give the branch a try again it should be better and maybe even fix the leak over longer periods of time.
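
The actual test is not shown in this thread. As a rough idea of what a leak check of this kind can look like, the sketch below runs a workload in batches and records the number of live heap slots after a full GC at the end of each batch. `live_slots_per_batch` is a hypothetical helper, and the stdlib `JSON` module is used only as a stand-in workload; swap in the library under test.

```ruby
require "json"

# Hypothetical harness: run the block in batches and record how many
# live heap slots remain after a full GC at the end of each batch.
def live_slots_per_batch(batches:, iterations:)
  Array.new(batches) do
    iterations.times { |i| yield i }
    GC.start
    GC.stat(:heap_live_slots)
  end
end

# Stand-in workload: similar but not identical documents, dumped and parsed.
samples = live_slots_per_batch(batches: 5, iterations: 2_000) do |i|
  doc = JSON.generate({ "id" => i, "name" => "device-#{i}", "tags" => %w[a b] })
  JSON.parse(doc)
end

puts samples.inspect
```

A series that keeps rising from batch to batch suggests retained Ruby objects. This counter does not see memory allocated natively by a C extension, so process RSS should be watched alongside it.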

ohler55 commented 3 years ago

On Linux I see no leak at all on the test.

normelton commented 3 years ago

Sorry for the radio silence ... I’m trying that branch now.

Similar, but not identical JSON. Both parsing and dumping. What’s even stranger, I have identical code in other scripts, without problems. All the scripts use the same base library, which includes all the JSON handling.

Regardless, double kudos for the help, will let you know in the morning!

ohler55 commented 3 years ago

Thanks. Glad to help. Solving the issue will help others as well.

normelton commented 3 years ago

Well ... I created an empty Gemfile with:

gem "oj", git: "git://github.com/ohler55/oj", :branch => "expire-cache"

On my Mac laptop, the gem installs, loads, and seems to run fine. On my Linux server where I'm testing this, when I require the gem, I get:

LoadError: /opt/health-device-monitor/vendor/bundle/ruby/2.7.0/bundler/gems/oj-00400a194430/lib/oj/oj.so: undefined symbol: oj_set_parser_debug - /opt/health-device-monitor/vendor/bundle/ruby/2.7.0/bundler/gems/oj-00400a194430/lib/oj/oj.so

ohler55 commented 3 years ago

Added back the file. Sorry.

normelton commented 3 years ago

We're definitely on the right track:

https://imgur.com/a/8Wf4qBH

There's a very slight upward trend, but let's see how that levels off over the weekend?

Thanks!

ohler55 commented 3 years ago

That is looking good. Good to see.

normelton commented 3 years ago

Good news, memory usage has remained stable over the weekend.

https://imgur.com/a/VgzpAKP

ohler55 commented 3 years ago

Excellent. I'll merge into develop. Maybe do a release. I have to double check a few things first though.

ohler55 commented 3 years ago

Released!

ohler55 commented 3 years ago

Can this be closed?

normelton commented 3 years ago

Yep, thanks again for the help!
