bolerio / mjson

Lean JSON Library for Java, with a compact, elegant API.
Apache License 2.0
83 stars 24 forks source link

Memory leak in Json$NullJson #27

Open Macok opened 7 years ago

Macok commented 7 years ago

You are creating a topnull object in Json$NullJson class. static NullJson topnull = new NullJson(); This instance is static and therefore, will never be garbage collected. Every time some json data containing null fields is processed, the enclosing field of this object is growing.

You can reproduce the problem running this code:

package mjson;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.stream.IntStream;

public class Main {
    public static void main(String[] args) throws FileNotFoundException, InterruptedException {
        String content = new Scanner(new File("resources/file.json")).useDelimiter("\\Z").next();
        Scanner in = new Scanner(System.in);

        while(true){
            in.nextLine();
            IntStream.range(0, 100000).forEach(i -> {
                Json.read(content);
            }); 
            System.out.println("SIZE: "+ Json.NullJson.topnull.enclosing.asList().size());
        }
    }
}

Used file.json:

{
    "whatever1": null,
    "whatever2": null,
    "whatever3": null,
    "whatever4": null,
    "whatever5": null,
    "whatever6": null,
    "whatever7": null,
    "whatever8": null,
    "whatever9": null
}

On the screenshot below, you can see what happens on the heap. I sequentially run multiple Json.read() calls and then trigger GC. After just few runs, heap size is already above 0.5GB. leak

Quick fix: With @KamilKrol we introduced a quick fix, by removing the topnull field and creating new NullJson instance every time DefaultFactory#nil() is called. This seems to be solving the leak problem, but the proper solution probably requires investigating why NullJson#enclosing field is not empty after processing is finished.

bolerio commented 7 years ago

Great catch @Macok, thanks! Will try to make a 1.4.2 patch release ASAP.

slaskawi commented 7 years ago

@Macok Good catch!

But I think you can not detect when the processing has ended. A user might use Json object inside a single method or cache it in a field (and reuse it). Having this in mind I believe the only way is to remove the static topnull field.

slaskawi commented 7 years ago

Proposed fix. @Macok could you please have a look?

Macok commented 7 years ago

@slaskawi Thanks, this is exactly what we did (at least from what I see, because whitespace changes make your diff hard to read). It works for us, but would be nice to hear @bolerio's opinion about it :)

bolerio commented 7 years ago

@Macok and @slaskawi thanks again for chasing this issue! Managing parents has turned out problematic in several ways - first there's multiples of them so when you do 'up' you can get an array of parents some times and others just a single parent. I couldn't think of an elegant way to deal with the ambiguity so just pushed it to the user. And now the present issue makes me realize that there is a bit of a conceptual problem as well - shouldn't all primitive values (numbers, boolean and strings) share the same parents since they are immutable and therefore should pretty much behave like reference from a user's perspective?

So maybe it's a good idea to remove parents altogether and deprecate the up method. The main benefit I've personally had with the 'up' method is a more concise, single-statement, initialization of more complex Json structures. Otherwise, passing a Json object in some context and looking up the parent from there is probably a code smell - why would one need to know how owns a value if one is only supposed to be interested in the value?

void doSomething(Json value) {
   Json parent = value.up();
   // code smell - if the method needs the parent in the first place, then it should take the parent
  // as an argument, not the child Json.
}

Other than that, @slaskawi the changes look good. If you make a PR against the 1.4 branch, I'll merge and create a 1.4.2 patch release.

Thanks again! Boris

slaskawi commented 7 years ago

@bolerio I'm also a pretty big fan of parsing things top/down rather than bottom/up. Thus the up method is not required to me and if it's problematic and causes memory leaks, I would deprecate it in 1.4 and remove it in 2.0.

As I pointed out - you never know if someone is storing a reference in a field or does everything within a method (and with a little help of escape analysis a reference might be put on stack). So I would definitely try to remove all static stuff from this library.

And sure, I'll file a backport within a few minutes...

slaskawi commented 7 years ago

@bolerio Done: https://github.com/bolerio/mjson/pull/29

belaban commented 5 years ago

@slaskawi: your proposed fix seems to be already in jgroups-kubernetes, correct?

slaskawi commented 5 years ago

@belaban Yes!

belaban commented 5 years ago

thx!

programista-zacny commented 4 years ago

@bolerio I can't see anywhere version 1.4.2 with this patch... :( https://mvnrepository.com/artifact/org.sharegov/mjson https://search.maven.org/artifact/org.sharegov/mjson

bolerio commented 4 years ago

@jacakk You are right. It hasn't been published because I've lost my credentials to publish on Maven central and I've been procrastinating sorting that problem out. But I'll do soon and mjson itself is overdue a good update too.

You can of course build from the branch and install locally or on a corporate Maven repository.

Many apologies for the belated reply. This got buried with a pile of github notifications.

perryjp commented 3 years ago

It looks like this is still an issue in master / 2.0? Is there any reason not to apply #29 to master?