Carleslc / Simple-YAML

This Java API provides an easy-to-use way to store data and provide configurations using the YAML format.
https://carleslc.me/Simple-YAML
GNU General Public License v3.0
130 stars 38 forks source link

Memory Leak Problem #80

Closed EverNife closed 6 months ago

EverNife commented 6 months ago

image

I have a plugin to store on a HashMap a PlayerData, for each player there are several informations for him, and for each player there is a separated instance of YamlFile;

//basically a
YamlFile yamlFile = new YamlFile(playersConfigFile);
//
//after some time and changes i just call and 

yamlFile.save();

The problem is, the OVERHEAD of YamlFile is too big. Here follows a headpdump information from what i was able to track:

Heap Dump of individual node of the hashmap ![image](https://github.com/Carleslc/Simple-YAML/assets/11357801/d9c791e9-81d9-4c0b-978c-ac3814a85496) ![image](https://github.com/Carleslc/Simple-YAML/assets/11357801/192cd244-4c1e-4efc-94c9-33f0d62439a5)

So, i was thinking in create a single SimpleYamlImplementation and share it between them. Like this:

    private static final SimpleYamlImplementation simpleYamlImplementation = new SimpleYamlImplementation();
    public static YamlFile createYamlFile(File file){
        YamlFile yamlFile = new YamlFile(simpleYamlImplementation);
        yamlFile.setConfigurationFile(file);

        yamlFile.setCommentFormat(YamlCommentFormat.PRETTY);

        yamlFile.options().quoteStyleDefaults().setDefaultQuoteStyle(QuoteStyle.PLAIN);
        yamlFile.options().quoteStyleDefaults().setQuoteStyle(List.class, QuoteStyle.DOUBLE);
        yamlFile.options().quoteStyleDefaults().setQuoteStyle(String.class, QuoteStyle.DOUBLE);

        simpleYamlImplementation.getDumperOptions().setSplitLines(false);

        return yamlFile;
    }

The problem is that i don't think it can be shared, because there is this call inside the YamlFile constructor.

https://github.com/Carleslc/Simple-YAML/blob/b221dff4f7d084a021170de4f723dd411793bc58/Simple-Yaml/src/main/java/org/simpleyaml/configuration/file/YamlConfiguration.java#L60-L64

What is the correct approach here?

Carleslc commented 6 months ago

snakeyaml implementation requires a considerable amount of memory to store yaml metadata and parse the files correctly, that is not a memory leakage by itself. Also, if you don't have comments in your yml file, ensure you're loading your yamlFile with load and not loadWithComments, to avoid unnecessary overhead of the comment parser.

However, having one yaml implementation for each player seems in fact unnecessary overhead. Why do you have multiple YamlFile instances in the first place? Do you have a different .yml file for each player? If not, you can have a single YamlFile for all your player's file and then create your PlayerData's hashmap (you can pass the instance of YamlFile to the PlayerData constructor if needed). I can understand, though, having a different .yml file for each player for instance to have more control of the save method, so it is more granular and only saves the data needed, without rewriting the entire players file for each change, which is CPU and disk I/O overhead. You should also consider using a database instead of a yml file for this use case. YML files are best used for configuration files, not as large data storage.

In this case, as you considered, sharing the YamlImplementation would be the correct approach, just like you are doing. Why do you think it can't be shared? The setImplementation call is necessary but that does not mean a different yamlImplementation is created. All your YamlFile instances will share the same YamlImplementation instance.

EverNife commented 6 months ago

Sure, i do agree there are other options for data storage, but yml is one of them in my case here. I also save them individually to make things faster.

I asked that here because i tried sharing it between them and somehow a few of the files were corrupted when saving, i will try to understand better what happened, after the corruption i just though it was not the proper way to do, that's why i opened a issue to ask for it.

I will make some more tests and i will give a feedback later.

EverNife commented 6 months ago

Ok, tested several ways, chaging the entire Implementation or just parts of it.

There is no safe way to share the implementation between several YamlFile. It will always corrupt in some manner. (at least on my concurrent application)

The proper thing to hadle this is just not create thons of YamlFile and just create and keep loaded the ones being used :/