maxmind / mmdbwriter

Go library for writing MaxMind DB (mmdb) files
Apache License 2.0

MMDB writer consuming a lot of memory #90

Open · mandar-01 opened this issue 2 months ago

mandar-01 commented 2 months ago

Hi,

I have been using the mmdbwriter package in Go to insert records into an MMDB file. I observed that the script doing the inserts consumes a lot of memory. I did some memory profiling with pprof and found that a couple of mmdbwriter functions account for most of the allocations. I have attached screenshots below for reference.

[pprof memory profile screenshots from 2024-07-22 omitted]

These functions consume around 600MB each, whereas the final MMDB file is only 146MB. Overall, the program used around 3.8GB to produce that 146MB file. I think these functions, especially Map.Copy(), keep the records in memory, and they are not garbage collected because references to them are still live. This profile was taken just before the writer wrote the MMDB file to disk.
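For reference, a heap profile like this can be captured with runtime/pprof just before the file is written. The sketch below is an assumption about the profiling setup (including the file name "heap.pprof"), not code from the original report:

package main

import (
    "log"
    "os"
    "runtime"
    "runtime/pprof"
)

// dumpHeapProfile writes a heap profile that can be inspected with
// `go tool pprof heap.pprof`. Call it just before writer.WriteTo.
func dumpHeapProfile(path string) {
    f, err := os.Create(path)
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    runtime.GC() // run a collection first so only live objects are reported
    if err := pprof.WriteHeapProfile(f); err != nil {
        log.Fatal(err)
    }
}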

Here's how I have defined the MMDB writer:

writer, err := mmdbwriter.New(
    mmdbwriter.Options{
        DatabaseType:            "V1",
        IncludeReservedNetworks: true,
        RecordSize:              32,
    },
)
if err != nil {
    log.Fatal(err)
}

I am using the DeepMergeWith inserter to insert the MMDB records, following the pattern sketched below.
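For context, DeepMergeWith comes from github.com/maxmind/mmdbwriter/inserter and is passed to the tree's InsertFunc. A minimal sketch of that pattern follows; the network, record fields, and output file name are placeholders, not details from the original report:

package main

import (
    "log"
    "net"
    "os"

    "github.com/maxmind/mmdbwriter"
    "github.com/maxmind/mmdbwriter/inserter"
    "github.com/maxmind/mmdbwriter/mmdbtype"
)

func main() {
    writer, err := mmdbwriter.New(
        mmdbwriter.Options{
            DatabaseType:            "V1",
            IncludeReservedNetworks: true,
            RecordSize:              32,
        },
    )
    if err != nil {
        log.Fatal(err)
    }

    // Example network and record; both are placeholders.
    _, network, err := net.ParseCIDR("203.0.113.0/24")
    if err != nil {
        log.Fatal(err)
    }
    record := mmdbtype.Map{
        "label": mmdbtype.String("example"),
    }

    // DeepMergeWith recursively merges the new record into any existing
    // record for the network instead of replacing it.
    if err := writer.InsertFunc(network, inserter.DeepMergeWith(record)); err != nil {
        log.Fatal(err)
    }

    f, err := os.Create("out.mmdb")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    if _, err := writer.WriteTo(f); err != nil {
        log.Fatal(err)
    }
}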

oschwald commented 2 months ago

It is expected that the writer will use a fair bit of memory. You don't provide much information on how you are using the writer, but from what you have provided, it looks like you are using the DeepMergeWith inserter, which copies existing records when merging.

mandar-01 commented 2 months ago

Thanks. Yes, you are right: I am using the DeepMergeWith inserter. I've updated the comment and added details about how I have defined the writer.

oschwald commented 2 months ago

Looking at the code, I think it would be possible to get rid of the Copy in DeepMergeWith and instead allocate a new map only when one is actually needed. It is hard to know whether this would significantly reduce your memory usage, as that largely depends on the structure of your data and how it is modified on insert.
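To illustrate the optimization being suggested, a copy-on-write style deep merge might look roughly like the sketch below. This is only the general idea, not the library's actual implementation, and it uses plain Go maps with comparable leaf values for simplicity:

package main

import "fmt"

// deepMerge merges src into dst, allocating a new map only when the merge
// actually changes something. The boolean reports whether a new map was
// allocated; when it is false, the returned map is dst itself.
func deepMerge(dst, src map[string]any) (map[string]any, bool) {
    var out map[string]any
    set := func(k string, v any) {
        if out == nil {
            // First real change: copy dst once, then mutate the copy.
            out = make(map[string]any, len(dst)+len(src))
            for dk, dv := range dst {
                out[dk] = dv
            }
        }
        out[k] = v
    }
    for k, sv := range src {
        dv, ok := dst[k]
        if !ok {
            set(k, sv) // new key: must allocate
            continue
        }
        dm, dok := dv.(map[string]any)
        sm, sok := sv.(map[string]any)
        if dok && sok {
            // Recurse; only record the result if the nested merge changed.
            if merged, changed := deepMerge(dm, sm); changed {
                set(k, merged)
            }
            continue
        }
        // Assumes scalar, comparable leaf values.
        if dv != sv {
            set(k, sv)
        }
    }
    if out == nil {
        return dst, false // nothing changed: zero allocations
    }
    return out, true
}

func main() {
    a := map[string]any{"city": map[string]any{"name": "Oslo"}}
    b := map[string]any{"city": map[string]any{"name": "Oslo"}}
    _, changed := deepMerge(a, b)
    fmt.Println(changed) // false: merging an identical record allocates nothing
}

In the common case where an insert re-adds data that is already present, this approach does no copying at all, which is exactly the cost the always-Copy strategy pays on every insert.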

We don't have a single internal use of DeepMergeWith, so I don't know whether this is a change we are likely to work on.