jbeder / yaml-cpp

A YAML parser and emitter in C++
MIT License

Too much memory (including pagefile.sys) used when parsing a 680 MB YAML file #307

Open ShigeruHEMMI opened 9 years ago

ShigeruHEMMI commented 9 years ago

I encountered a problem with memory usage (including pagefile.sys), which I believe becomes extremely high when parsing a YAML file.

My YAML file is 679 MB and has 7,752,312 lines. Parsing it takes about 20 minutes.

Parse time is not a significant problem for me; what bothers me is the large memory usage during parsing.

My PC is a notebook, but it has 32 GB of physical memory and a Core i7 CPU. The OS of the PC is Windows 8.1 Pro.

In my experience, memory usage (physical memory + pagefile.sys) reaches about 60 GB while parsing the file.

Comparing 60 GB with the file size (679 MB), I believe the difference is extreme, and I guess there is room for improvement.

My application is a scientific/engineering one: I am developing an FE (finite element) code, and almost all lines of my input file are FE nodes and FE connections. A very small portion of my YAML file looks something like this:

nodes:
  1: [ -1.8750000033258e+02, 1.8750000033258e+02, -1.8750000000000e+02]
  2: [ -1.8669410520834e+02, 1.8615698905822e+02, -1.8685724966934e+02]
connections:
  1: [310, 2, 2211512, 2161371, 2174435, 2175413, 2186923, 2167974, 2193347, 2193383, 2168430, 2174804]
  2: [310, 2, 2211513, 2161372, 2170787, 2133582, 2186924, 2166111, 2191349, 2172301, 2147113, 2151851]
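
To give a sense of how compact this data can be once it is out of a generic node tree, here is a minimal sketch that reads the two line shapes in the sample above into plain structs. It is not yaml-cpp code and not a YAML parser: it assumes every record sits on one line of the form `id: [v1, v2, ...]`, and the names `FeNode`, `FeConnection`, `parseNodeLine` and `parseConnectionLine` are hypothetical. The meaning of the individual connection values is not stated in this thread, so they are kept as an untyped list.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical compact record for one FE node: an id plus three coordinates.
struct FeNode {
    std::int64_t id;
    std::array<double, 3> xyz;
};

// Hypothetical compact record for one FE connection: an id plus its integers.
struct FeConnection {
    std::int64_t id;
    std::vector<std::int64_t> values;
};

// Parses one line shaped like "1: [ -1.875e+02, 1.875e+02, -1.875e+02]".
// Returns false if the line does not match that shape.
bool parseNodeLine(const std::string& line, FeNode& out) {
    std::istringstream in(line);
    char colon = 0, open = 0, sep = 0;
    if (!(in >> out.id >> colon >> open) || colon != ':' || open != '[') {
        return false;
    }
    for (std::size_t i = 0; i < 3; ++i) {
        if (!(in >> out.xyz[i] >> sep)) return false;
        // A comma follows the first two values, a closing bracket the third.
        const char expected = (i < 2) ? ',' : ']';
        if (sep != expected) return false;
    }
    return true;
}

// Parses one line shaped like "1: [310, 2, 2211512, ...]" of any length >= 1.
// Returns false if the line does not match that shape.
bool parseConnectionLine(const std::string& line, FeConnection& out) {
    std::istringstream in(line);
    char colon = 0, sep = 0;
    if (!(in >> out.id >> colon >> sep) || colon != ':' || sep != '[') {
        return false;
    }
    out.values.clear();
    std::int64_t v = 0;
    while (in >> v >> sep) {
        out.values.push_back(v);
        if (sep == ']') return true;   // end of the list
        if (sep != ',') return false;  // anything else is malformed
    }
    return false;  // ran out of input before the closing bracket
}
```

Header lines such as `nodes:` and `connections:` are rejected by both functions, so a caller would have to track the current section itself. Stored this way, a node needs an id and three doubles, which is far less than the several kilobytes per line implied by the memory figures reported in this issue.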

Please consider reducing memory usage of yaml-cpp.

Best regards,

jbeder commented 9 years ago

Thanks for the report. Can you attach the file in question? If it's too large, can you cut it down and attach a file that still manifests this behavior?

ShigeruHEMMI commented 9 years ago

Thanks for the reply. Please let me know how I can attach the 7z-compressed file (currently 96,900,825 bytes). I am unfamiliar with github.com. It seems that images can be uploaded up to 10 MB.

P.S. The 60 GB I wrote was an overestimate; that was my mistake, sorry. My current guess is that total memory usage is 32 GB (physical memory) + 6.5 GB (pagefile.sys).
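
For scale, the corrected figures can be turned into a rough amplification factor. The sketch below is only arithmetic on the numbers reported in this thread (38.5 GB peak, 679 MB file, 7,752,312 lines); it assumes decimal units (1 GB = 1e9 bytes), and the names `MemoryEstimate` and `estimateOverhead` are made up for illustration.

```cpp
// Ratios derived from a peak memory figure, a file size and a line count.
struct MemoryEstimate {
    double amplification;     // peak memory divided by file size
    double bytesPerLine;      // peak memory divided by number of lines
    double fileBytesPerLine;  // file size divided by number of lines
};

// All inputs are plain byte counts or line counts; no unit conversion is done.
constexpr MemoryEstimate estimateOverhead(double peakBytes,
                                          double fileBytes,
                                          double lineCount) {
    return MemoryEstimate{peakBytes / fileBytes,
                          peakBytes / lineCount,
                          fileBytes / lineCount};
}

// Example call with the figures from this issue (decimal units assumed):
//   estimateOverhead(38.5e9, 679e6, 7752312.0)
```

With those inputs the peak memory is roughly 57 times the file size, or about 5 KB of memory for an input line that averages under 90 bytes.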

jbeder commented 9 years ago

I don't know; I'm unfamiliar with github too. Can you reduce its size to a point where it can be uploaded, but still uses too much memory?

ShigeruHEMMI commented 9 years ago

Could you try downloading the file (SphereNNN_3375_andOthers.7z) from this URL: https://drive.google.com/file/d/0B_0Xe5BLJQt8LWJXV1k5bGJ1UUk/view?usp=sharing

Regards,