estraier / tkrzw

a set of implementations of DBM
Apache License 2.0

Something wrong with SkipDBM №2: low memory segfault #13

Closed: tieugene closed this issue 3 years ago

tieugene commented 3 years ago

Steps to reproduce:

  1. Get a testing Linux host (3795 MB RAM, ~36550 MB free, swap off, Fedora 33 x64, tkrzw 0.9.7/0.9.8)
  2. Start the testing app for SkipDBM: kvtest_tk -f test.tks -v 27
  3. Add ... 134217728 @ 264753 ms (506 Kops) Sync... 159922 ms
  4. Get ... terminate called after throwing an instance of 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >'
    
    This means:
    - Created 2^27 records of 24-byte key -> 2-byte value; DB file size is 4139MB
    - Synced to disk
    - Tried to find random _known_ keys
  5. Get segfault

PS. HashDBM works OK under these conditions, but very slowly.

estraier commented 3 years ago

Considering that the tree database is not so good at random access, storing 134 million records in 264 seconds seems too fast. And, I'd like to know the file size of the database.

BTW, your code doesn't tune the database. For HashDBM, setting the number of buckets to the same as (or twice) the number of records is recommended.
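(Not from the thread, just to make the suggestion concrete: a minimal C++ sketch of that kind of tuning, assuming the HashDBM::TuningParameters struct and OpenAdvanced call described in the tkrzw documentation; check the headers for the exact names.)

// Sketch only: pre-size the bucket array to roughly 2x the expected 2^27 records.
#include "tkrzw_dbm_hash.h"

int main() {
  tkrzw::HashDBM dbm;
  tkrzw::HashDBM::TuningParameters params;
  params.num_buckets = 1LL << 28;  // ~2x the 2^27 records used in the test
  dbm.OpenAdvanced("test.tkh", true, tkrzw::File::OPEN_DEFAULT, params);
  dbm.Set("key", "value");
  dbm.Close();
  return 0;
}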

estraier commented 3 years ago

On my environment with 16GB RAM, the result is like this:

$ ./tk_main -f casket.tkt -v 27
Playing 134217728 records:
1. Add ... 134217728/134217728 @ 8141194 ms (16 Kops)
   Sync ... 790 ms
2. Get ... 185301/185301 @ 5 s (37 Kops)
3. Ask ... 75226/150453 @ 5 s (30 Kops)
4. Try ... 114135/114135 @ 5 s (22 Kops): 57067 get, 57068 add
   Sync ... 1307 ms
n = 27, t = 8141 s, size = 5202 MB, Kops: 16 37 30 22

So, the file exceeds the virtual memory capacity, which causes memory exhaustion. To avoid this, use another file implementation which doesn't use memory mapping. I released a new version where PolyDBM also supports switching the file implementation.

PolyDBM dbm;
const std::map<std::string, std::string> tuning_params = {
    {"file", "PositionalParallelFile"},
    // ...(other tuning params)...
};
dbm.OpenAdvanced(path, file_options, tuning_params);

As for concrete tuning parameters, please see this section: https://dbmx.net/tkrzw/#tips_treedbm_tune
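For reference, a self-contained sketch of that usage (my reading of the tkrzw API, not code from this thread; I'm assuming OpenAdvanced takes the path, a writable flag, open options, and the string-keyed parameter map):

#include <map>
#include <string>
#include "tkrzw_dbm_poly.h"

int main() {
  tkrzw::PolyDBM dbm;  // picks the DBM class from the file extension (.tks -> SkipDBM)
  const std::map<std::string, std::string> tuning_params = {
      {"file", "PositionalParallelFile"},  // file implementation without memory mapping
      // ...other tuning params from the tips page...
  };
  dbm.OpenAdvanced("test.tks", true, tkrzw::File::OPEN_DEFAULT, tuning_params);
  dbm.Set("key", "value");
  dbm.Close();
  return 0;
}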

tieugene commented 3 years ago

> Considering that the tree database is not so good at random access, storing 134 million records in 264 seconds seems too fast.

It is a real case: working with the BTC blockchain, over 800M records.

> And, I'd like to know the file size of the database.

The fresh version of the test app shows it at the end (in stdout).

> BTW, your code doesn't tune the database. For HashDBM, setting the number of buckets to the same as (or twice) the number of records is recommended.

I know. That is by design: to compare k-v storages without any tuning. Another reason is that tuning does not help when the DB size is near the RAM size.

tieugene commented 3 years ago

> Considering that the tree database is not so good at random access, storing 134 million records in 264 seconds seems too fast.

Note: the problem is not slow operation on a big DB (all k-v storages behave similarly). The problem is the segfault, and exactly with SkipDBM. HashDBM and the others work OK.

tieugene commented 3 years ago

> On my environment with 16GB RAM, the result is like this:
>
> $ ./tk_main -f casket.tkt -v 27

Sorry, I forgot to show the DB type in the example. Real options: kvtest_tk -v -f test.tks -v 27. The DB type is SkipDBM.

estraier commented 3 years ago

I built the command by calling g++ directly, so the command name is different; please ignore the name. Anyway, the command works on my machine with 16GB RAM. I'm not sure these figures are as expected, though.

$ ./tk_main -f casket.tks -v 27
Playing 134217728 records:
1. Add ... 134217728/134217728 @ 325215 ms (412 Kops)
   Sync ... 75030 ms
2. Get ... 823463/836436 @ 5 s (167 Kops)
3. Ask ... 423125/859357 @ 5 s (171 Kops)
4. Try ... 718097/1458736 @ 5 s (291 Kops): 718019 get, 78 add
   Sync ... 55715 ms
n = 27, t = 325 s, size = 4139 MB, Kops: 412 167 171 291

tieugene commented 3 years ago

> Anyway, the command works on my machine with 16GB RAM.

Switch swap off. The available memory (all virtual memory) must be less than the DB size.

estraier commented 3 years ago

I committed another CL, by which memory usage is reduced if you specify {"file", "PositionalParallelFile"} in the tuning parameters. I agree with your intention to avoid ad-hoc tuning for each DB implementation. However, could you try the tuning even once? The limitation that the DB size must be less than the virtual memory capacity is by specification. I can do nothing for usage which violates that assumption.

tieugene commented 3 years ago

> However, could you try the tuning even once?

I did. Yes, it has a good effect, as long as the DB is smaller than physical RAM.

> The limitation that the DB size must be less than the virtual memory capacity is by specification. I can do nothing for usage which violates that assumption.

But a segfault is not a good Status code.

estraier commented 3 years ago

> I did. Yes, it has a good effect, as long as the DB is smaller than physical RAM.

With the latest commit, the DB size can exceed the physical RAM size if you specify the tuning parameter. Even so, do you still get a segfault?

> But a segfault is not a good Status code.

I agree. However, detecting memory exhaustion is a hard task. When it will happen is totally unpredictable, and checking for it on every line is not practical.

tieugene commented 3 years ago

> Even so, do you still get a segfault?

Yes:

Playing 2**27 = 134217728 records:
1. Add ... 134217728 @ 260057 ms (516 Kops)
   Sync... 180456 ms
2. Get ... terminate called after throwing an instance of 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >'

tkrzw 0.9.9, RAM 4GB, swap off, without any tuning. As for 2**26 (64M records, DB size 2069MB): it works OK, but speed drops dramatically.

estraier commented 3 years ago

Without tuning, memory exhaustion is inevitable. I consider it a limitation of the feature. If you tune it, do you still get a segfault?

tieugene commented 3 years ago

> If you tune it, do you still get a segfault?

Setting "offset_width=5" helps - none of errors. Thank you.