The Client::batch_get function seems to leak a lot of memory. After thousands of calls the process memory grows up to 10 GB. The same requests with a simple Client::get look fine. The target is x86_64-unknown-linux-gnu.
I've been trying to replicate this, but have been unsuccessful so far. I've tried running the client's batch_get test in a loop and observing memory allocations after every X iterations:
$ cargo test --release --test lib batch -- --nocapture -Zunstable-options --report-time
Compiling aerospike v0.5.0 (/Users/jhecking/aerospike/aerospike-client-rust)
Finished release [optimized] target(s) in 4.72s
Running target/release/deps/lib-234f38fdc32b270d
running 1 test
0 iterations: 8215696 bytes allocated / 13152256 bytes resident
10000 iterations: 9496168 bytes allocated / 19107840 bytes resident
20000 iterations: 9601664 bytes allocated / 18993152 bytes resident
30000 iterations: 11312288 bytes allocated / 19124224 bytes resident
40000 iterations: 11322856 bytes allocated / 18956288 bytes resident
50000 iterations: 9536992 bytes allocated / 19046400 bytes resident
60000 iterations: 9628136 bytes allocated / 19099648 bytes resident
test src::batch::batch_get ... test src::batch::batch_get has been running for over 60 seconds
70000 iterations: 11179760 bytes allocated / 19070976 bytes resident
80000 iterations: 11339816 bytes allocated / 19103744 bytes resident
90000 iterations: 11374760 bytes allocated / 19075072 bytes resident
100000 iterations: 9530040 bytes allocated / 18935808 bytes resident
110000 iterations: 9599800 bytes allocated / 19161088 bytes resident
120000 iterations: 10807808 bytes allocated / 19156992 bytes resident
130000 iterations: 10472288 bytes allocated / 18882560 bytes resident
140000 iterations: 8614328 bytes allocated / 18944000 bytes resident
150000 iterations: 8367832 bytes allocated / 18849792 bytes resident
160000 iterations: 9397392 bytes allocated / 18989056 bytes resident
170000 iterations: 10029000 bytes allocated / 18964480 bytes resident
180000 iterations: 10152560 bytes allocated / 18952192 bytes resident
190000 iterations: 8348952 bytes allocated / 19030016 bytes resident
200000 iterations: 8355600 bytes allocated / 19034112 bytes resident
test src::batch::batch_get ... ok <195.465s>
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 17 filtered out
Memory allocations go up and down cyclically but don't seem to grow out of bounds. I'm using the jemallocator crate to measure allocations. For now I've only tested this on macOS, not Linux.
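For reference, a minimal sketch of how per-iteration numbers like the ones above can be gathered with the jemallocator and jemalloc-ctl crates; the report helper and its call site are illustrative, not the actual test code:

```rust
use jemalloc_ctl::{epoch, stats};

// Route all allocations through jemalloc so its statistics cover the whole process.
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;

// Hypothetical helper: print allocation stats after a given number of iterations.
fn report(iterations: usize) {
    // Many statistics are cached and only updated when the epoch is advanced.
    epoch::advance().unwrap();
    let allocated = stats::allocated::read().unwrap();
    let resident = stats::resident::read().unwrap();
    println!(
        "{} iterations: {} bytes allocated / {} bytes resident",
        iterations, allocated, resident
    );
}

fn main() {
    // In the test loop this would be called every 10,000 iterations.
    report(0);
}
```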
Can you tell me a bit more about how you are using the client's batch_get
function? Maybe some sample code?
I see the batch size is 4 in your test: https://github.com/aerospike/aerospike-client-rust/blob/master/tests/src/batch.rs
The leaks happen when the batch size is big. For example, if the size is less than 100, the leaks aren't visible. But try batch_get with 1000 batch elements and memory will grow aggressively. The memory isn't freed even after client.close().
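For context, this is roughly the shape of code that description corresponds to, assuming the synchronous client API of that era; the host, namespace, set, and key values are placeholders, and details such as the BatchRead::new signature may differ between client versions:

```rust
use aerospike::{as_key, BatchPolicy, BatchRead, Bins, Client, ClientPolicy};

fn main() {
    let hosts = String::from("127.0.0.1:3000"); // placeholder cluster address
    let client = Client::new(&ClientPolicy::default(), &hosts)
        .expect("failed to connect to cluster");
    let policy = BatchPolicy::default();
    let bins = Bins::All;

    // Repeatedly issue large batch reads; with ~1000 keys per call the process
    // memory reportedly keeps growing instead of levelling off.
    for _ in 0..10_000 {
        let batch: Vec<_> = (0..1000)
            .map(|i| BatchRead::new(as_key!("test", "demo", i), &bins))
            .collect();
        let _records = client.batch_get(&policy, batch).expect("batch_get failed");
    }

    client.close().expect("failed to close client");
}
```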
Another minor issue about close: the client doesn't call close when it goes out of scope, so the thread and the connection pool stay open. Maybe it needs to implement the Drop trait.
Ok, I will try to reproduce the issue with larger batch sizes. How many nodes are in your cluster?
Another minor issue about close: the client doesn't call close when it goes out of scope, so the thread and the connection pool stay open. Maybe it needs to implement the Drop trait.
That's definitely a good suggestion. Feel free to file a separate issue for that.
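For illustration, a minimal sketch of that idea, using a stand-in Client type and assuming only that the real client exposes the close() method mentioned above; this is not the actual implementation, and real code would also need to guard against closing twice:

```rust
// Stand-in type to illustrate the idea; in the real crate this would be the
// aerospike Client itself, with close() stopping the cluster tend thread and
// dropping pooled connections.
struct Client;

impl Client {
    fn close(&self) -> Result<(), String> {
        println!("closing cluster connections and stopping tend thread");
        Ok(())
    }
}

impl Drop for Client {
    fn drop(&mut self) {
        // Errors are only logged, because panicking inside Drop is undesirable.
        if let Err(err) = self.close() {
            eprintln!("error while closing Aerospike client: {}", err);
        }
    }
}

fn main() {
    let _client = Client; // dropped at end of scope, which now calls close()
}
```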
Ok, I will try to reproduce the issue with larger batch sizes. How many nodes are in your cluster?
5 nodes with replication factor 2
So, my previous post was correct but misleading, and so I deleted it. The truncate method on Vec does not actually release memory, so while the code makes it look like the buffer should shrink, the underlying allocation does not. Since the buffer used to write the request (etc.) is attached to a connection, its lifetime is the same as the connection's, and because it currently only ever grows to fit the largest request seen, the buffer will eventually get huge if you push large requests through a connection. If this happens for every connection in a pool, the total allocation will balloon dramatically over time. Resizing the buffer actually needs to drop the memory, which means calling one of the shrink_to methods on the vector.
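A small, self-contained demonstration of that Vec behaviour (standard library only, nothing client-specific):

```rust
fn main() {
    // Simulate a connection buffer that grew to hold one very large request.
    let mut buf: Vec<u8> = vec![0u8; 10 * 1024 * 1024];
    println!("initial:        len = {}, capacity = {}", buf.len(), buf.capacity());

    // truncate() only shortens the logical length; the 10 MB allocation stays.
    buf.truncate(1024);
    println!("after truncate: len = {}, capacity = {}", buf.len(), buf.capacity());

    // shrink_to_fit() (or shrink_to(n) on Rust 1.56+) actually returns the memory.
    buf.shrink_to_fit();
    println!("after shrink:   len = {}, capacity = {}", buf.len(), buf.capacity());
}
```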
I submitted a pull request that does this. The threshold value is something you might want to make configurable or change.
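The general shape of that approach might look something like the sketch below; the function name, signature, and threshold value are purely illustrative, and the real change is in the linked pull request:

```rust
// Illustrative threshold; the actual value and naming in the PR may differ.
const SHRINK_THRESHOLD: usize = 64 * 1024;

// Grow the connection buffer when a request needs more room, but release the
// memory again once it has ballooned past the threshold and the extra space
// is no longer needed, so one huge request doesn't pin memory for the
// connection's whole lifetime.
fn resize_buffer(buffer: &mut Vec<u8>, required: usize) {
    if required > buffer.len() {
        buffer.resize(required, 0);
    } else if buffer.capacity() > SHRINK_THRESHOLD && required <= SHRINK_THRESHOLD {
        buffer.truncate(required);
        buffer.shrink_to_fit();
    }
}
```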
Thank you @soro! Will take a look at your PR and get back to you by the end of the week.
Resolved in #83.