baotonglu / dash

Scalable Hashing on Persistent Memory
MIT License

dash runs fast for a while, but becomes slow later #7

Closed amyzx closed 3 years ago

amyzx commented 3 years ago

Hi, I have been studying your dash-ex hash table and have two questions.

  1. Why is the number of stash buckets 2? This differs from the description in your paper.

  2. Run time seems strange.

I did not change the parameters in your run.sh, where the warm-up workload is 10M and the benchmark workload is 190M.

In several recent experiments, Dash ran pretty slowly with the same shell script, and later dash-ex even seemed to get stuck after loading 5M of the warm-up data.

I am confused because I had already run dash-ex about 6 times and it worked well each time. I tried rebuilding the project, but it still gets stuck during the warm-up loading phase.

The following is my DCPMM's status.

[screenshot: DCPMM status]

Could you give me some suggestions on how to figure out the cause? Looking forward to your reply, thanks!

baotonglu commented 3 years ago

Hi,

For (1), our paper also states that the number of stash buckets is 2. Where is the difference? For (2), your PM pool has possibly become full after running the benchmark six times. You should first delete the pool and create a new one for Dash.

Baotong

amyzx commented 3 years ago

Hi, @baotonglu Thanks for your quick reply!

For 1, sorry, I checked your paper again: the experiments section states that you used 2 stash buckets, so you are right. I had thought there should be 4 stash buckets because of the description in Section 4.3. [screenshot: excerpt of Section 4.3]

For 2, I am a beginner who has just started learning DCPMM. Could you share some links where I can find the commands to clean a PM pool? In Dash, would it be possible to add a function that cleans the pool after every experiment? I see you have a function called Close_pool in allocator.h.

I printed the pmem_ex.data information using the command pmempool info -s pmem_ex.data, and it seems to have a large amount of available space. But my program still gets stuck during the loading phase; this time it can only load 40,000 warm-up records.

[screenshot: pmempool info output]

In addition, could you explain the meaning of depth_count in struct Directory?

baotonglu commented 3 years ago

Deleting the pool before running the experiment is OK, like this. depth_count is used for directory halving, which is not fully supported in the current version of Dash, so you don't need to worry about it for now.

Baotong

amyzx commented 3 years ago

Hi, I have already deleted the file using the rm command, but it still cannot run normally. I printed the idx in the Load function and found that it stops when trying to insert _worklod[31985].

Are there any other possible solutions?

baotonglu commented 3 years ago

"I feel confused because I have already run dash-ex about 6 times, where dash-ex works pretty well." => So it worked well previously?

amyzx commented 3 years ago

Yes, it worked well during my initial runs: I ran it successfully about six times, only modifying the PM directory in the shell script run.sh and in test_pmem.cpp.

baotonglu commented 3 years ago

I suggest you discard your modifications, clone our repo again, then build and test it. Without more information, I cannot help you figure out the cause.

amyzx commented 3 years ago

OK, I will give it a try. Thanks!