pmem / pmemkv

Key/Value Datastore for Persistent Memory
https://pmem.io

Why is the value address passed to the pmemkv_get_v_callback function not on pmem? (tree3 engine) #1070

Closed yijieZ closed 2 years ago

yijieZ commented 2 years ago

QUESTION:

In the tree3::get function, I call pmem_is_pmem() on the value address, but it returns 0.

Details

Four real Optane PM devices are interleaved and exposed to the OS in 'fsdax' mode. I mount the PM device with an ext4 filesystem using the DAX option.

mkfs.ext4 /dev/pmem0
mount -o dax /dev/pmem0 /mnt/pmem0

The PMDK version is 1.9.1.

I added a pmem_is_pmem() call to the tree3::get function in pmemkv/src/engines-experimental/tree3.cc and found that it returns 0.

status tree3::get(string_view key, get_v_callback *callback, void *arg)
{
    LOG("get using callback for key=" << std::string(key.data(), key.size()));
    check_outside_tx();
    // XXX - do not create temporary string
    auto leafnode = LeafSearch(std::string(key.data(), key.size()));
    if (leafnode) {
        const uint8_t hash = PearsonHash(key.data(), key.size());
        for (int slot = LEAF_KEYS; slot--;) {
            if (leafnode->hashes[slot] == hash) {
                LOG("   found hash match, slot=" << slot);
                if (leafnode->keys[slot].compare(
                        std::string(key.data(), key.size())) == 0) {
                    auto kv = leafnode->leaf->slots[slot].get_ro();

                    printf("tree3 get is_pmem=%d\n",
                           pmem_is_pmem(kv.val(), kv.valsize()));

                    LOG("   found value, slot="
                        << slot
                        << ", size=" << std::to_string(kv.valsize()));
                    callback(kv.val(), kv.valsize(), arg);
                    return status::OK;
                }
            }
        }
    }
    LOG("   could not find key");
    return status::NOT_FOUND;
}

leafnode->leaf->slots[slot] is allocated in the tree3::put function by calling make_persistent<internal::tree3::KVLeaf>(), which means it is allocated from PM. It is confusing that pmem_is_pmem() returns 0.

The description of the pmemkv_get function in the pmemkv manpage says that the value points to the location where the data is actually stored (no copy occurs). Does that mean kv.val() is an address in PM?

igchor commented 2 years ago

From the manpage of pmem_is_pmem: "Calling this function with a memory range that originates from a source different than pmem_map_file() is undefined."

Pmemkv uses pmemobj to manage a persistent memory pool, and pmemobj does not map the pool with pmem_map_file but with its own dedicated function. I don't think there is any function like pmem_is_pmem for pmemobj. But if you share some more details on what you are trying to do, we might come up with some suggestions.

yijieZ commented 2 years ago

Sorry for my question; I should have read the docs and code before asking. I inserted 1,000,000 KV pairs (value size = 128 B) into tree3 and searched them with a skewed, hot-key distribution (like workload C in YCSB). I found that the amount of data read from PM (measured with the pcm-memory tool) is far less than what I requested (10,000,000 gets). There must be a buffer holding the hot data, and I am looking for it. But I think it is not in pmemkv. Sorry again for my question.

igchor commented 2 years ago

No problem :)

Are you actually using YCSB in your experiments? Not sure if you know, but we have made an integration with it (not yet upstreamed): https://github.com/pmem/ycsb

yijieZ commented 2 years ago

I use YCSB to generate a trace file that provides the operation type and key, and I modified pmemkv/examples/pmemkv_basic_c/pmemkv_basic.c to load the trace file. It would be great to have pmemkv in YCSB!

lukaszstolarczuk commented 2 years ago

If you want, you can bump our PR on their repo: https://github.com/brianfrankcooper/YCSB/pull/1545

Perhaps it will speed up the wait there :-)

lukaszstolarczuk commented 2 years ago

I hope it helped. In case of more questions don't hesitate to ask. Closing this one for now.