pmem / issues

Old issues repo for PMDK.
http://pmem.io
13 stars 7 forks source link

pmemobj_tx_free() crashed when the pool is full #1059

Closed NiuYawei closed 5 years ago

NiuYawei commented 5 years ago

ISSUE: <pmemobj_tx_free() crashed when the pool is full>

Environment Information

Please provide a reproduction of the bug:

  1. Create a small pmemobj pool (20MB);
  2. Start PMDK transaction;
  3. Repeat 4k allocations (pmemobj_tx_alloc()), until run out of space;
  4. Free the last successfully allocate 4k (pmemobj_tx_free()); -----> crash triggered;
  5. Commit PMDK transaction;
  6. Close pmemobj file;

The code to reproduce the defect:

int main(int argc, char *argv) { char pool_file = "rep.pool"; uint64_t pool_size = (20 << 20); / 20MB / PMEMobjpool *umm_pool; PMEMoid oid, prev_oid; uint64_t tot_allocated = 0, alloc_size = 4096; int rc;

    unlink(pool_file);

    umm_pool = pmemobj_create(pool_file, "reproducer",
                              pool_size, 0666);
    if (umm_pool == NULL) {
            fprintf(stderr, "create pmemobj pool error\n");
            return -1;
    }

    rc = pmemobj_tx_begin(umm_pool, NULL, TX_PARAM_NONE);
    if (rc) {
            fprintf(stderr, "start tx failed, %d\n", rc);
            return -1;
    }

    oid = pmemobj_tx_alloc(alloc_size, 0);
    if (oid.off == 0) {
            fprintf(stderr, "alloc failed\n");
            return -1;
    }
    tot_allocated += alloc_size;
    prev_oid = oid;

    fprintf(stdout, "fill the pmemobj pool...\n");
    while (rc == 0) {
            oid = pmemobj_tx_alloc(alloc_size, 0);
            if (oid.off == 0) {
                    fprintf(stdout, "pool is full\n");
                    rc = -1;
            } else {
                    tot_allocated += alloc_size;
                    prev_oid = oid;
            }
    }
    fprintf(stdout, "fill done %lu\n", tot_allocated);

    fprintf(stdout, "free allocated in same transaction\n");
    rc = pmemobj_tx_free(prev_oid);
    if (rc) {
            fprintf(stderr, "free failed\n");

    pmemobj_tx_commit();
    pmemobj_close(umm_pool);
    return 0;

}

How often bug is revealed: (always, often, rare):

Actual behavior:

Expected behavior:

Details

Additional information about Priority and Help Requested:

Are you willing to submit a pull request with a proposed change? (Yes, No)

Requested priority: (Showstopper, High, Medium, Low)

kilobyte commented 5 years ago

Sys info is missing from GitHub's web view, text mail says RHEL7.

Reproducible on Debian Buster, with PMDK 1.5.1 + ndctl 63, and PMDK 1.6 + ndctl 65.

pbalcer commented 5 years ago

I think what happens is pmemobj_tx_free() is being called in during TX_STAGE_ONABORT and an ASSERT is getting triggered.

Remember that unsuccessful pmemobj_tx_alloc() will cause an abort.

NiuYawei commented 5 years ago

Yes, I think you are right. I realized this could be the cause as well.

For our use case, we don't use the TX macros and the longjmp is disabled, we just explicitly call pmemobj_tx_begin() and pmemobj_tx_commit/abort() and do error cleanup by ourselves, also the same code is shared by transactional & non-transactional interfaces, that's why the pmemobj_tx_free() could be called on error cleanup code path even if the transaction is already aborted by previous failed tx_alloc() call.

I was wondering if it's better to skip such implicit tx_abort() call in PMDK for above use case?

Anyway, I think I can fix our code to solve the problem, thank you for your help and please feel free to close this issue.

pbalcer commented 5 years ago

For some time we actually wanted to add a feature that would skip the abort in transaction operations. I'll add a feature request and add it to the backlog, until then I'll leave the bug open.

pbalcer commented 5 years ago

Took me a while, but I created a new feature request: #1066. If you have any questions or suggestions, please feel free to state them in the new issue.