aerospike / aerospike-client-c

Aerospike C Client
Other
98 stars 103 forks source link

as_val_reserve not reference counting correctly #83

Closed khivi closed 6 years ago

khivi commented 6 years ago
  as_bytes *byte_array = (as_bytes*)as_val_reserve(as_record_get_bytes(rec, bin));
  as_record_destroy(rec);
 // byte_array

Calling above 2 times on same big returns the same value for the byte_array pointer.

From the documentation, I assumed it was ref counted and a different as_bytes will be available. Without the as_record_destroy it works fine.

BrianNichols commented 6 years ago

You are describing versioning instead of reference counting. as_val_reserve() does not create a new version of the bin. as_val_reserve() simply increments the reference count to the same bin.

khivi commented 6 years ago

Sorry I was not clear.

 as_bytes *byte_array1 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec1, bin));
 as_record_destroy(rec1);
 as_bytes *byte_array2 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec2, bin));
 as_record_destroy(rec2);

// Now I see byte_array1 is equal to byte_array2

My understanding was as_val_reserve would increment the reference count and so the byte_array1 is available after rec1 is destroyed. But I see that reading a different record, ends up returns the same so byte_array1 is equal to byte_array2.

If I comment out the as_record_destroy then correctly they are no longer equal.

BrianNichols commented 6 years ago

Your observations are not correct. Run this code and byte_array1 will not be equal byte_array2.

const char* bin = "bin";

as_bytes b1;
as_bytes_init_wrap(&b1, (uint8_t*)"11111", 5, false);

as_record r1;
as_record_inita(&r1, 1);
as_record_set_bytes(&r1, bin, &b1);

as_bytes b2;
as_bytes_init_wrap(&b2, (uint8_t*)"22222", 5, false);

as_record r2;
as_record_inita(&r2, 1);
as_record_set_bytes(&r2, bin, &b2);

as_record* rec1 = &r1;
as_record* rec2 = &r2;

as_bytes *byte_array1 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec1, bin));
printf("%s\n", byte_array1->value);
as_record_destroy(rec1);

as_bytes *byte_array2 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec2, bin));
printf("%s\n", byte_array2->value);
as_record_destroy(rec2);

printf("%s\n", byte_array1->value);
printf("%s\n", byte_array2->value);

as_bytes_destroy(byte_array1);
as_bytes_destroy(byte_array2);

Output: 11111 22222 11111 22222

Are you retrieving the record from async routines? If so, you shouldn't be calling as_record_destroy() in the listener because the client does that for you when the listener completes.

khivi commented 6 years ago

  const char* bin = "bin";
  const char* set = "set";
  const char* ns = "ns";
  as_error  err;

  aerospike  as = reader->as_; //Change to your connection
  as_key key1;
  as_key_init(&key1, ns, set, "key1");
  as_bytes b1;
  as_bytes_init_wrap(&b1, (uint8_t*)"11111", 5, false);
  as_record rec1;
  as_record_init(&rec1, 1);
  as_record_set_bytes(&rec1, bin, &b1 );
  aerospike_key_put(&as, &err, NULL, &key1, &rec1);

  as_key key2;
  as_key_init(&key2, ns, set, "key2");
  as_bytes b2;
  as_bytes_init_wrap(&b2, (uint8_t*)"22222", 5, false);
  as_record rec2;
  as_record_init(&rec2, 1);
  as_record_set_bytes(&rec2, bin, &b2 );
  aerospike_key_put(&as, &err, NULL, &key2, &rec2);

  as_record* rec;
  aerospike_key_get(&as, &err, NULL, &key1, &rec);
  as_bytes *byte_array1 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec, bin));
  printf("%s\n", byte_array1->value);
  as_record_destroy(rec);

  aerospike_key_get(&as, &err, NULL, &key2, &rec);
  as_bytes *byte_array2 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec, bin));
  printf("%s\n", byte_array2->value);
  as_record_destroy(rec);

  printf("%s\n", byte_array1->value);
  printf("%s\n", byte_array2->value);

  as_bytes_destroy(byte_array1);
  as_bytes_destroy(byte_array2);

illustrates the issue.

BrianNichols commented 6 years ago

Your rec pointer is not initialized. If the rec pointer is NULL, aerospike_key_get() will create and initialize the rec. If the rec pointer is not NULL, aerospike_key_get() assumes the rec instance is valid and reuses the rec. This will corrupt memory if rec is invalid.

Change your program to set rec to NULL before calling aerospike_key_get().

  as_record* rec = NULL;
  aerospike_key_get(&as, &err, NULL, &key1, &rec);
  as_bytes *byte_array1 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec, bin));
  printf("%s\n", byte_array1->value);
  as_record_destroy(rec);

  rec = NULL;
  aerospike_key_get(&as, &err, NULL, &key2, &rec);
  as_bytes *byte_array2 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec, bin));
  printf("%s\n", byte_array2->value);
  as_record_destroy(rec);

You should also be checking the aerospike_key_get() return code.

khivi commented 6 years ago

Setting rec to NULL worked.

khivi commented 6 years ago

I had missed as_record_destory when I said previously. I still see same error with setting rec == NULL.

From what I see --

The parameter to as_val_reserve is an as_bytes. This is the address of as_bin_value in as_bin (call that X)

as_record_destroy -> as_val_val_destory -> calls hooks ---> as_record_release

For the next aerospike_key_get

Since as_bin_value_s is a union so &Y[0].value == X so we will operate on the same pointer X even though we had ref counted it earlier

BrianNichols commented 6 years ago

Yes, it turns out that as_record_destroy() also destroys the inline bin values which is needed by byte_array1/byte_array2. The byte pointer and length need to be saved before calling as_record_destroy(). I have created a complete example that has been verified with valgrind.

#include <stddef.h>
#include <stdlib.h>

#include <aerospike/aerospike.h>
#include <aerospike/aerospike_key.h>
#include <aerospike/as_error.h>
#include <aerospike/as_record.h>
#include <aerospike/as_status.h>

#include "example_utils.h"

static void
print_bytes(uint8_t* b, uint32_t len)
{
    for (uint32_t i = 0; i < len; i++)
    {
        printf("%c", (char)b[i]);
    }
    printf("\n");
}

int
main(int argc, char* argv[])
{
    if (! example_get_opts(argc, argv, EXAMPLE_BASIC_OPTS)) {
        exit(-1);
    }

    aerospike as;
    example_connect_to_aerospike(&as);

    const char* bin = "bin";
    const char* set = "set";
    const char* ns = "test";
    as_error  err;

    as_key key1;
    as_key_init(&key1, ns, set, "key1");
    as_bytes b1;
    as_bytes_init_wrap(&b1, (uint8_t*)"11111", 5, false);
    as_record rec1;
    as_record_init(&rec1, 1);
    as_record_set_bytes(&rec1, bin, &b1 );

    if (aerospike_key_put(&as, &err, NULL, &key1, &rec1) != AEROSPIKE_OK) { 
        printf("aerospike_key_put() returned %d - %s\n", err.code, err.message);
        exit(-1);
    }
    as_record_destroy(&rec1);

    as_key key2;
    as_key_init(&key2, ns, set, "key2");
    as_bytes b2;
    as_bytes_init_wrap(&b2, (uint8_t*)"22222", 5, false);
    as_record rec2;
    as_record_init(&rec2, 1);
    as_record_set_bytes(&rec2, bin, &b2 );

    if (aerospike_key_put(&as, &err, NULL, &key2, &rec2) != AEROSPIKE_OK) { 
        printf("aerospike_key_put() returned %d - %s\n", err.code, err.message);
        exit(-1);
    }
    as_record_destroy(&rec2);

    as_record* rec = NULL;

    if (aerospike_key_get(&as, &err, NULL, &key1, &rec) != AEROSPIKE_OK) {
        printf("aerospike_key_get() returned %d - %s\n", err.code, err.message);
        exit(-1);
    }

    as_bytes* byte_array1 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec, bin));
    print_bytes(byte_array1->value, byte_array1->size);
    uint8_t* buf1 = byte_array1->value;
    uint32_t len1 = byte_array1->size;
    as_record_destroy(rec);

    rec = NULL;

    if (aerospike_key_get(&as, &err, NULL, &key2, &rec) != AEROSPIKE_OK) {
        printf("aerospike_key_get() returned %d - %s\n", err.code, err.message);
        exit(-1);
    }

    as_bytes* byte_array2 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec, bin));
    print_bytes(byte_array2->value, byte_array2->size);
    uint8_t* buf2 = byte_array2->value;
    uint32_t len2 = byte_array2->size;
    as_record_destroy(rec);

    print_bytes(buf1, len1);
    print_bytes(buf2, len2);

    free(buf1); 
    free(buf2); 

    example_cleanup(&as);

    return 0;
}
khivi commented 6 years ago

That will work. What is the process to have someone correct the documentation here https://www.aerospike.com/docs/client/c/best_practices/records.html#getting-bins ?

BrianNichols commented 6 years ago

That document will be fixed in the next C client release.

kkbachu commented 5 years ago

Yes, it turns out that as_record_destroy() also destroys the inline bin values which is needed by byte_array1/byte_array2. The byte pointer and length need to be saved before calling as_record_destroy(). I have created a complete example that has been verified with valgrind.

#include <stddef.h>
#include <stdlib.h>

#include <aerospike/aerospike.h>
#include <aerospike/aerospike_key.h>
#include <aerospike/as_error.h>
#include <aerospike/as_record.h>
#include <aerospike/as_status.h>

#include "example_utils.h"

static void
print_bytes(uint8_t* b, uint32_t len)
{
  for (uint32_t i = 0; i < len; i++)
  {
      printf("%c", (char)b[i]);
  }
  printf("\n");
}

int
main(int argc, char* argv[])
{
  if (! example_get_opts(argc, argv, EXAMPLE_BASIC_OPTS)) {
      exit(-1);
  }

  aerospike as;
  example_connect_to_aerospike(&as);

  const char* bin = "bin";
  const char* set = "set";
  const char* ns = "test";
  as_error  err;

  as_key key1;
  as_key_init(&key1, ns, set, "key1");
  as_bytes b1;
  as_bytes_init_wrap(&b1, (uint8_t*)"11111", 5, false);
  as_record rec1;
  as_record_init(&rec1, 1);
  as_record_set_bytes(&rec1, bin, &b1 );

  if (aerospike_key_put(&as, &err, NULL, &key1, &rec1) != AEROSPIKE_OK) { 
      printf("aerospike_key_put() returned %d - %s\n", err.code, err.message);
      exit(-1);
  }
  as_record_destroy(&rec1);

  as_key key2;
  as_key_init(&key2, ns, set, "key2");
  as_bytes b2;
  as_bytes_init_wrap(&b2, (uint8_t*)"22222", 5, false);
  as_record rec2;
  as_record_init(&rec2, 1);
  as_record_set_bytes(&rec2, bin, &b2 );

  if (aerospike_key_put(&as, &err, NULL, &key2, &rec2) != AEROSPIKE_OK) { 
      printf("aerospike_key_put() returned %d - %s\n", err.code, err.message);
      exit(-1);
  }
  as_record_destroy(&rec2);

  as_record* rec = NULL;

  if (aerospike_key_get(&as, &err, NULL, &key1, &rec) != AEROSPIKE_OK) {
      printf("aerospike_key_get() returned %d - %s\n", err.code, err.message);
      exit(-1);
  }

  as_bytes* byte_array1 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec, bin));
  print_bytes(byte_array1->value, byte_array1->size);
  uint8_t* buf1 = byte_array1->value;
  uint32_t len1 = byte_array1->size;
  as_record_destroy(rec);

  rec = NULL;

  if (aerospike_key_get(&as, &err, NULL, &key2, &rec) != AEROSPIKE_OK) {
      printf("aerospike_key_get() returned %d - %s\n", err.code, err.message);
      exit(-1);
  }

  as_bytes* byte_array2 = (as_bytes*)as_val_reserve(as_record_get_bytes(rec, bin));
  print_bytes(byte_array2->value, byte_array2->size);
  uint8_t* buf2 = byte_array2->value;
  uint32_t len2 = byte_array2->size;
  as_record_destroy(rec);

  print_bytes(buf1, len1);
  print_bytes(buf2, len2);

  free(buf1); 
  free(buf2); 

  example_cleanup(&as);

  return 0;
}

Although the issue is closed, I am seeing similar issue with async get calls where get is done, and I am saving the value with as_get_bytes and later on I destroy the value with as_val_destroy, it crashes in as_string_val_destroy. instaed of free(buf1) and free(buf2) --- shouldn't we be calling as_val_destroy(buf1). If so, its crashing. Any ideas?

BrianNichols commented 5 years ago

In this specific case (reserving byte array bin value for use after as_record_destroy()), calling free(buf) is the correct approach. The steps are:

1) as_val_reserve() is called on the as_bytes value. This increments the reference count on as_bytes.

2) The byte array pointer (as_bytes.value) and length (as_bytes.size) are saved for later use.

3) as_record_destroy() is called. This first decrements the reference count on as_bytes, but does not free "as_bytes.value" due to step #1. Then, the record and bin entries (which are allocated by a single malloc()/alloca()) are freed. The is means the bin entry's pointer to the byte array can no longer be used to access the byte array. Hence, the need to save this info in step #2.

4) At this point, the as_bytes struct has been freed leaving only the underlying byte array itself. That byte array should be destroyed with a simple free() call.

The async model only differs in that as_record_destroy() is automatically called after the aerospike_key_get_async() listener is called. This user should not explicitly call as_record_destroy() in the async model.

kkbachu commented 5 years ago

Thank you for the clarification.