Snaipe / libcsptr

Smart pointers for the (GNU) C programming language
https://snai.pe/c/c-smart-pointers/
MIT License
1.57k stars 143 forks source link

Suggestion on smart arrays #7

Open alexreg opened 9 years ago

alexreg commented 9 years ago

Firstly, I must commend you on a very nice library here. Creating a library for smarter pointers in C and doing it in a way that is compatible with the syntax and style of the language is a real step forward for people like me who want to try high-level programming in C. Keep up the good work, and I'd be curious to hear if you have any plans for further features/updates. (I like that it's a small library right now though, and quite minimalistic as is C as a whole.)

Regarding "smart arrays" specifically: I like that the library can infer the size of fixed-size arrays and use this to automate the destructor more. However, I personally imagine that implementing this in another way would be more general and extensible:

  1. For any fixed-size array type, set the default metadata to a size_t containing that size, and also set the default destructor to a function that does sfree on each element of the array (if the element type is a pointer) and nothing (if the element type is a non-pointer).
  2. Allow the metadata and destructor for the array to be override the defaults when they are specified as normal in the unique_ptr or shared_ptr macro call.

Anyway, let me know what you think, and maybe we can establish what sort of interface for smart arrays is best. The design of the library in general is great, of course. :)

Snaipe commented 9 years ago

Thanks for the suggestions :)

I'd be curious to hear if you have any plans for further features/updates. (I like that it's a small library right now though, and quite minimalistic as is C as a whole.)

Once I'm done with other things, I will most probably add some kind of opaque weak pointer if I find a proper way without any race condition.

  1. For any fixed-size array type, set the default metadata to a size_t containing that size, and also set the default destructor to a function that does sfree on each element of the array (if the element type is a pointer) and nothing (if the element type is a non-pointer).

Will try. I'm not sure if gcc will let me make a conditional branching for pointer/non-pointer types though. Also, for the metadata part, I'm not quite sure what you mean, as I think I already do this -- the actual metadata is prepended with the length of the array and and the size of the base type, and both are accessible with functions in csptr/array.h

  1. Allow the metadata and destructor for the array to be override the defaults when they are specified as normal in the unique_ptr or shared_ptr macro call.

It should be possible, but then again, I wouldn't know if the passed type is specifically a pointer, or even a pointer managed by libcsptr. I could specify some kind of universal destructor for arrays that you could pass by default though.

What I could do, maybe, is to stringify the type, then try to parse the string to see if it is a pointer type -- which is in my opinion a bit overkill; but that still wouldn't solve the problem of knowing if a pointer is a smart pointer.

So now, it's more of a preference between 1)

// void sfree2(void *ptr, void *meta);
unique_ptr(type*[x], .dtor = sfree2); // array contents are smart pointers
unique_ptr(type*[x]); // array contents are plain pointers

and 2)

unique_ptr(type*[x]); // array contents are smart pointers
unique_ptr(type*[x], .dtor = NULL); // array contents are plain pointers

Now the issue is that while you could expect 2) to make more sense since you would expect people to recursively use smart pointers, it's actually the solution that would in my opinion surprise the programmer the most. If type* is by accident (or by misuse) a plain pointer type, then unique_ptr(type*[x]) results in undefined behavior at destruction.

1), however, would just cause a (very identifiable) memory leak if it were to be misused, and is (I think) prefered; if the type is managed, set the destructor to the sfree2 convenience function for that purpose, if the type is plain, set the destructor to some kind of similar free2; and finally, if the pointer, regardless of being plain or smart, is not to be freed, then don't specify a destructor.

alexreg commented 9 years ago

I'd be curious to hear if you have any plans for further features/updates. (I like that it's a small library right now though, and quite minimalistic as is C as a whole.)

Once I'm done with other things, I will most probably add some kind of opaque weak pointer if I find a proper way without any race condition.

I was wondering about a weak_ptr – what do you mean by “opaque” though? Since you can’t prevent aliasing of pointers in C, I would imagine it would be equivalent to a plain old pointer in C, no?

  1. For any fixed-size array type, set the default metadata to a size_t containing that size, and also set the default destructor to a function that does sfree on each element of the array (if the element type is a pointer) and nothing (if the element type is a non-pointer).

Will try. I'm not sure if gcc will let me make a conditional branching for pointer/non-pointer types though. Also, for the metadata part, I'm not quite sure what you mean, as I think I already do this -- the actual metadata is prepended with the length of the array and and the size of the base type, and both are accessible with functions in csptr/array.h

Aha. I thought you were doing stringification and parsing already, on the pointer type. How do you figure out the array size right now, for example? Also, I see no reason why the array size should be stored separately to the actual metadata. It would be conceptually simpler and no real loss just to set it as the actual metadata (unless overridden) – the user can always override it to a struct including the array size anyway.

  1. Allow the metadata and destructor for the array to be override the defaults when they are specified as normal in the unique_ptr or shared_ptr macro call.

It should be possible, but then again, I wouldn't know if the passed type is specifically a pointer, or even a pointer managed by libcsptr. I could specify some kind of universal destructor for arrays that you could pass by default though.

Indeed, a ‘template’/universal destructor for arrays would be handy – or indeed two, with one variant that does free on each element and another variant that does sfree. What I could do, maybe, is to stringify the type, then try to parse the string to see if it is a pointer type -- which is in my opinion a bit overkill; but that still wouldn't solve the problem of knowing if a pointer is a smart pointer.

So now, it's more of a preference between 1)

// void sfree2(void _ptr, void meta); uniqueptr(type[x], dtor = sfree2); // array contents are smart pointers unique_ptr(type[x]); // array contents are plain pointers and 2)

uniqueptr(type[x]); // array contents are smart pointers uniqueptr(type[x], dtor = NULL); // array contents are plain pointers It’s an intriguing question, which destructor should be default for arrays. My answer is: maybe neither! In my view, arrays should be treated just the same as pointers (as parameter types they are equivalent anyway), and the default destructor should remain a simple free operation unless overridden with sfree_array or free_array (representing the two cases above) or a user-defined function. Now the issue is that while you could expect 2) to make more sense since you would expect people to recursively use smart pointers, it's actually the solution that would in my opinion surprise the programmer the most. If type* is by accident (or by misuse) a plain pointer type, then unique_ptr(type*[x]) results in undefined behavior at destruction.

The Principle of Least Surprise is a good one, so I see your argument here… however, I reckon the destructor action should be as consistent and simple as possible, myself. Template destructors like sfree_array and free_array can then be given in a header file, for convenience.

Snaipe commented 9 years ago

I was wondering about a weak_ptr – what do you mean by “opaque” though? Since you can’t prevent aliasing of pointers in C, I would imagine it would be equivalent to a plain old pointer in C, no?

Opaque in the sense of not providing the contents of the weak_ptr struct, as it would be unsafe to access it without checking if the referred shared_ptr is alive. I was thinking of something like so:

struct weak_ptr_s;
typedef struct weak_ptr_s *weak_ptr;

// Usage
smart weak_ptr w;
{
    smart int *shared = shared_ptr(int);
    w = weak_from(shared);

    smart int *ref = weak_lock(w);
    assert(ref == shared);
} // shared ptr is destroyed

smart int *nullref = weak_lock(w);
assert(nullref == NULL);

Aha. I thought you were doing stringification and parsing already, on the pointer type. How do you figure out the array size right now, for example?

Since I have the liberty of using GNU C extensions, I'm using a simple trick I came up with using typeof. If you first consider this:

const int arrlen = 10;
int[arrlen][1] dummy;
assert(sizeof (dummy[0]) == sizeof (int));
assert(sizeof (dummy) / sizeof (dummy[0]) == arrlen);

Then you can abstract the type int[arrlen] away with typeof inside a macro:

# define example(Type, ArrLen, ArrEltSize) \
    do { \
        typeof(Type[1]) dummy; \
        *ArrLen = sizeof (dummy) / sizeof (dummy[0]); \
        *ArrEltSize = sizeof (dummy[0]); \
    } while (0)

The array of size 1 is here to make the compiler accept dummy[0] for both scalar and array types, while still having a valid sizeof since ∀ T, sizeof (T) == sizeof (T[1]).

Also, I see no reason why the array size should be stored separately to the actual metadata. It would be conceptually simpler and no real loss just to set it as the actual metadata (unless overridden) – the user can always override it to a struct including the array size anyway.

Well it is, in fact, set as the actual metadata; if you invoke get_smart_ptr_meta on an array type you get back a pointer to a s_meta_array, followed by any user metadata, if any. This means that in practice, you could have:

struct user_meta {
    int foo, bar;
};

struct array_meta {
    s_meta_array;
    struct user_meta;
};

smart int* ptr = unique_ptr(int[10], .meta = { &(struct user_meta) { 1, 2 }, sizeof (struct user_meta) });
struct array_meta *meta = get_smart_ptr_meta(ptr);

But I would recommend to use array_user_meta to retrieve the user metadata instead of this technique, and both array_length and array_type_size for the array metadata.

It’s an intriguing question, which destructor should be default for arrays. My answer is: maybe neither! In my view, arrays should be treated just the same as pointers (as parameter types they are equivalent anyway), and the default destructor should remain a simple free operation unless overridden with sfree_array or free_array (representing the two cases above) or a user-defined function. [...] The Principle of Least Surprise is a good one, so I see your argument here… however, I reckon the destructor action should be as consistent and simple as possible, myself. Template destructors like sfree_array and free_array can then be given in a header file, for convenience.

That is also my view, although I explained it poorly. I plan to provide the two universal sfree/free destructors for array and just let the user specify it if needed.

alexreg commented 9 years ago

I was wondering about a weak_ptr – what do you mean by “opaque” though? Since you can’t prevent aliasing of pointers in C, I would imagine it would be equivalent to a plain old pointer in C, no?

Opaque in the sense of not providing the contents of the weak_ptr struct, as it would be unsafe to access it without checking if the referred shared_ptr is alive. I was thinking of something like so:

struct weak_ptr_s; typedef struct weak_ptr_s *weak_ptr;

// Usage smart weak_ptr w; { smart int *shared = shared_ptr(int); w = weak_from(shared);

smart int *ref = weak_lock(w);
assert(ref == shared);

} // shared ptr is destroyed

smart int *nullref = weak_lock(w); assert(nullref == NULL); That seems fair enough. Is the point of this opaque weak pointer simply to guarantee liveness then? I wonder if there is a compiler extension that allow a either a “check before access” function or implicit casing from a struct to a pointer (unboxing); this would help ease the pain of the syntax. Aha. I thought you were doing stringification and parsing already, on the pointer type. How do you figure out the array size right now, for example?

Since I have the liberty of using GNU C extensions, I'm using a simple trick I came up with using typeof. If you first consider this:

const int arrlen = 10; int[arrlen][1] dummy; assert(sizeof (dummy[0]) == sizeof (int)); assert(sizeof (dummy) / sizeof (dummy[0]) == arrlen); Then you can abstract the type int[arrlen] away with typeof inside a macro:

define example(Type, ArrLen, ArrEltSize) \

do { \
    typeof(Type[1]) dummy; \
    *ArrLen = sizeof (dummy) / sizeof (dummy[0]); \
    *ArrEltSize = sizeof (dummy[0]); \
} while (0)

The array of size 1 is here to make the compiler accept dummy[0] for both scalar and array types, while still having a valid sizeof since ∀ T, sizeof (T) == sizeof (T[1]).

That's interesting. Kind of hacky, but interesting nonetheless. :) Also, I see no reason why the array size should be stored separately to the actual metadata. It would be conceptually simpler and no real loss just to set it as the actual metadata (unless overridden) – the user can always override it to a struct including the array size anyway.

Well it is, in fact, set as the actual metadata; if you invoke get_smart_ptr_meta on an array type you get back a pointer to a s_meta_array https://github.com/Snaipe/libcsptr/blob/master/include/csptr/array.h#L30-L33, followed by any user metadata, if any. This means that in practice, you could have:

struct user_meta { int foo, bar; };

struct array_meta { s_meta_array; struct user_meta; };

smart int* ptr = unique_ptr(int[10], .meta = { &(struct user_meta) { 1, 2 }, sizeof (struct user_meta) }); struct array_meta *meta = get_smart_ptr_meta(ptr); But I would recommend to use array_user_meta to retrieve the user metadata instead of this technique, and both array_length and array_type_size for the array metadata.

I see. Personally, I still don’t like this “special handling” for (fixed-size) array variables, and would probably not bother with ‘smart arrays’ at all, but if you’re going to implement them, then I suppose your implementation is a pretty good one! It’s an intriguing question, which destructor should be default for arrays. My answer is: maybe neither! In my view, arrays should be treated just the same as pointers (as parameter types they are equivalent anyway), and the default destructor should remain a simple free operation unless overridden with sfree_array or free_array (representing the two cases above) or a user-defined function. [...] The Principle of Least Surprise is a good one, so I see your argument here… however, I reckon the destructor action should be as consistent and simple as possible, myself. Template destructors like sfree_array and free_array can then be given in a header file, for convenience.

That is also my view, although I explained it poorly. I plan to provide the two universal sfree/free destructors for array and just let the user specify it if needed.

Ah okay, very good. I think that’s the best way.