Closed Dandigit closed 4 years ago
Or maybe an even shorter operator? It's also been discussed to actually have it postfix, so arr.length or arr.len.
That raises the question - if elemsof were to become postfix, would sizeof also become postfix in the name of consistency?
Also, a postfix operator that starts with a dot, e.g. .len or .size, blurs the line in terms of what the dot actually means. In all other contexts in C2, it separates a parent (e.g. module, struct, union) from a member. In this context, it would just be part of the operator's name.
There are seemingly infinite things to consider when designing such a small change.
Definitely. This is something that @bvdberg has brought up.
An advantage is that it frees a lot of keywords that otherwise might start ballooning as one looks at the many compile time functions that are actually available in GNU/Clang.
I think Zig should be a warning here. It has a huge list of sizeof-like compile time functions. Most of those are really useful but at least I find them intimidating since they cannot be grouped in a suitable manner.
Using the . we can actually think of it as calling an automatically generated function.
That said, I’m not sure about the direction. I see both pros and cons with either way.
I think there was a forum post discussing this among others, but I can't seem to find it. The elemsof was chosen because it looks like sizeof. I agree that it doesn't look really nice, but it is unambiguous. len or size are not. What is the len of an array? The total byte size or the number of elements? We discussed that using a dot operator for these would perhaps be better:
pro: it removes the global symbols elemsof/sizeof etc.
con: it prohibits the use of those names as struct members
pro: Type.size() looks better than sizeof(Type) (my personal opinion, a matter of personal taste).
I cannot think of major issues by choosing this strategy (but that's why they are called un-foreseen issues :) )
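To make the "what is the len of an array?" question concrete, here is a minimal sketch in plain C (not C2) of the two numbers people might mean. The macro ELEMS_OF and the function names are hypothetical and only serve the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from this thread): element count of a true array.
   It gives a wrong answer if handed a pointer instead of an array. */
#define ELEMS_OF(arr) (sizeof(arr) / sizeof((arr)[0]))

int32_t numbers[10];

/* Reading 1: total storage in bytes, i.e. 10 * sizeof(int32_t). */
size_t numbers_byte_size(void) { return sizeof(numbers); }

/* Reading 2: number of elements, which is what elemsof is meant to return. */
size_t numbers_elem_count(void) { return ELEMS_OF(numbers); }
```

Both readings are plausible for a name like len or size, which is the ambiguity a more specific name tries to avoid.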
Let me list a few possibilities:
sizeof( ... ), elemsof( ... )
@sizeof( ... ), @elemsof( ... )
i32.size(), a.size()
i32->size(), a->size()
i32.size, a.size
i32->size
i32:size
i32$size
Advantages and disadvantages can be discussed for each.
Out of all of these approaches, I think number 1 (function style), and number 3/5 (struct function/member style) are the worst.
.size()/.size, and add ambiguity as to what the . operator actually does.
Out of the remaining approaches, number 2 and number 4 are the worst.
-> sigil would be a poor choice in a language targeting C programmers - it's like accessing a property of a pointer to a struct, which is definitely not what's going on here.
That leaves number 6. I feel like the language needs syntax to explicitly say "this is accessing a compile time attribute". i32:size is different syntax that does not:
It gives the language a way to explicitly express compile time attributes. When a programmer thinks "Hmm... I need \<compile time attribute> of \<something>", they'll know to use the : sigil.
Everything I've said is completely subjective, and I'm no expert, so take it with a grain of salt.
The :: and -> approaches are not in the language currently and my preference is to keep them out.
One possible issue with the dot approach I thought of was the sizeof for base types, so i32.size(). This currently does not parse, since an expression cannot start with a base type.
Point taken. The difference between Rust macros and this case, however, is that macros cannot be called on an object/type with the . operator.
The .sizeof() and .elemsof() approach isn't very pleasant IMO. Look at how different approaches read in English:
sizeof(int) => size of int
int.size => int size
int.sizeof() => int size of
"int size of" isn't really desirable.
The parseability of the dot approach is definitely a valid issue. You certainly could dive into the code and allow T.size() as a special case, however with special cases comes inconsistency.
Whatever approach is taken, I strongly believe that there must be something to indicate that size and length are compile time properties, such as a sigil.
(1) There are a bunch of reserved keywords in C. It's just that they're prefixed with _ and then used through their macro: https://www.c-programming-simple-steps.com/c-keywords.html. If we look at Zig's built in functions, then these would be keywords if implemented in C: https://ziglang.org/documentation/master/#Builtin-Functions – I'm not saying that we need this zoo of built-ins, but just to show there is an argument for keywords increasing rather than decreasing.
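As a small illustration of the underscore-prefixed keywords mentioned above, here is a sketch in C (C99 through C17) where _Alignof and _Bool are the real keywords and the standard headers provide the friendlier spellings as macros. The function names are made up for the example.

```c
#include <stdalign.h>  /* before C23: defines alignof as a macro for _Alignof */
#include <stdbool.h>   /* before C23: defines bool as a macro for _Bool */
#include <stddef.h>

/* The keyword spelling and the macro spelling name the same operator. */
size_t int_align_keyword(void) { return _Alignof(int); }
size_t int_align_macro(void)   { return alignof(int); }

/* Same idea for the boolean type. */
size_t bool_size_keyword(void) { return sizeof(_Bool); }
size_t bool_size_macro(void)   { return sizeof(bool); }
```

Each new built-in added this way costs a reserved name plus a header, which is one way to see why a list of sizeof-like operators tends to grow the keyword set.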
By the way, if the .size-approach is used I prefer it to look like i32.size rather than i32.size(). The reason is that I prefer to associate () with runtime evaluation. ".size" signals to me that this is in some way constant during runtime and is safe to use without an unnecessary performance hit – which is exactly what happens.
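The "constant during runtime" point can be sketched in plain C, where sizeof on a fixed-size array is a constant expression. The names samples, flags, SAMPLE_COUNT and flag_count are hypothetical and only used for this illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

int32_t samples[8];

/* sizeof on a fixed-size array is a constant expression,
   so it can define an enum constant... */
enum { SAMPLE_COUNT = sizeof(samples) / sizeof(samples[0]) };

/* ...be verified by the compiler without generating runtime code... */
static_assert(SAMPLE_COUNT == 8, "samples should hold 8 elements");

/* ...and give the dimension of another array. */
uint8_t flags[SAMPLE_COUNT];

size_t flag_count(void) { return sizeof(flags) / sizeof(flags[0]); }
```

None of these uses would be allowed for a value computed by an ordinary function call, which is the distinction a spelling without () would hint at.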
@Dandigit the difficult thing is finding good sigils. Using @ quickly gets noisy, especially when used for macros as well. I don't like "!" at all since that signals exception handling for me and I want to reserve that in case it's needed. In general postfix sigils are harder to see as well. Do you have any suggestions?
(I obviously agree on i32.sizeof being bad)
@lerno I completely agree that @ is noisy and that ! is too established as "exception incoming".
My personal preference for a sigil is an apostrophe ('):
i32.'size
'sizeof(i32)
a.'size
It's not perfect though, a simple ' can be quite easy to miss. I suppose it's something you'd learn to look for.
The apostrophe would probably be the last one I’d use due to its association with strings.
In general C is a difficult language for sigils since so many operators use characters already. Even operators have different meanings in an expression already: & and * being the most frequently occurring ones.
So finding a good sigil that people can agree on will be a tough job.
I can't believe that the apostrophe's association with strings/chars didn't cross my mind - thanks for pointing it out. Definitely not preferable.
There are not many ASCII characters that are suited here. Looking at the ASCII table: ! $ % @.
None seem really nice. When I was thinking about this, I also came up with another solution.
Since sizeof/elemsof both produce compile-time constants, we could force their usage with a capital first letter (that's also mandatory for other constants in C2). So again either .Sizeof() or .Sizeof, .Elemsof. Since no struct member can start with a capital char, no clashes are possible.
I think I either prefer Sizeof(x) or x.Sizeof. So x.Sizeof() is out IMO, since it's not a function.
That's not a bad idea - it definitely cleans things up a bit while still making it clear that Size is a constant. With this approach, we've got the following candidates:
T.Size
instance.Size
Sizeof(T)
Sizeof(instance)
array.Length
Lengthof(array)
array.Elems
Elemsof(array)
I'm still not sold on Elemsof/Elems - you say that it is unambiguous compared to Length/Lengthof, but I disagree.
x.Elems or Elemsof(x) read as "x elements" and "elements of x" respectively. The issue here is that this operator does not return the elements of an array, rather it returns the number of elements. If this operator were to return the elements of an array, it would just return the array itself.
The number of elements in an array is commonly expressed as its length, which is why x.Length and Lengthof(x) are less ambiguous than x.Elems and Elemsof(x).
If we wanted to eschew ambiguity completely, we'd end up with x.AmountOfElems and AmountOfElemsOf(x), which are both quite ridiculous.
I personally think that the uppercase there is a bit of an eyesore. That we even run into the issue is because pointer struct access and direct struct access are the same.
IMHO, elemsof is an unsuitable name for the operator which calculates the length of an array. It doesn't make much sense grammatically or logically - this operator returns the length of an array, not the elems/elements of an array.
lengthof would be a far more suitable name for this operator. It's an operator which literally returns the length of an array.
@bvdberg, if you approve of this change, I'm happy to implement it.