Closed KrzysztofCwalina closed 7 years ago
cc: @terrajobst, @cartermp
Wow! That's a change!
So there will be abstract System.Array
, ImmutableArray<T>
, Array<T>
and ReadOnlyArray<T>
?
I guess this is the implication.
For the context, I filed the issue because:
:+1:
Combined with full language and JIT support slices could (almost) completely replace usage of arrays. At least in the new code.
int[:] indexes = new int[10];
So the name looks appropriate.
Combined with full language and JIT support slices could (almost) completely replace usage of arrays.
Indeed, if such support is added and if new language features (ref
returns) are used so this type more closely resembles the classic array then Array<T>
could be a winner. Otherwise I think such a name should be avoided because it would be confusing.
Since I doubt arrays are ever going away, I don't think Array<T>
is a good name. It would be confusing especially when speaking: how do you figure out which type is "array of ints"?
To contrast, similar situation worked out well for generic collections: it's pretty clear that "queue" generally refers to System.Collections.Generic.Queue<T>
and not System.Collections.Queue
, since nobody uses that.
It may be too long, but what about ArraySlice<T>
?
It would be confusing especially when speaking: how do you figure out which type is "array of ints"?
I think the end goal here is that you shouldn't really have to be asking that question in the first place.
In Go or Rust, for example, you create arrays at the top level, but the vast majority of meaningful work you do is on a slice of that array (even if the slice happens to be the whole thing). The way they are reasoned about isn't as this other thing that is backed by an array, but the chunk of the array I care about. Rust ex:
fn print_size(slice: &[i32]) {
println!("slice has {} items", slice.len());
}
fn main() {
let xs: [i32; 5] = [1, 2, 3, 4, 5]; // fixed-size array of 5 elements
print_size(&xs); // prints 5
print_size(&xs[1 .. 4]); // prints 3
}
When I write this kind of code I never ask which type is the array of ints. Although the semantics here are a bit different than typical C# code, the underlying reasoning I do doesn't involve distinguishing between types.
It may be too long, but what about
ArraySlice<T>
?
This is the current way that Swift handles things. Although I haven't written much Swift code, the impression I get is that the friction between Array<T>
and ArraySlice<T>
is pretty low. That's certainly helped by the tendency not to explicitly annotate types and [..]
syntax, but even without the language-level syntax the types place nicely with one another.
I think it would be confusing since Array is a well established data structure going beyond .Net and does not only implies indexing but also always represents a complete view on the stored data. Slice is an encapsulation of a range of indexable items, which is separate from the actual backing storage.
I think the data structure should be called Range
and operation that produces Ranges, called Slice
.
In most cases we will not utter the actual type (like we rarely mention System.Int32
), but it would help if the name of data type would match its semantics.
int[] data = new int[100];
// slice a range of 3 to 9 elements inclusively and pass that to Foo
Foo(data[3:9]);
// same as
int[:] range = data[3:9];
Foo(range);
// same as
Range<int> range = data.Slice(3, 9);
Foo(range);
@KrzysztofCwalina side note: should the title be Span<T>
instead of Slice<T>
?
@cartermp, fixed.
Is the restriction that a method cannot have the same name as it's enclosing type just a C# limitation or is that a CLR restriction? Just some speculating but I think it's a C# limitation as the following are legal...
class ToString{}
delegate int Invoke();
Enum A{ A}
I think slice is the best name. Can we just remove the restriction for this class?
What about View<T>
? C++17 has array_view
/string_view
in its STL.
@jamesqo To me, ArrayView<T>
sounds okay, but View<T>
is too non-specific: it could mean any number of things.
I agree with @jamesqo. To me ArrayView<T>
seems a bit too verbose for my taste, but I don't think this is a strong enough argument against it.
ArrayView<T>
seems ok to me too.
A bit verbose, but I think we do not expect to see the actual type name too often.
Especially assuming there will be special syntax like T[:]
A bit verbose, but I think we do not expect to see the actual type name too often. Especially assuming there will be special syntax like
T[:]
While that is likely true, I don't think we should rely on language features when naming APIs. As you know, there is a high chance that the language feature either doesn't happen or happens much later. I think as a best practice we should design our APIs without having to rely on language features to hide any quirkiness (not just naming). Of course, we should leverage the compiler and the languages to do things that are hard (such as creating state machines or doing reachability analysis) but not for putting lip stick on a pig.
I would prefer ArraySlice<T>
to ArrayView<T>
, as this is because of languages which already use the "slice" terminology (Python, Go, Rust, Swift). The C++ name seems unique to C++17.
If we're going to make the differentiation between slices and "sliceable" collections be conceptually distinct, then keeping "slice" in the name makes it easiest to understand without having to consult an API reference.
@cartermp Perhaps we should name the normal version ArraySlice
, and the read-only version ArrayView
? Then people making a slice of a string could write ArrayView<char>
, instead of something like ReadOnlyArraySlice<char>
(which IMO is pretty verbose).
Also, maybe Range<T>
would also be a good name? Then we could name the types Range<T>
and View<T>
(or ReadOnlyRange<T>
) without conflicting with the Slice
method.
Range
My preference would be ArraySlice
.
In general, it should not be assumed that the result of a slice operation is always a slice/ArraySlice
:
str[0:3]
be an Array<char>
would be very confusing.vector[0:4]
to be a vector again, not a slice. You could also have slices of 2D arrays, like matrix[0:4,2]
, which would also return a vector.In my view, the ideal API would use indexer properties that take as an argument a lightweight structure (Slice
?) that is constructed from the slicing syntax. In addition, I'd allow extension methods with a specific name (GetSlice
as in F#?) This approach has several advantages:
Slice
type. (I really don't think there's a need to clutter the language with the type [:]
syntax.)@JeffreySax
In general, it should not be assumed that the result of a slice operation is always a slice
I buy that which is why I believe having a single term is useful because we can combine it with other concepts or in other places where Xxx<T>
wouldn't be appropriate, such ReadOnlyXxx<T>
or StringXxx
.
In the language, all you need to do is map the slicing syntax to the Slice type. (I really don't think there's a need to clutter the language with the
type [:]
syntax.)
Not sure what map the slicing syntax to the Slice type would mean if there is no syntax. In general, I like the idea of mapping the indexer syntax [:]
to calling a Slice
method (which can be provided as an instance method or an extension method). The return type could be arbitrary -- what matters is the name and shape of the Slice
method. That would be akin to how the language deals with other constructs, such as foreach
and await
.
That being said, I wonder whether there shouldn't also be a syntax for the slice type. For instance, it would be nice to be able to declare a type using similar syntax, such as:
int[] numbers = {1, 2, 3, 4, 5}
int[:] threeFourFive = array[2:3];
string text = "Hello";
string[:] el = text[1:2];
I see two options there:
T[:]
to a particular slice type, e.g. Xxx<T>
. This would prevent us from using the synatx for specialized slice types such as StringXxx
.T[:]
as the return type of the T.Slice
method. Not sure the language designers like the fact that types are bound based on the shape of members.In general, I think syntax matters a lot when it comes to how features are being perceived.
I think that once you make the result of a slicing operation of arbitrary type, then a special syntax for the slice type is very close to useless and certainly not worth the trouble. Your first option is too restrictive, and in your second option it looks like T
is the collection type, which is different from array syntax T[]
where T
is the element type.
Not sure what map the slicing syntax to the Slice type would mean if there is no syntax. In general, I like the idea of mapping the indexer syntax [:] to calling a Slice method (which can be provided as an instance method or an extension method).
I mean that the expression a:b
is pure syntactic sugar and should map to a type, say SliceIndex
, that is a lightweight struct. So
a[1:4] = b[2:5]
maps to
a[new SliceIndex(1,4)] = b[new SliceIndex(2,5)]
where you've defined:
public MySlice this[SliceIndex index] { get; set; }
The biggest problem when you don't use such an index type is that it doesn't work for multi-dimensional arrays. How do you differentiate between m[1,2:3]
and m[1:2,3]
? F# used options, which has the advantage that you can omit boundaries (v[1:]
or even v[:]
), but the huge drawback that options are reference types. If you want this capability - and I do - then it's much better to let the compiler insert the usual default value:
You can also use the SliceIndex syntax in loops if you want:
foreach (var i in 0:9999) { /* ... */ }
Just to add my 0.02, Slice as it stands is nothing more than ArraySegment for the most part. IList Adapter can already do this as well as signal changes to the underlying collection. Generic.IList misses that but you can easily add it. What happens when an array is resized? What if the sliced data no longer exists? I think what people are after is a way to pass the slice to a method which takes array (without an offset and length). If that's the case Array could be modified to have a private offset and length would reflect the number of elements. It's easy to see the syntax being parsed and passed to type so that the syntax desired is achieved. Array could also be subclasses before it was sealed and the subclass could be used for slicing and whatever else essentially replacing array in runtime use behind the scenes. Not sure in the end this gains anything though performance or otherwise (Slicing) as the consumer can still just do array [-1] or otherwise. They could also just be given the offset and length as traditional...
Just as a FYI, I penned something out to deal with the Resizing issue among other things.
The result is @ In my projects source
It's not 100% complete yet as it doesn't deal with strings correctly.
I also have an idea about not even copying memory by forging an CLR array header and placing it right before the offset of the first desired element which makes the array think it has X elements which start directly after the header...
Lots of good discussion on this, but I don't think we have found the perfect name, so not worth renaming.
Any thoughts on this from the community?