Open gavinking opened 6 years ago
I haven't written much, but I have encountered one or two cases where I'd be glad to have a `?[]` operator, with `?` as in `?.`, which is exactly what `[]` is proposed to change to in the first suggestion, so it would at least simplify these cases as a side effect.
If the `[]` operator worked as requested in #4517, one could make a multidimensional-array class whose `[]` operator always returned a `Correspondence`, just one that at the final dimension would always return `null` if any earlier dimension was out-of-bounds. (Though that wouldn't help with the comprehension-based example above.)
@kingjon3377 Yes that might work, and it's a good idea.
We would still have to introduce this new multidimensional-array class somewhere (or perhaps just two classes, one for 2D arrays, and one for 3D arrays), but perhaps that's a good idea anyway, since it would allow optimized underlying storage and APIs.
For example:
value arr = Array2.create(3, 3, (i, j) => i==j then 1 else 0);
Integer? elem = arr[1][2];
Array row = arr[1]; //non-null
Where `arr[1]` is non-null because `Array2` returns an empty (not null!) `Array` for out-of-bounds indexes.
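The empty-row behaviour described above can be sketched outside Ceylon. The following is a minimal Java illustration, not a proposal for the actual API: the class name `Grid2` and the methods `row` and `get` are hypothetical, and the backing storage is a plain list of lists.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.IntBinaryOperator;

/** Hypothetical 2D grid, for illustration only; not part of any Ceylon or Java library. */
final class Grid2 {
    private final List<List<Integer>> rows = new ArrayList<>();

    /** Builds a height x width grid whose cell (i, j) holds init(i, j). */
    Grid2(int height, int width, IntBinaryOperator init) {
        for (int i = 0; i < height; i++) {
            List<Integer> row = new ArrayList<>(width);
            for (int j = 0; j < width; j++) {
                row.add(init.applyAsInt(i, j));
            }
            rows.add(Collections.unmodifiableList(row));
        }
    }

    /** Never returns null: an out-of-bounds row index yields an empty row. */
    List<Integer> row(int i) {
        if (i < 0 || i >= rows.size()) {
            return Collections.<Integer>emptyList();
        }
        return rows.get(i);
    }

    /** Returns null if either index is out of bounds. */
    Integer get(int i, int j) {
        List<Integer> r = row(i);
        return (j >= 0 && j < r.size()) ? r.get(j) : null;
    }
}
```

Because `row` never returns null, only the final lookup has to deal with absence. The cost is that a caller can no longer tell a missing row from a genuinely empty one.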
Still, I'm not sure I'm really comfortable with those empty `Array`s. And I dunno, but I feel like even if we did have these multidimensional-array classes, the `[i, j]` syntax is still somehow more natural:
value arr = Array2.create(3, 3, (i, j) => i==j then 1 else 0);
Integer? elem = arr[1, 2];
Array? row = arr[1,]; //horizontal slice
Array? col = arr[,2]; //vertical slice
However, TBH, I'm not certain how to abstract over slicing for arbitrary dimensions. One way would be to say that the key is a tuple of indexes, but that would exhibit terrible performance. Another way is something like the previously-described Option 3.
@gavinking, I really don’t like your `Array2` idea. It works well for bidimensional arrays, but that’s simply not where people’s necessities end. It’s just not scalable, and I think that having `Array2`, `Array3`, etc. up to an arbitrary limit is just as bad of an approach for this issue as `Tuple2`, `Tuple3`, etc. is for tuples.
What’s wrong with the quite explicit `arr[i]?.get(j)` again?
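For comparison, the same explicit two-step lookup can be written in Java with `Optional` standing in for Ceylon's optional types. This is only a sketch: the class `NestedLookup` and its methods `at` and `at2` are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helpers, for illustration only. */
final class NestedLookup {
    /** Element at index i, or empty if i is out of bounds. */
    static <T> Optional<T> at(List<T> list, int i) {
        if (i < 0 || i >= list.size()) {
            return Optional.empty();
        }
        return Optional.ofNullable(list.get(i));
    }

    /** Chained lookup: the second step runs only if the first one found a row. */
    static <T> Optional<T> at2(List<List<T>> rows, int i, int j) {
        return at(rows, i).flatMap(row -> at(row, j));
    }
}
```

The `flatMap` call plays the role of `?.` here: absence at the first index short-circuits the second lookup, and the possibility of a missing element stays visible in the return type.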
I dig Option 3 for the use case of needing a perfectly rectangular array (or rectangular prism, hyperprism, etc.) and I like the syntax for indexing into it. I'm not sure what I'd use it for, right now, but it'd be nice to have it available, in case the need came up. (Maybe some image processing or something?)
There are two cases. Either the model is a list of lists, or the model is a rectangle (or hyperrectangle). The programmer knows which. As to gifting syntax to the first case, I am reminded of that line from Ceylon's motivational preamble, what was it? "Rather clarity than a sea of argot ASCII"? So I want to advocate for what Ceylon already has in this example (i.e. the type rules on the `x[i]` operator).
The want for "clear, concise" syntax, for indexing into a list-of-lists that you know might be an OoB, trades off with the mission to be explicit about those possible `Null`s. This is where I like Ceylon for making some things "difficult". I would say "conscious". I also think this is where the solutions are for the application to provide. Again, this list of lists is not rectangular; the application's purpose would decide how much of a fuss should be made over its handling of an OoB, not to mention the semantics of doing so. So let them work through Ceylon's highly regular, strongly typed, rich system to obtain their objective - and they'll get a Ceylon source file for it.
In the hyperrectangle case, I'm imagining a language module artifact that would have the conveniences, maybe even optimization. It's different.
I hope I haven't made myself look too foolish.
Is there anything speaking against option 1? It's simple and natural, and avoids special casing.
I am kind of rooting for option 1 as well. This is more or less what people will try right away, and it would behave in the least surprising manner.
Well, sure, the problem with option 1 is that the performance is potentially much worse.
How is OoB detected on the JVM & JS in Ceylon? In traditional Java, an exception is thrown by the JVM. In JS, you get undefined.
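The Java half of that statement is easy to demonstrate. The sketch below shows only plain Java behaviour and says nothing about how the Ceylon compiler implements lookups; `OutOfBoundsDemo` and `getOrNull` are hypothetical names.

```java
/** Hypothetical demo class, for illustration only. */
final class OutOfBoundsDemo {
    /** Returns the element, or null if the JVM reports the index as out of bounds. */
    static Integer getOrNull(int[] values, int index) {
        try {
            return values[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            // The JVM performs the bounds check itself and throws on failure.
            return null;
        }
    }
}
```

An explicit `index >= 0 && index < values.length` test before the access would give the same result without relying on the exception.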
I'll remind everyone of a comment in #4517 that suggested a slightly-breaking change: a (defaulted) type argument indicating whether `Correspondence.get()` is optional or not. Alternatively, an `OmniCorrespondence` subinterface of `Correspondence` could force the return value of e.g. `get()` to be non-optional. :trollface:
Anyway I'm getting wild ideas like*):
- `Array<String, 2>` where the number of dimensions is a type argument (think of integers as enumerated instances of `Integer`). `Correspondence<Key, Item, Absent=Null, Dimensions=1>` perhaps? Then the type of `get()` could be `Item(Key[Dimensions])`. A sub-correspondence with only a prefix of the dimensions given would be accessed through another method which accepts an arbitrary number of dimensions.
- `Array<Array<String,1>,1>` vs `Array<String,2>` although in a very clumsy way :)
- `[1,2,3]` is a bidirectional alias for `[1][2][3]` or maybe `[1]?[2]?[3]` (TODO `?[]` operator)
- `[1][2]...[k]` given `k<n` returns a `Correspondence|Absent` regardless of storage model
SIDE NOTE: where I come from, `x[i,j]` would resolve to `x[j][i]` but I am ready to take the plunge!
*) because I'm not yet skilled enough to know it won't work
Ceylon doesn't have true multidimensional lists, but it is possible to work around that using a list of lists as in Java. This works out just perfectly for multidimensional tuples:
However, for other multidimensional lists (including multidimensional arrays) the stacked indexing operator doesn't work:
I can see several possible solutions to this:
1. Make the `x[i]` operator accept a `Correspondence?` as its LHS instead of only `Correspondence` as it does today.
2. Allow `x[i,j]` for things of type `Correspondence<Correspondence<...>>`, where `x[i,j]` means what `x[i][j]` would mean if it were well-typed.
3. Instantiate an `Array` with its dimensions, for example, `Array.ofDimensions(2, 3)(0.0)` and introduce a multidimensional lookup operator of form `x[i,j]` where `x[i,j]` means `x[j+3*i]`.
Option 1 is of course the simplest change.
Options 1 and 2 work for any kind of `Correspondence` and support the current pattern of using lists of lists (of lists, etc).
Option 3 recognizes that lists of lists probably aren't really a very efficient way to lay out multidimensional arrays of numbers, but only really works for `Array`s. Here, a "2D" array of `Float`s would be represented as an `Array<Float>`, but its elements could still be accessed as if they were laid out in two dimensions using `arr[i, j]`.
Thoughts?
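To make the flat layout of Option 3 concrete, here is a minimal Java sketch of row-major storage where `(i, j)` maps to `j + cols * i`, matching the `x[j+3*i]` formula for a 2 by 3 array. The class `FlatGrid` and its methods are hypothetical and only illustrate the indexing arithmetic; they are not the proposed Ceylon API.

```java
import java.util.Arrays;

/** Hypothetical flat 2D array of doubles, for illustration only. */
final class FlatGrid {
    private final double[] cells; // row-major: cell (i, j) lives at j + cols * i
    private final int rows;
    private final int cols;

    FlatGrid(int rows, int cols, double initial) {
        if (rows < 0 || cols < 0) {
            throw new IllegalArgumentException("dimensions must be non-negative");
        }
        this.rows = rows;
        this.cols = cols;
        this.cells = new double[rows * cols];
        Arrays.fill(cells, initial);
    }

    private boolean inBounds(int i, int j) {
        return i >= 0 && i < rows && j >= 0 && j < cols;
    }

    /** Returns null when (i, j) is out of bounds, mirroring an optional lookup. */
    Double get(int i, int j) {
        return inBounds(i, j) ? Double.valueOf(cells[j + cols * i]) : null;
    }

    void set(int i, int j, double value) {
        if (!inBounds(i, j)) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        cells[j + cols * i] = value;
    }
}
```

Note that each index has to be checked against its own dimension. Checking only the flat index would let `(0, 3)` in a 2 by 3 grid silently alias `(1, 0)`.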