Closed rbuckton closed 4 years ago
Becomes this at runtime: array[Symbol.slice](1:3:2)
Was this meant to be array[Symbol.slice](1, 3, 2)?
Yes, thanks. I've updated the issue.
"slice" to me makes no sense as a concept applied to things that aren't lists (such as arrays, strings, Sets) or things without indexes.
If we want a generic extraction API, we should call it something else, and it shouldn't solely use numbers.
@ljharb we can bikeshed on Symbol.slice, but my point is that Array, String, and Set aren't necessarily the only "list"-like things in JavaScript, as users can define their own "list"-like classes that would like to use this feature. The name Symbol.slice was chosen in this case as the proposal defines this as "slice notation".
I wonder if we might want to dust off the Symbol.geti/Symbol.seti proposal as well, and consider adding a Range primitive with literal syntax:
// built-in `Range` class
class Range {
  constructor(start = 0, end = -1, step = 1) {
    this.start = start;
    this.end = end;
    this.step = step;
  }
  [Symbol.geti](obj) {
    return obj[Symbol.slice](this.start, this.end, this.step);
  }
  [Symbol.seti](obj, values) {
    return obj[Symbol.splice](this.start, this.end, values);
  }
}
// Literal `Range` syntax:
let range = 1:3;
// -> range = new Range(1, 3);
// Get a range
let source = [1, 2, 3, 4, 5];
let chunk = source[range];
// -> chunk = range[Symbol.geti](source);
// -> chunk = source[Symbol.slice](1, 3, 1);
// -> chunk = [2, 3]
source[range] = [7, 8, 9];
// -> range[Symbol.seti](source, [7, 8, 9])
// -> source[Symbol.splice](1, 3, [7, 8, 9])
console.log(source); // 1, 7, 8, 9, 4, 5
While there would definitely be some indirection under the covers, it's very flexible, consistent, and cohesive.
One caveat is that a literal range syntax would be ambiguous in a conditional, so you would have to require parens for a literal range expression (e.g. x ? (1:2) : (3:4)).
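The Range idea above can be approximated in today's JavaScript without any new syntax. The following is only a sketch: the four symbols are local stand-ins (not real well-known symbols), the methods are patched onto Array.prototype purely for illustration, and the range literal is replaced by an explicit constructor call.

```javascript
// Local stand-ins for the proposed symbols (hypothetical, not part of the language).
const slice = Symbol("slice");
const splice = Symbol("splice");
const geti = Symbol("geti");
const seti = Symbol("seti");

// For this sketch only: give arrays the two protocol methods.
Array.prototype[slice] = function (start, end, step = 1) {
  const out = [];
  for (let i = start; i < end; i += step) out.push(this[i]);
  return out;
};
Array.prototype[splice] = function (start, end, values) {
  // Replace the elements in [start, end) with `values`.
  this.splice(start, end - start, ...values);
};

class Range {
  constructor(start, end, step = 1) {
    this.start = start;
    this.end = end;
    this.step = step;
  }
  [geti](obj) {
    return obj[slice](this.start, this.end, this.step);
  }
  [seti](obj, values) {
    obj[splice](this.start, this.end, values);
  }
}

const source = [1, 2, 3, 4, 5];
const range = new Range(1, 3);        // stands in for `1:3`
const chunk = range[geti](source);    // stands in for `source[range]`
range[seti](source, [7, 8, 9]);       // stands in for `source[range] = [7, 8, 9]`

console.log(chunk);  // [2, 3]
console.log(source); // [1, 7, 8, 9, 4, 5]
```

The explicit calls make the indirection visible: the key object decides which protocol method on the target gets invoked.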
@rbuckton nice idea
Most other languages have that start : end[ : step] syntax (sometimes start[ : step]: end), but I find the step argument not very useful. Replacing it by a callback would have benefits (performance, ..) even if it looks weird at first glance:
- 1:9:2 would become 0:4:i=>1+i*2
- (1:10).map(() => 100*Math.random()) would become 1:10:() => 100*Math.random()
1:9:2 would become 0:4:i=>1+i*2
This seems like it would be too complicated for the array selector case, compared to this:
const ints = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const odds = ints[0::2]; // [1, 3, 5, 7, 9];
const evens = ints[1::2]; // [2, 4, 6, 8, 10];
Besides, you could already map with Array.from:
const odds = Array.from([0:5], i => (i * 2) + 1);
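For comparison, here is a runnable approximation of the stepped slices above in current JavaScript. The helper name sliceWithStep is made up for this sketch, and it only handles non-negative indexes and a positive step.

```javascript
// Hypothetical helper approximating ints[start:end:step] (positive step only).
function sliceWithStep(list, start = 0, end = list.length, step = 1) {
  const out = [];
  for (let i = start; i < end; i += step) out.push(list[i]);
  return out;
}

const ints = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const odds = sliceWithStep(ints, 0, undefined, 2);  // like ints[0::2]
const evens = sliceWithStep(ints, 1, undefined, 2); // like ints[1::2]

// The Array.from alternative, without any range syntax.
const mapped = Array.from({ length: 5 }, (_, i) => i * 2 + 1);

console.log(odds);   // [1, 3, 5, 7, 9]
console.log(evens);  // [2, 4, 6, 8, 10]
console.log(mapped); // [1, 3, 5, 7, 9]
```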
I wonder if we might want to dust off the Symbol.geti/Symbol.seti proposal as well, and consider adding a Range primitive with literal syntax:
The problem with adding a new Range primitive is that it would complicate GetValue/PutValue, regressing performance for all property access.
The win with just the slice notation is that it's just syntax which can be directly rewritten in the parser to be a call out to Symbol.slice, and we can reuse all the magic sauce we have for optimizing regular property access. You only pay for the call out to Symbol.slice if you use slice notation, not on every property access. Also, since this is just syntax, we can easily optimize this with ICs.
The problem with adding a new Range primitive is that it would complicate GetValue/PutValue, regressing performance for all property access.
Hosts like V8 and Chakra already optimize property access and have opt-outs for non-PropertyKey values (e.g. obj.foo is fast while obj[{ toString() { return "foo"; } }] is slow, but both work).
There's always at least an extra type check (load + jump) required to bail out on the fast path.
Also, the Range name is already taken by the text selection API (https://developer.mozilla.org/en-US/docs/Web/API/Range), and by document.createRange as well.
In addition to the issues Sathya raised, it seems somewhat complicated grammatically to give : yet another meaning outside of a somewhat restricted context, given its other usages.
This possible new meaning for : would be restricted to array literal notation. I don't think it complicates things much for the parser, does it?
@rbuckton I think needing a .map right after (or Array.from like you said, but it's quite verbose) is much more frequent than needing this step parameter. That step parameter is also just like a less powerful .filter. But in my proposal, it'd be confusing to pass a function expression as the 3rd parameter, especially if the first 2 only accept number literals (#26), so I'm fine with this [start:end:step] after all
This possible new meaning for : would be restricted to array literal notation. I don't think it complicates things much for the parser, does it?
That's what I'm proposing, but not what @rbuckton seems to want according to https://github.com/tc39/proposal-slice-notation/issues/19#issuecomment-415995994.
@gsathya: I assume you are referring to this: let range = 1:3? In effect I'm saying it would be a "nice to have". If we ever did decide to add https://github.com/tc39/proposal-slice-notation/issues/19#issuecomment-415995994, it could be achieved as a series of follow-on proposals:
1. a[start:end:step].
2. Follow-on proposals:
   1. @@slice and @@splice:
      a[1:3] -> a[@@slice](1, 3, 1)
      a[1:3] = b -> a[@@splice](1, 3, b)
   2. @@geti and @@seti:
      a[x] -> x[@@geti](a)
      a[x] = b -> x[@@seti](a, b)
   3. Range literal (1:3):
      i. a[1:3]
      ii. a[new Range(1, 3, 1)]
      iii. new Range(1, 3, 1)[@@geti](a)
      iv. a[@@slice](1, 3, 1)
To avoid ambiguities with conditionals and labels, we could restrict ranges to element access (a[1:3]) and parenthesized expressions ((1:3)).
Also, if Range (or whatever name we choose) supports @@iterator, you could easily create arrays of ranges, or for..of over a range:
// create array
const ar = [...(1:5)]; // [1, 2, 3, 4]
// or, allow without parens in array
const ar = [...1:5]; // [1, 2, 3, 4]
for (const x of (0:10)) { // 0, 1, 2, ..., 9
}
I don't expect [...1:4, ...6:10] cases to be frequently used, but it's indeed nicer than [...[1:4], ...[6:10]]
for (const x of (0:10)) doesn't simplify much compared to for (const x of [0:10])
I see how this range literal fits well in this proposal, this looks great
for (const x of (0:10)) doesn't simplify much compared to for (const x of [0:10])
Except that iterating over a Range would be far less memory intensive:
for (const x of (0:Number.MAX_SAFE_INTEGER)) {
// only need to hold four numbers (start, end, increment, and current) and the Range object in memory
}
for (const x of [0:Number.MAX_SAFE_INTEGER]) {
// need to hold an Array object with 9,007,199,254,740,991 numbers in memory!
}
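The memory argument can be checked today with a generator, which is roughly what an iterable range object would behave like. This is a sketch only: the range function below is a userland stand-in for the proposed (start:end) literal, not part of any proposal text.

```javascript
// Lazy stand-in for an iterable range: yields one number at a time,
// so only the loop state is kept in memory.
function* range(start, end, step = 1) {
  for (let i = start; i < end; i += step) yield i;
}

// Iterating a huge range is fine as long as we stop early;
// nothing close to Number.MAX_SAFE_INTEGER elements is ever allocated.
let sum = 0;
for (const x of range(0, Number.MAX_SAFE_INTEGER)) {
  if (x >= 5) break;
  sum += x;
}
console.log(sum); // 10

// Spreading materializes the values, like [...(1:5)] would.
const ar = [...range(1, 5)];
console.log(ar); // [1, 2, 3, 4]
```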
@rbuckton would the range literal expose methods like .map, .filter?
This would be interesting:
(1:8).map(x => x**2)
(0:5).map(i => (0:5).map(j => 5*i+j))
If not, it's still possible to spread it of course
[...1:8].map(x => x**2)
[...0:5].map(i => [...0:5].map(j => 5*i+j))
[...] would the range literal expose methods [...]
No, I wouldn't expect it to.
@rbuckton Slice, as in new Slice(1, 3) // (1:3), could maybe be a good name for this new literal's constructor (since Range is taken)
Should we make a PR for this, to sum it up?
Slice.prototype[@@splice] might seem a little strange though. What about Interval (https://en.wikipedia.org/wiki/Interval_(mathematics))?
I'm confused, why would we want a splice symbol? splice is an abomination.
It seems odd to have x = ar[1:3] without the inverse ar[1:3] = x.
I find the former intuitively useful and the latter violently unpalatable; I don't see an advantage to syntax that creates a ton of observable operations and also represents what's become a very unidiomatic pattern (optional chaining has no plans to add optional assignment, for comparison).
I feel x[range] could cause confusion. JS programmers always treat x[y] as a simple property lookup and I believe we'd better keep it simple. Instead of inventing new syntax let range = 1:3; let chunk = source[range]; I'd rather simply use let range = [1, 3]; let chunk = source.slice(...range);.
A syntactical expression (foo[1:3]) is always better than an 'API'/dynamic one (foo.slice(1,3)), just like [1, 2] would be better than Array(1, 2), because it can throw if it's malformed, it can allow perf optimizations I guess, ...
The biggest benefit, for me at least, is the range creation discussed in this issue, because Array.from({length: ..}, (_, i) => ...) is becoming common. For example, there are 13 occurrences of Array.from({ length in https://github.com/30-seconds/30-seconds-of-code. And it's awkward, error-prone, impractical, verbose, simply a bad sign (https://github.com/graphql/graphql.github.io/pull/456#discussion_r199057305). So [...0:10] would be a great addition to the language
Don't get me wrong. I think the foo[1:3] form is an acceptable syntax sugar. But I think foo[range] is not a good idea, just like the current proposal does not allow foo[complexExpression1:complexExpression2].
@rbuckton what does the i mean in @@geti, @@seti?
Other thing, for @@splice:
const a=[]; a[2:6:2] = 4; // a will be [undefined,undefined,4,undefined,4] or still [] ?
const a=[1,1,1,1]; a[1:3] = [2, 4]; // would an array be 'spread'?
// so a would be [1,2,4,1]? or [1,[2,4],[2,4],1]
I guess the latter, so it could only assign a same value to a range of indexes
Concerning the Follow-on Proposals:
2.1
a[1:3] -> a[@@slice](1, 3, 1)
a[1:3] = b -> a[@@splice](1, 3, b)
I guess you mean a[@@splice](1, 3, 1, b), or it could maybe also accept a[@@splice]((1:3), b) or a[@@splice](new Range(1, 3, 1), b)
2.2
I find @@geti, @@seti redundant with 2.1, just by switching the Range and the target array. I don't think Range should have this responsibility; it should just be 'read-only' and iterable.
Personally I'd drop them (so 2.3.iii as well)
For the naming, Interval sounds too generic since it's more specifically an integer interval here. Sequence could fit, but I think we should keep Range/range, and maybe have it attached to Array, e.g. new Array[Symbol.range](1, 8, 2), to avoid any conflict with DOM Range.
Hope we can merge that into the proposal; I was trying to see how to implement a babel plugin for it
@caub
what does the i mean in @@geti, @@seti?
In this case, "inverted". Basically, the semantics of @@geti would invert the [[Get]] operation from obj[key] to key[@@geti](obj), giving key the ability to determine how to get the value from obj.
A good example for @@geti and @@seti would be WeakMaps:
WeakMap.prototype[@@geti] = function (target) { return this.get(target); }
WeakMap.prototype[@@seti] = function (target, value) { this.set(target, value); }
const weakPropertyX = new WeakMap();
const obj = {};
obj[weakPropertyX] = 1;
console.log(obj[weakPropertyX]); // prints 1
There are plenty of other use cases for @@geti/@@seti as well:
function pick(...names) {
  return {
    [Symbol.geti]: (obj) => names.reduce((result, name) => (result[name] = obj[name], result), {}),
    [Symbol.seti]: (target, source) => { for (const name of names) target[name] = source[name]; }
  };
}
const obj = { a: 1, b: 2, c: 3 };
// pick properties to read from `obj`
const obj2 = obj[pick("a", "c")];
obj2; // { a: 1, c: 3 };
// pick properties to write to 'obj'
obj[pick("a", "b")] = { a: 4, b: 5 };
obj; // { a: 4, b: 5, c: 3 }
The @@geti/@@seti methods would be a convenient and consistent mechanism for all of these cases (including a Range).
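Since obj[key] cannot be intercepted like this today, the inverted get/set can only be emulated with explicit helper functions. The sketch below assumes local geti/seti symbols and hypothetical get/set helpers that stand in for what obj[key] and obj[key] = value would do under the idea; the type check in hasHook is also the kind of extra check the performance discussion above refers to.

```javascript
// Local stand-ins for the proposed @@geti / @@seti symbols (hypothetical).
const geti = Symbol("geti");
const seti = Symbol("seti");

// Does `key` implement the given hook? Primitives never do in this sketch.
const hasHook = (key, hook) =>
  key !== null && typeof key === "object" && typeof key[hook] === "function";

// Stand-ins for `obj[key]` and `obj[key] = value`.
const get = (obj, key) => (hasHook(key, geti) ? key[geti](obj) : obj[key]);
const set = (obj, key, value) => {
  if (hasHook(key, seti)) key[seti](obj, value);
  else obj[key] = value;
};

// A key object that reads or writes several properties at once.
function pick(...names) {
  return {
    [geti]: (obj) => Object.fromEntries(names.map((name) => [name, obj[name]])),
    [seti]: (target, source) => {
      for (const name of names) target[name] = source[name];
    },
  };
}

// A WeakMap used as a "weak property" key.
class WeakProperty extends WeakMap {
  [geti](target) { return this.get(target); }
  [seti](target, value) { this.set(target, value); }
}

const obj = { a: 1, b: 2, c: 3 };
const picked = get(obj, pick("a", "c"));   // { a: 1, c: 3 }
set(obj, pick("a", "b"), { a: 4, b: 5 });  // obj is now { a: 4, b: 5, c: 3 }

const weakX = new WeakProperty();
set(obj, weakX, 1);
console.log(picked, obj, get(obj, weakX), get(obj, "c"));
```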
@ljharb while I understand your concern about @@splice (and most languages that implement some kind of array slice notation don't support this either), I do wonder about the inconsistency of not having it:
a = b; // regular assignment
[a] = [b]; // destructuring assignment
a[0] = b[0]; // regular assignment
a[1:3] = b[1:3]; // not supported?
so a[1:3] = 2 is invalid right? it has to be a[1:3] = [4, 4] for example
I guess a[1:3] = [4] would assign 4 to a[1] and undefined to a[2], or would it leave it the same?
and a[1:3] = a[1:3:-1] would switch items :)
It's another reason to not apply this slice-notation to strings, since setter/splice wouldn't make sense for them. But it would still be very interesting to have @@slice and @@geti for strings
@caub:
a[1:3] = 2 would probably be invalid because 2 is not an array or iterable (see below).
For x[a:b] = z, I had imagined the semantics would be something like x.splice(a, (b - a), ...z): the elements at x[a:b] are removed from x and the elements in z are inserted in their place. This is also why @@splice ignores the "step" argument, because all of those elements would be replaced.
Also, removing a section of the array could be something like a[5:10] = []
but splice wouldn't support the step (in start:end:step)? I mean it gets very confusing:
a=[1,2,3,4,5,6]; a[0:4] = [7,8,9,10,11] would transform a into [7,8,9,10,11,5,6], just like Array.prototype.splice
but a=[1,2,3,4,5,6]; a[0:4:2] = [7,8,9,10,11] would transform a into [7,2,8,4,9,6,10,11]?
If we ever want to change an array, we can always do a = [...a[0:i], ...a[i+1:]], for example to remove the ith item. Having only Array @@slice and Range @@geti could be simpler (and it'd work better with 'read-only' strings).
But I admit that with slice only we can't do the second example (insert items every step), so I'm neutral on @@splice/@@seti
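The step-less reading discussed above, x.splice(a, b - a, ...z), can be tried directly. The helper name assignSlice is made up for this sketch, and rejecting non-iterable right-hand sides is an assumption drawn from the "not an array or iterable" remark rather than from any spec text.

```javascript
// Hypothetical stand-in for `target[start:end] = values` (no step support).
function assignSlice(target, start, end, values) {
  if (values == null || typeof values[Symbol.iterator] !== "function") {
    throw new TypeError("right-hand side must be iterable");
  }
  // Remove the elements in [start, end) and insert `values` in their place.
  target.splice(start, end - start, ...values);
  return target;
}

const a = [1, 2, 3, 4, 5, 6];
assignSlice(a, 0, 4, [7, 8, 9, 10, 11]);
console.log(a); // [7, 8, 9, 10, 11, 5, 6]

// Assigning an empty list removes a section, like a[5:10] = [].
const b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
assignSlice(b, 5, 10, []);
console.log(b); // [0, 1, 2, 3, 4]

// A non-iterable right-hand side, like a[1:3] = 2, is rejected.
let rejected = false;
try {
  assignSlice([1, 2, 3], 1, 3, 2);
} catch (err) {
  rejected = err instanceof TypeError;
}
console.log(rejected); // true
```

Note that the array length changes whenever the number of inserted values differs from end - start, which is part of why the stepped variant is hard to define.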
Could it work in destructuring? like so:
const a = [1,2,3,4,5];
const {[0:-1]: a1, [a.length-1]: last} = a;
// a1 == [1,2,3,4]
// last == 5 // this already works
C# 8 has added ranges and indexes, which includes both syntax and types for these behaviors:
- The Index type represents a position relative to the start or end of an indexed collection. The ^n syntax is shorthand for new Index(n, true).
- The Range type represents a start and end Index within an indexed collection. The x..y syntax is a shorthand for new Range(x, y).
I'll start writing a babel plugin for it
I did a polyfill with acorn: https://github.com/brigand/jellobot/pull/31/files#diff-a1284a77ff99b45ce588591eddda54a9, I'll try with babel later
In light of #30, I've been tinkering with what this might look like in ECMAScript: https://gist.github.com/rbuckton/174b02d2a43573627201f8057701044c:
- Index built-in object that can be used to compute an index relative to the start or end of a collection.
- ^n syntax as a shorthand for new Index(n, "end")
- Interval built-in object that can be used to compute the start, end, and step for a collection.
- (m:n) and o[m:n] syntax (as well as (m:n:s) and o[m:n:s] for a custom stepping value).
- @@geti symbol for an "inverted-get": a[b] --> b[@@geti](a)
- @@seti symbol for an "inverted-set": a[b] = c --> b[@@seti](a, c)
- @@indexedGet symbol used to define a method to get a value based on an Index.
- @@indexedSet symbol used to define a method to set a value based on an Index.
- @@slice symbol used to define a method to get values based on an Interval.
- @@index symbol used to define the method on an Index used to calculate the actual index based on a provided length.
- @@interval symbol used to define the method on an Interval used to calculate the actual start/end/step based on a provided length.
The @@index and @@interval symbols provide a mechanism to calculate an actual index or interval based on a provided length. This would allow us to define an arbitrary endpoint like ^1 to mean "one from the end" when the "end" is not yet known.
The @@indexedGet, @@indexedSet, and @@slice symbols provide an extensibility mechanism for users to implement custom collection classes and control how to determine the length to pass to an Index or Interval.
Index Example:
let ar = ["a", "b", "c", "d"];
let m1 = ^1;
// --> new Index(1, "end");
ar[m1]; // "d"
// --> m1[Symbol.geti](ar)
// --> ar[Symbol.indexedGet](m1)
// --> ar[m1[Symbol.index](ar.length)]
// --> ar[ar.length - 1]
// --> ar[4 - 1]
// --> ar[3]
// --> "d"
Interval Example:
let ar = ["a", "b", "c", "d"];
let r = (0:^1);
// --> new Interval(new Index(0, "start"), new Index(1, "end"))
ar[r]; // ["a", "b", "c"]
// --> r[Symbol.geti](ar)
// --> ar[Symbol.slice](r)
// --> Slice of `ar` for `r[Symbol.interval](ar.length)` as ([start, end, step])
// --> Slice of `ar` for `[r.start[Symbol.index](ar.length),
// r.end[Symbol.index](ar.length),
// r.step]` as ([start, end, step])
// --> Slice of `ar` for `[0, ar.length - 1, 1]` as ([start, end, step])
// --> Slice of `ar` for `[0, 4 - 1, 1]` as ([start, end, step])
// --> Slice of `ar` for `[0, 3, 1]` as ([start, end, step])
// --> ["a", "b", "c"]
Host engines like V8 could choose to optimize code paths during compilation to remove the reification of Index and Interval types at runtime.
(edit: switched from Range to Interval)
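The two desugaring traces above boil down to resolving a relative position against a length. A minimal userland sketch of that step follows; the plain resolve methods stand in for the @@index and @@interval symbol methods, sliceBy stands in for @@slice, and none of these names come from the gist itself.

```javascript
// A position counted from the start or from the end of a collection.
class Index {
  constructor(value, from = "start") {
    this.value = value;
    this.from = from;
  }
  // Stand-in for the @@index method: compute the actual index for a length.
  resolve(length) {
    return this.from === "end" ? length - this.value : this.value;
  }
}

// A start/end/step triple whose endpoints are Index objects.
class Interval {
  constructor(start, end, step = 1) {
    this.start = start;
    this.end = end;
    this.step = step;
  }
  // Stand-in for the @@interval method: compute [start, end, step] for a length.
  resolve(length) {
    return [this.start.resolve(length), this.end.resolve(length), this.step];
  }
}

// Stand-in for an array's @@slice method: the collection supplies its own length.
function sliceBy(list, interval) {
  const [start, end, step] = interval.resolve(list.length);
  const out = [];
  for (let i = start; i < end; i += step) out.push(list[i]);
  return out;
}

const ar = ["a", "b", "c", "d"];
const m1 = new Index(1, "end");                          // stands in for ^1
const r = new Interval(new Index(0), new Index(1, "end")); // stands in for (0:^1)

console.log(ar[m1.resolve(ar.length)]); // "d"
console.log(r.resolve(ar.length));      // [0, 3, 1]
console.log(sliceBy(ar, r));            // ["a", "b", "c"]
```

Because the collection passes in the length, a custom collection could pass something other than a code-unit or element count, which is the extensibility point described above.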
What's the advantage of 0:^n over 0:-n or 0: (undefined endIndex to represent ^0)?
Is your idea to completely avoid this notation for strings, since assignment expressions wouldn't make sense for them (we can also add the arguments of https://github.com/tc39/proposal-slice-notation#should-we-ban-slice-notation-on-strings)?
I feel like it'd be good to avoid introducing new built-in objects (to reduce the "cost" and complexity for this proposal). I thought about a Slice at some point, but it's possible for an engine to handle the range syntax and range expression without any additional built-in. Like ArrowFunctionExpression for example, there isn't any constructor, or like many other operators.
I implemented slice-notation/slice-expression in https://github.com/engine262/engine262/pull/89/files#diff-7a3164ab8de945e8bd82f29aa3f3b300R10-R27
It should actually be this (using Symbol.slice #1):
if (expression.type === 'SliceExpression') {
  let start, end, step;
  if (expression.startIndex) {
    const startPropertyRef = yield* Evaluate(expression.startIndex);
    start = Q(GetValue(startPropertyRef));
  }
  if (expression.endIndex) {
    const endPropertyRef = yield* Evaluate(expression.endIndex);
    end = Q(GetValue(endPropertyRef));
  }
  if (expression.step) {
    const stepPropertyRef = yield* Evaluate(expression.step);
    step = Q(GetValue(stepPropertyRef));
  }
  const bv = Q(RequireObjectCoercible(baseValue));
  const slice = Q(GetMethod(bv, wellKnownSymbols.slice));
  // #sec-call
  return Call(slice, Value.undefined, [start, end, step]);
}
It's slightly limited compared to a Slice built-in object, or an Interval built-in like you propose, but only for something like:
arr[(() => (0:2))()] // would not be like arr[0:2]
// it'd be like arr[ToString((() => (0:2))())] rather
Because we don't evaluate/resolve the SliceExpression like we could with a built-in object, but I don't think it's an issue, this feature is intended to be used 'statically'
One motivator for ^1 over -1 is that ar[-1] already has a meaning in ECMAScript, while ar[^1] does not. You could also conceivably use it with other APIs (i.e. text.indexOf("a", ^3)).
Well true, I don't think it's possible to extend .indexOf to handle a negative startIndex because of backward-compatibility
There's https://github.com/keithamus/proposal-array-last proposing arr.lastItem and arr.lastIndex, but that's not really practical
I like this ^n idea, and I also think we could avoid a built-in Index object for it and, similarly to what I did, only have syntax for it, and evaluate it in context (only MemberExpression, elsewhere it doesn't really make sense)
EDIT: it seems .indexOf already works with negative indexes:
[...'banana'].indexOf('a', -3)
// 3
but only for Array.prototype.indexOf
'banana'.indexOf('a', -3)
// 1
but there are String.prototype.lastIndexOf, Array.prototype.lastIndexOf for those cases
Is your idea to completely avoid this notation for strings, since assignment expressions wouldn't make sense for them [...]
Given the feedback in this thread, @@splice seems to be off the table for now. The upside of the approach I outlined WRT strings is that a String could control how a relative "end" is applied:
- If slice is based on code units, then we would define @@slice to call @@interval on the supplied interval with length.
- If slice is based on code points, then we would instead define @@slice to call @@interval on the supplied interval with the number of code points in the string.
There could also be CodePointInterval and CodePointIndex, and @@codePointInterval and @@codePointIndex symbol methods, to give you fine grained control over the behavior:
text[0:^1] // slice via code units
text[new CodePointInterval(0, ^1)] // slice via code points
// helper to convert code unit to code point for index/interval:
const cp = {
  [Symbol.index]: index => new CodePointIndex(index.value, index.end),
  [Symbol.interval]: ival => new CodePointInterval(ival.start, ival.end, ival.step)
};
text[cp[^1]] // last code point
text[cp[0:^1]] // string except last code point
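The difference between the two interpretations is easy to see with existing string APIs. The sketch below uses String.prototype.slice for the code unit view and Array.from (which iterates by code point) for the code point view; it only illustrates the distinction and does not implement the proposed symbols.

```javascript
// "ab" followed by U+1F600, which takes two UTF-16 code units.
const text = "ab\u{1F600}";
console.log(text.length); // 4 code units

// Code unit view: dropping "one from the end" cuts the surrogate pair in half.
const byCodeUnits = text.slice(0, text.length - 1);
console.log(byCodeUnits.length);                     // 3
console.log(byCodeUnits.charCodeAt(2).toString(16)); // "d83d", a lone high surrogate

// Code point view: the string iterator yields whole code points.
const codePoints = Array.from(text);
console.log(codePoints.length); // 3 code points
const byCodePoints = codePoints.slice(0, codePoints.length - 1).join("");
console.log(byCodePoints); // "ab"
```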
Yes, I agree, and I'd still prefer to handle those cases without additional built-ins (or at least with fewer additional built-ins)
We can also think of BigInts: 0n:2n:1n, and they'd work in my implementation without defining new built-ins (only Symbol.slice actually, and possibly also Symbol.index if we go for the ^n syntax)
The specific behavior for String you described would be inlined in String.prototype[Symbol.slice], and it'd be overridable if needed (extending String, I don't know if it's a good idea though)
@rbuckton Any update?
It seems there are too many things we want to add; maybe we can minimize them and write a separate proposal? For example, we can first specify
- (^1)
- (0:^1)
- x[^1] syntax and semantics for Array, TypedArray and String
- x[0:^1] syntax and semantics for Array, TypedArray and String
and leave all other things like ^a syntax, Index, a:^b syntax, IndexRange (Interval) and symbols to follow-on proposals.
I think the idea was to desugar the reverse index syntax and the index range syntax to Index and IndexRange so they would come together
It would be great if you could specify how the slice notation should apply to an object, perhaps via a Symbol.slice:
Then syntax like this:
Becomes this at runtime:
The advantage of this is that we can specify the syntax in terms of a method, which allows us to specify the behavior of the slice notation on strings to work over code points rather than characters, and the behavior of the slice notation on typed arrays.
In addition, users can define how the slice notation applies to their own classes:
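As a rough illustration of that last point, a user-defined class could opt in by implementing the method. This is a sketch, not the example from the original post: Symbol.slice does not exist, so a local symbol stands in for it, the Playlist class is invented for the example, and the explicit method call stands in for what list[1:5:2] might desugar to.

```javascript
// Local stand-in for the proposed Symbol.slice (hypothetical).
const slice = Symbol("slice");

// An invented list-like class that defines its own slicing behavior.
class Playlist {
  constructor(...tracks) {
    this.tracks = tracks;
  }
  // Returns a new Playlist rather than a plain array; positive step only.
  [slice](start = 0, end = this.tracks.length, step = 1) {
    const picked = [];
    for (let i = start; i < end; i += step) picked.push(this.tracks[i]);
    return new Playlist(...picked);
  }
}

const list = new Playlist("a", "b", "c", "d", "e");
const everyOther = list[slice](1, 5, 2); // what list[1:5:2] might become
console.log(everyOther.tracks);          // ["b", "d"]
console.log(everyOther instanceof Playlist); // true
```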