tc39 / proposal-pattern-matching

Pattern matching syntax for ECMAScript
https://tc39.es/proposal-pattern-matching/
MIT License
5.5k stars 89 forks source link

Do we need iterator/object property caching? #324

Open littledan opened 6 months ago

littledan commented 6 months ago

Is it necessary to cache object properties and iteration results? I believe this will be pretty difficult to optimize out in engines and transpilers without sacrificing correctness, so it may be a real performance cost at runtime. The "access multiple times" semantics seem pretty intuitive/direct to me.

I understand that there is some concern about exhausting iterators. However, array destructuring is for iterables--you always get a fresh call to [Symbol.iterator] before something else happens, and you could do your caching in there when it's important to you. I'm not sure if we should be defending against usage patterns where iteration is destructive.

(I thought I had an issue thread about this, where I was worrying about soundness/consistency, but the current Map approach is consistent; anyway, I can't find that old thread, so filing a new one.)

ljharb commented 6 months ago

It’s discussed a lot in #216.

tldr, yes - without the caching, you can’t match against the first N items and then the first N + 1. iterables aren’t reusable either - there’s no guarantee that iterating twice is possible.

Jack-Works commented 6 months ago

I agree cache is slow and hard to implement. I think it's possible to remove them to speedup common cases and break for special objects & iterators.

littledan commented 6 months ago

@ljharb I understand that argument and I think performance and simplicity of common cases is the concern of overriding importance.

ljharb commented 6 months ago

For the "iterators aren't reusable" piece, yes. Are you suggesting it's reasonable, in this example, to create three separate iterators from the iterable?

match (arr) {
  when [a]: …;
  when [a, b]: …;
  when [a, b, …]: …;
}
littledan commented 6 months ago

Yes.

ljharb commented 6 months ago

This would mean that pattern matching can't ever safely take an iterator - including something from iterator helpers, like match (arr.values().map(m).take(n)) {. That seems like it'd be a pretty common case.

tabatkins commented 6 months ago

Right, having to eagerly cast such things into an array seems... unfortunate.

jaskp commented 6 months ago

This would mean that pattern matching can't ever safely take an iterator - including something from iterator helpers, like match (arr.values().map(m).take(n)) {. That seems like it'd be a pretty common case.

@ljharb What makes such a case seem common when iterator helpers haven't landed yet and iterators are generally rarely used in JS? And basically all functions typed to accept sequences of values in TypeScript expect materialized arrays, not iterators, which I believe would slow down adoption even further. Rust and Python can't match on iterators either, can they?

Right, having to eagerly cast such things into an array seems... unfortunate.

@tabatkins Can you please elaborate on why you consider match ([...arr.values().map(m).take(n)]) {} unfortunate?

ljharb commented 6 months ago

@jaskp iterator helpers are polyfillable and are currently shipped in Chrome.

That's unfortunate because it's creating an array unnecessarily, and it's extra boilerplate that users will have to type.

Jack-Works commented 6 months ago

what if we eagerly consume the iterator at the first time when we need an array match?

jaskp commented 6 months ago

@jaskp iterator helpers are polyfillable and are currently shipped in Chrome.

That's unfortunate because it's creating an array unnecessarily, and it's extra boilerplate that users will have to type.

@ljharb is there some example of when you would want to match against an iterator? I can't really think of such a situation. I expect most matching to be done on POJO's and tuple-like arrays to discriminate between shapes. The only matching that currently makes sense to me in the context of an iterator is something like this

someIterable.values().map(elem => match (elem) { ... })

So it's a bit difficult for me to imagine a situation where it's wise to match against an iterator, I might have missed something in the discussion though. Perhaps the use-case is something JS specific and that's why Rust and Python don't need it?

And if some iterator really wanted to be matched against, couldn't that be implemented in the iterator itself and moved outside of this proposal? Iterators are stateful after all.

ljharb commented 6 months ago

@jaskp there are tons. arrays, maps, and sets all have .keys(), .values(), .entries(), specifically so you can choose which kind of iterator you want. The Iterator.range proposal will create an iterator as well, as will all of the iterator helpers.

"iterator" is a first-class thing - "iterable" is not.

jaskp commented 6 months ago

there are tons

@ljharb can I see one? (one situation where you should be matching on an iterator)

ljharb commented 6 months ago

I just named a few. When i want to check a map’s values, instead of the entries, for example.

jaskp commented 6 months ago

@ljharb Sorry for the misunderstanding, I meant an example of a specific situation, such as the "motivating examples" in the proposal.

Basically I would like to challange the notion that matching on iterators is useful, because I can't come up with a situation where it would be useful, so it seems like unnecessary complexity that is slowing this proposal down.

For a sanity check, I prompted ChatGPT and, although it completely hallucinated the syntax, it could only come up with a similar (counter)example as mine, i.e. matching on elements of the iterator, not the iterator itself:

function* mixedIterator() {
    yield 42;
    yield "hello";
    yield 3.14;
    yield "world";
    yield true;
}

for (const item of mixedIterator()) {
    match (item) {
        case let number:
            console.log("Found a number:", number);
            // Process the number
            break;
        case let string:
            console.log("Found a string:", string);
            // Process the string
            break;
        default:
            console.log("Found another type:", item);
            // Handle other types
    }
}

So is anyone aware of any motivation to support matching on iterators?

ljharb commented 6 months ago

Oh, maybe we’re talking past each other. I want to match on the contents of iterators; never on the iterator itself.

jaskp commented 6 months ago

Oh, maybe we’re talking past each other. I want to match on the contents of iterators; never on the iterator itself.

@ljharb No no in this regard we are on the same page 😄 I'm asking for a motivating example of why you would match on the contents of an iterator. The only single idea I have that makes sense is asserting non-emptiness, like:

function toNonEmptyArray(it) {
  return match (it) {
    case []:
        (() => throw Error())()
    case [...let items]:
        items 
  }
}

But I think this single use-case isn't at all worth all the iterator caching hassles. Do you have any other example where you want to match on the contents of an iterator? I'm just trying to understand the usefulness of being able to do this

tabatkins commented 6 months ago

The answer was already stated by Jordan - JS recognizes the concept of iterators and uses them all over the place, but iterables are a much fuzzier concept that the language doesn't officially bless. It's just "a thing that can give you an iterator", and all iterators are iterables that return themselves. You appear to be operating under the assumption that an iterable is "a thing which can repeatedly generate distinct iterators that will go over the same data", and that's not the case here.

The caching behavior aids in handling iterables, but it also helps much more widely in any case where you'd otherwise be forced to extract a value into a temp variable: expensive computations, non-deterministic values, etc. It is expected that people will commonly write multiple match arms which go over the same data in slightly different ways (checking for a length-2 array, then a length-3 array; or checking for a particular key, then a different particular key), and match much of the same items between the arms. They'll expect that these tests work efficiently and predictably, returning the same match result every time and without re-rerunning expensive computations. They don't have the ability to run arbitrary temp-variable extractions in the middle of a match block, after all.

So, the caching behavior isn't strictly necessary, no. But without it, the feature behaves much less predictably in several common ways, which we expect will cause authors trouble regularly.

Jack-Works commented 6 months ago

we can still support iterators, but drop support for infinite iterators. this will make things easier.

Currently we fetch element in iterators on demand. We can change to consuming all elements and cache it. this will make the spec simpler and still support matching with iterators.

ljharb commented 6 months ago

That won't likely achieve consensus, as nonzero delegates find working with infinite iterators critical.

jaskp commented 6 months ago

@tabatkins

The answer was already stated by Jordan - JS recognizes the concept of iterators and uses them all over the place, but iterables are a much fuzzier concept that the language doesn't officially bless. It's just "a thing that can give you an iterator", and all iterators are iterables that return themselves. You appear to be operating under the assumption that an iterable is "a thing which can repeatedly generate distinct iterators that will go over the same data", and that's not the case here.

I'm not sure why we're discussing iterables, I've never used that word in my sentences, I've only used it in a code example to make it clear that someIterable.values() yields an iterator. I'm not suggesting we cater to iterables, I'm suggesting we drop special treatment for iterators and treat it as any other object -- use its [Symbol.customMatcher], which in case of iterators will probably just do something trivial.

No programming language with pattern matching caches iterators. What is the motivation?

Isn't it a bit odd that I've asked for a concrete motivating example four times and still no one was able to provide one?

jaskp commented 6 months ago

The caching behavior aids in handling iterables, but it also helps much more widely in any case where you'd otherwise be forced to extract a value into a temp variable: expensive computations, non-deterministic values, etc. It is expected that people will commonly write multiple match arms which go over the same data in slightly different ways (checking for a length-2 array, then a length-3 array; or checking for a particular key, then a different particular key), and match much of the same items between the arms. They'll expect that these tests work efficiently and predictably, returning the same match result every time and without re-rerunning expensive computations. They don't have the ability to run arbitrary temp-variable extractions in the middle of a match block, after all.

Maybe an example would help me understand and teach me (and possibly others reading) something new?

Jack-Works commented 6 months ago

Isn't it a bit odd that I've asked for a concrete motivating example four times and still no one was able to provide one?

For me, the main reason is to keep parity with const [a, b]

I did a quick research for Array destructing on our codebase about the usage of Array destructing.

// React
const [state, setState] = useState();

// swap vars
[x, y] = [y, x];

// Map#entries
for (const [value, key] of val.entries()) {}

// String#split
const [left, right] = input.split('/');

// String#matchAll
const [[, payload] = []] = input.matchAll(/some regexp here/giu);

// Tuple-style custom data structure
const [version, network, id, alg, key, data] = item;

// Accessing function parameters
const [a, b, c] = arguments;

Pattern matching is meaningful in those cases:

The second case is an iterator, but I cannot think of an example that will look better in pattern matching.

Note: You cannot do meaningful matching with Map.entries() or other similar methods, because the match is ordered and the Map/Set's value is not sorted (but by their insertion order).

I still hope we can keep parity with Array destructing (supports iterators), we can simplify the current caching behavior but still keep it:

// currently
match ({ a: iteratorOf(0, 1, 2) }) {
    when { a: [0] }: ...
    // call a.next 2 times, cache = [0, 1, ...?]

    when { a: [] }: ...
    // call a.next 0 times, get it from the cache

    when { a: [0, 1] }: ...
    // call a.next 1 times, cache = [0, 1, 2, ...?]

    when { a: [0, 1, 2] }: ...
    // call a.next 1 times, cache = [0, 1, 2, <done>]
}

We can change it to this to simplify:

// a possible simplification but still support iterators
match ({ a: iteratorOf(0, 1, 2) }) {
    when { a: [0] }: ...
    // run [...a] and cache it

    when { a: [] }: ...
    // match like an array
}

This drops the ability to match with infinite iterators.

jaskp commented 6 months ago

// String#matchAll const [[, payload] = []] = input.matchAll(/some regexp here/giu);

Thank you @Jack-Works! That's a really good example. However, this works because matchAll() returns something that implements the iterable protocol, an iterator itself isn't enough.

You can verify that this does not work:

let [x] = {next() { return {value: 1} }} // this fails
SupaYo commented 6 months ago

Yes

devsnek commented 6 months ago

just driving by here... it doesn't make sense to me to apply matching to any object where observing it also changes it. of course this is javascript and its trivial to create such objects, but i don't think we should be catering to such objects in the design here. I would even go as far as to say that in my opinion sequence matching should not use iterators at all, and should just use length/index, or an array-based matching protocol.

Jack-Works commented 6 months ago

// String#matchAll const [[, payload] = []] = input.matchAll(/some regexp here/giu);

Thank you @Jack-Works! That's a really good example. However, this works because matchAll() returns something that implements the iterable protocol, an iterator itself isn't enough.

You can verify that this does not work:

let [x] = {next() { return {value: 1} }} // this fails

But this work:

const iter = [1,3,5]
const [x] = iter // 1
const [y] = iter // 3

iter is an Iterator, it also has a Symbol.iterator property that returns itself.

jaskp commented 6 months ago

@Jack-Works Definition:

An object is an iterator when it implements a next() method with the following semantics: [...]

[1,3,5].next is undefined, therefore [1,3,5] is not an iterator.

jaskp commented 6 months ago

it also has a Symbol.iterator property that returns itself.

This is also false

const iter = [1,3,5]
iter[Symbol.iterator]() === iter // false
Jack-Works commented 6 months ago

Sorry I mean const iter = [1,3,5].values()

tabatkins commented 6 months ago

@jaskp You seem to be splitting hairs on iterable/iterator in a way that doesn't actually matter for the discussion. The iterable protocol is used to obtain an iterator, which is actually used. Is this part of the discussion germane to any points you wish to make?

jaskp commented 6 months ago

@tabatkins I consider this distinction to be quite important. Of course, if you wish to avoid the distinction, I will replace iterators with generator functions in my arguments (since it's the most common case of an iterator that isn't an iterable and is also used in the proposal to explain the caching).

Therefore my point is: Array destructuring doesn't support generators, so what's the motivation behind supporting matching generators on array patterns?

Jack-Works commented 6 months ago

Therefore my point is: Array destructuring doesn't support generators, so what's the motivation behind supporting matching generators on array patterns?

It supports.

function* f() { yield* [1,2,3,4,5] }
const [a, b, c] = f()
console.log(a,b,c) 
jaskp commented 6 months ago

Therefore my point is: Array destructuring doesn't support generators, so what's the motivation behind supporting matching generators on array patterns?

It supports.

function* f() { yield* [1,2,3,4,5] }
const [a, b, c] = f()
console.log(a,b,c) 

Ah, you're right. Then I guess I can't make any arguments without distinguishing iterators and iterables, sorry.

tabatkins commented 6 months ago

Right, so the point is, in almost every place the language might accept an array, it also accepts an iterable (and then obtains an iterator from it; the built-in iterators are all iterables that return themselves). It would be odd for match to not work similarly.

However, most of the other uses visit the iterator items once (or zero); match is somewhat unique in making it extra easy to address the items multiple times across separate match arms. If we don't cache, this means that while match might be able to accept iterators in theory, in practice they're unusable in many common situations; you'd be forced to write match([...myMap.values()]) {...} anyway. And that kinda sucks for the authoring experience.

Property caching for object patterns then follows from that, tho with slightly less motivation: if an expensive or non-deterministic iterator can depend on its items only being pulled once and having a stable value across match arms, then objects should arguably get the same guarantees for their properties/getters.