tc39 / proposal-collection-methods

https://tc39.github.io/proposal-collection-methods/
Other
172 stars 8 forks source link

Set.prototype.at #51

Open jsumners opened 1 year ago

jsumners commented 1 year ago
const aSet = new Set([1, 2, 3, 4])
console.log(aSet.at(1)) // 2
// Would be _really_ nice:
console.log(aSet[1]) // 2
bergus commented 1 year ago

Sets are not an integer-indexed data structure. While they have a deterministic iteration order, they are representing unordered collections (in the sense that you cannot access them by position, or reorder/sort them).

jsumners commented 1 year ago

The fact that they are iterated in insertion order is enough for my desire. mySet.at(0) is first item inserted, mySet.at(1) the second, and so on.

bergus commented 1 year ago

Then iterate your set. An .at(n) method would suggest sublinear time complexity to access the nth element, which is not achievable. You can use find from this proposal

 console.log(aSet.find((_, i) => i == 2))

or the iterator helpers

console.log(aSet.values().drop(2).next().value)

to log the third element. This should not be easy, the syntax should make it clear that the set is iterated.

bakkot commented 1 year ago

An .at(n) method would suggest sublinear time complexity to access the nth element, which is not achievable.

Eh, it's a little more complicated than that - the underlying data structure in fact gives you constant time random access unless you've deleted elements, though once you start deleting things it becomes much more complicated.

But since it's not always constant time, at probably isn't the best name for such a method, yes.

oleedd commented 2 months ago

Sets are not an integer-indexed data structure. While they have a deterministic iteration order, they are representing unordered collections (in the sense that you cannot access them by position, or reorder/sort them).

Not really. image So these Entries may be used.

bakkot commented 2 months ago

The devtools display in a given browser is not indicative of much of anything. Devtools are optimizing for displaying things to users, not reflecting underlying reality.

If you look at the actual implementation used in Chrome and Firefox it does not support constant time indexing after deleting things. Safari uses a different implementation (a hashmap plus a linked list) but it doesn't support constant time indexing either.

oleedd commented 2 months ago

Firefox too: image So it is not a custom thing.

ljharb commented 2 months ago

@oleedd it's still a custom thing even if every browser chooses to display it the same.

oleedd commented 2 months ago

Then this custom thing may be used. Although it seems to me that this is specified in some standard (maybe not about sets, but about "Entries"). This doesn't seem like just a coincidence.

bakkot commented 2 months ago

The custom thing involves reading through the entire set, or at least some subset of it, for display. Which is fine for rendering it but not fine for random-access indexing. If you don't want to take my word for it you can inspect the data structures yourself; these engines are all open-source, and I've already linked a good summary of the structure used in Chrome and FF. You are not going to convince anyone that this is possible by any means other than demonstrating that the data structures used to implement these in engines somehow do actually support random-access indexing, and pointing to the rendering in the devtools does not demonstrate this.

As to the similarity between Chrome and FF, devtools often copy each other, or converge to similar endpoints, because users appreciate the familiarity. It is not a coincidence that they have similar renderings, but neither is it standardized.

oleedd commented 2 months ago

To remember insertion orders, they have to use indexes inside actually. And these internal indexes are used for iteration.

bakkot commented 2 months ago

That is not true. In some implementations there are indexes involved, but after items have been deleted these indexes do not correspond to the indexes you'd get from iterating in userland, so they're not useful here. Simply asserting otherwise does not make it so. They are open source; you can see for yourself.

I don't think it's really useful for me to keep repeating myself, so I'm going to stop responding now.

oleedd commented 2 months ago

Those internal indexes may be these "Entries", which change after deleting as well. Saving those empty slots and skipping them for creating entries for iteration is not very smart. I can't check because I don't have experience with c++.

oleedd commented 2 months ago

I just gave a potential way to help. I didn't intent to give irrefutable proof. It is rather necessary to prove impossibility than possibility.

oleedd commented 2 months ago

If some engines remain empty slots, they can rebuid internal indexes after deleting, how it is going with splice(). Or to use a flag to mark that deleting was and rebuild at using at() if it was to reduce the number of rebuilds.

oleedd commented 2 months ago

Even if at() will work as long as has(), it is still worth.

zeel01 commented 1 month ago

There are many iterable data structures that aren't indexable, the most obvious and simplest being a linked-list, in which each item knows where the next item is, but the data structure itself only knows where the first item is. You can only access the nth item by iterating over n-1 previous items in order to find it. While Set is implemented in a more clever way than that, the same general principal applies: You can't always find the item at a specific location without first finding the item before it, and you may not be able to find that item without finding the one before it... and so on. In the worst case, you may have O(n) time.

You can of course already do this in O(n) by iterating through the Set until the nth element with a simple for loop. It's not compact or nice, but the advantage of it is that it's clear exactly how slow it is just by looking at it, while if .at(n) exists, it will hide a potentially poorly performing implementation behind a seemingly simple method.

As for devtools, I suspect it's basically just doing [...set] then displaying it nearly the same way it displays an array. This doesn't need to be fast, since it only needs to be able to do within an animation frame to appear instant when you hit enter on the console. But if you needed to index into a set many many times, you would start to feel how much slower it is. You could also of course do [...set][n] to implement .at(n) functionality in a relatively concise way. It's ugly, but would get the job done. That being said, the question remains: What is a concrete use case for this functionality that isn't better served by using an array in the first place?