Perf/fast unions - Githubissues

jeswr commented 2 years ago

Supercedes #79

The concerns about behavior (and reasons for a major version bump) that I pointed out there still remain. But not sure if I have a good solution for those at this point.

Before For loop with 5x10^9 elems: 5.207s UnionIterator 2x10^7 iterators: 30.297s UnionIterator 1000x500 iterators: 1.897s UnionIterator 1000x500 iterators - max parallelism of 1: 1.837s

After For loop with 5x10^9 elems: 7.832s UnionIterator 2x10^7 iterators: 3.844s UnionIterator 1000x500 iterators: 276.211ms UnionIterator 1000x500 iterators - max parallelism of 1: 157.52ms

jacoscaz commented 2 years ago

Just came back from a break week with no laptop and catching up on things. When it comes to the behaviors described in https://github.com/RubenVerborgh/AsyncIterator/pull/79#discussion_r934927797 and https://github.com/RubenVerborgh/AsyncIterator/pull/79#discussion_r934928512, I don't think that those can and/or should be addressed in UnionIterator as the latter can not make any assumption on how sources are generated in the first place.

In the case of sources being lazily generated in an asynchronous manner, ensuring correct error propagation and disposal of sources would be best addressed in a dedicated sources iterator that is bound to be context-specific. The only thing that UnionIterator should do is to catch errors both at the single source level and at the iterator-of-sources level (which is already happening at https://github.com/RubenVerborgh/AsyncIterator/blob/6e7f093364bef006db6b06136aeb80eafce82d64/asynciterator.ts#L1668-L1673).

IMHO, when it comes to error propagation and source disposal this PR is already complete as it is.

jeswr commented 2 years ago

In the case of sources being lazily generated in an asynchronous manner, ensuring correct error propagation and disposal of sources would be best addressed in a dedicated sources iterator that is bound to be context-specific.

So then I guess the only remaining question for this PR is when an array of sources is passed into the UnionIterator whether we should immediately start listening for the error event by subscribing that listener in the constructor.

jacoscaz commented 2 years ago

So then I guess the only remaining question for this PR is when an array of sources is passed into the UnionIterator whether we should immediately start listening for the error event by subscribing that listener in the constructor.

I don't think so as that would lead to slightly different behaviors in the case of sources passed as an array vs. sources passed as an iterator. Wrapping the array into an iterator with wrap() and consuming it as if it had always been an iterator leads to more consistent behavior.

jeswr commented 2 years ago

Ok great - in that case I am happy with the status of this PR and consider it ready for review :)

jacoscaz commented 2 years ago

Actually, I realized you were likely making a point that I failed to grasp. If sources are passed as an array of iterators that is fed to wrap(), the resulting iterator will not relay errors from the sources themselves as there's nothing in wrap() that handles that specific case (rightly so as wrap() should not modify its behavior based on item type).

In this case I would either leave things as they are right now or create a different iterator class that is specific to the use case of abstracting an array of sources into an iterator. Then, the UnionIterator constructor could do something like:

if (Array.isArray(sources)) {
  this._sources = new SourceArrayIterator(sources);
} else { ... }

However, the implementation of SourceArrayIterator would be a non-trivial one:

1) should there be an option to simply discard errored sources? 2) should an errored source automatically lead to the destruction of other sources?

These questions seem very context-specific to me and I'm sure I could come up with plenty more. Perhaps it's worth thinking of SourceArrayIterator as an orthogonal feature to UnionIterator and better addressed in a different PR if needed at all.

jeswr commented 2 years ago

Perhaps it's worth thinking of SourceArrayIterator as an orthogonal feature to UnionIterator and better addressed in a different PR if needed at all.

So to conclude this PR is ready to be reviewed in its current state.

Bump @RubenVerborgh

RubenVerborgh commented 2 years ago

@jeswr Some questions about the circular list:

Shall we extract CircularList to a separate file like LinkedList?
Is a CircularList strictly needed? I.e., do the performance gains really come from not having an array for the iterators?

I do agree that, conceptually, it is indeed a circular list though.

Given that the number of iterators is assumed to be small (10–20ish?) and not often changed (?), I am curious if there is a big impact.

reasons for a major version bump

So this is a v4? (I followed the links, but no result for keyword major.)

jeswr commented 2 years ago

Shall we extract CircularList to a separate file like LinkedList?

That's probably sensible. When I was first writing I didn't have clarity on what APIs would need to be exposed; but I can now try and refactor that retrospectively.

Is a CircularList strictly needed? I.e., do the performance gains really come from not having an array for the iterators?

The reason I have introduced it is that we need to always be able to delete an element in O(1) time when the end event for that particular iterator is called; which we cannot do with an array or the current LinkedList as both require the traversal of O(n) elements (where n is index of the element in the array/linkedlist).

The other type of data structure that would allow for this type of deletion would be a Doubly LinkedList, which could become part of the current linkedlist implementation, at a memory cost to other iterators.

So this is a v4? (I followed the links, but no result for keyword major.)

Yes, since the erroring behavior will change in some cases this will need to be a v4.

RubenVerborgh commented 2 years ago

both require the traversal of O(n) elements

I agree; but it seems to be that it would be O(10) at most; unless we are looking at scenarios where the number of iterators is really large?

I guess I'm simultaneously asking: what are the main sources of performance improvement in this PR? Is it indeed this new list structure?

rubensworks commented 2 years ago

I agree; but it seems to be that it would be O(10) at most; unless we are looking at scenarios where the number of iterators is really large?

Comunica has some cases where the number of iterators in a union could become very large (property paths and link traversal). However, currently those cases do not make use of the UnionIterator yet, so perhaps not necessary to optimize for those.

RubenVerborgh commented 2 years ago

Okay; I'm happy to have a circular list.

(Although, side-note, what we actually need is a list with O(1) item removal. Which can perfectly be a LinkedList with lazy deletion: you keep a Set of items to be deleted, and as you iterate through elements, you remove nodes as needed. But we might be getting into premature optimization territory here 😅.)

(Actually, scratch that. There might be multiple occurrences of an item—at least for lists in general.)

(But do note that you don't need a doubly linked list. All we need is a Map from item to previous list node; which is currently implemented via Symbol. Also the current implementation breaks if an element is added multiple times to a list; which is not allowed for iterators, but it is for lists in general.)

(So my points still hold if we simply don't allow duplicate items.)

Maybe OrderedSet extends LinkedList, adding O(1) has and delete operations via a Set or Map? Then we also keep the LinkedList without the Map memory and perf overhead.

And there could be a .looped iterator that is infinite/circular!

jeswr commented 2 years ago

Maybe OrderedSet extends LinkedList, adding O(1) has and delete operations via a Set or Map?

I'll have a crack at this in the morning (jet lag is starting to hit me hard). To confirm - this would be preferred over a circular linked list implementation (provided this has similar performance to the current implementation, which it should)

RubenVerborgh commented 2 years ago

this would be preferred over a circular linked list implementation

Not necessarily; my remark was more that all we need is O(1) deletion.

Circular lists as such doesn't offer that; it's first an O(n) lookup, and only then deletion is O(1). Which is why you added the Symbol lookup to make it O(1).

So what I'm arguing is that the Symbol lookup trick is basically what we need (and that we can also implement it with a Map); and that we get O(1) deletion by making it point at the previous node rather than current node.

RubenVerborgh commented 2 years ago

Yes, since the erroring behavior will change in some cases this will need to be a v4.

Could you details these a bit?

Because we should line up all the changes we want for v4, and see which ones we can bundle.

jeswr commented 2 years ago

Could you details these a bit?

See https://github.com/RubenVerborgh/AsyncIterator/pull/79#discussion_r934929002. The reason that test broke is because we do not subscribe to the error event of iterators until after we start reading from them. So if the iterator errors before we first call .read() on it, then the error will not be forwarded.

Though, if we remove pretty well all autostarting behavior in #45 - then there really shouldn't be errors being emitted prior to the first .read() call anyway.

RubenVerborgh commented 2 years ago

we do not subscribe to the error event of iterators until after we start reading from them

Can't we change that behavior? Not just for backwards compatibility reasons, but it's a necessity for correct functioning.

The moment an iterator is handed over to a component, that component takes full responsibility of that iterator. That includes handling its errors, which would terminate the process when unhandled (default EventEmitter behavior).

So in this concrete case, when a source iterator is handed over to a union iterator, that union iterator should immediately attach error listeners.

jeswr commented 2 years ago

Can't we change that behavior?

Just to make this even more concrete; if I have

const source = range(1, Infinity).map(x => range(1, 10))

const unionRange = union(source)

Then source will have an error listener attached immediately, but each of the infinitely generated range(1, 10) iterators will not have an error listener attached until immediately before the first .read() call. In this case, and cases with a large number of generated iterators it does not make sense to try and retrieve them all from the source and attach error listeners before we start being able to iterate over unionRange.

This is still breaking relative to the last version as the fact that the UnionIterator could autoStart and buffer could mean that listeners are attached at an earlier point in time.

The one case in which it may make sense to attach error listeners to all source iterators in advance is if we are taking a union over an array of iterators rather than an iterator of iterators; i.e.

const unionRange = union([range(1, 10), range(1, 10)])

But in the same token; I'm not sure if it makes sense to change this behavior when taking the union over an iterator vs. the union over an array.

jacoscaz commented 2 years ago

The only way to attach error handlers to a union of lazily-generated sources would be to let upstream instantiate those sources and buffer all of them, even though we might need only a few such as in the case of destroying the union after a set number of items. That would likely have significant performance and resource utilization issues.

But in the same token; I'm not sure if it makes sense to change this behavior when taking the union over an iterator vs. the union over an array.

This is why I suggested using a separate class to handle that specific use case in https://github.com/RubenVerborgh/AsyncIterator/pull/81#issuecomment-1216802882 - to keep the UnionIterator class as consistent as possible.

jeswr commented 2 years ago

Closing as I have a better implementation in the upcoming v4 PR.

RubenVerborgh / AsyncIterator

Perf/fast unions #81