UnionIterator skips not readable subiterators

RubenVerborgh / AsyncIterator

An asynchronous iterator library for advanced object pipelines in JavaScript

https://rubenverborgh.github.io/AsyncIterator/docs/

Other

48 stars 7 forks source link

UnionIterator skips not readable subiterators #106

Open maartyman opened 5 months ago

maartyman commented 5 months ago

The UnionIterator is used in the bind join in comunica (and incremunica). During some performance tests, I saw that the UnionIterator called read() on all its sub-iterators, even if only one was readable. This PR fixes this by adding a check before calling read() on the sub-iterators. The sub-iterators need to be read to start them, that is why I added the read attribute to the _sources. To be fair, I think there is a better way to check if the sources have started and if not call read() on them.

maartyman commented 5 months ago

I'm sorry, in my hastiness I forgot the skip check in the _read() function.

maartyman commented 5 months ago

Also, why is does the UnionIterator extend the BufferedIterator?

RubenVerborgh commented 5 months ago

why does the UnionIterator extend the BufferedIterator?

Short answer: so readers have to wait less (it fills up in the background).

Longer answer: buffering should've been mix-in functionality rather than inherited functionality.

maartyman commented 5 months ago

Okay, I'm asking because I see a performance increase if I use a non buffered UnionIterator. But this is just in my tests with querying local stores with incremunica, when using online sources this might be different.

RubenVerborgh commented 5 months ago

We can give it a go if the performance effect extends to other cases. An easy fix could be to set the buffer size to zero. Perhaps there's already sufficient intermediary buffering happening in your case.

jeswr commented 5 months ago

There is a closed PR https://github.com/RubenVerborgh/AsyncIterator/pull/81 which does it and has significant improvements - it was meant to be bundled into a bugger update that I never got around to.