Open maartyman opened 5 months ago
I'm sorry, in my hastiness I forgot the skip check in the _read() function.
Also, why does the UnionIterator extend the BufferedIterator?
why does the UnionIterator extend the BufferedIterator?
Short answer: so readers have to wait less (it fills up in the background).
Longer answer: buffering should've been mix-in functionality rather than inherited functionality.
Okay, I'm asking because I see a performance increase when I use a non-buffered UnionIterator. But that is only in my tests querying local stores with incremunica; with online sources it might be different.
We can give it a go if the performance effect extends to other cases. An easy fix could be to set the buffer size to zero. Perhaps there's already sufficient intermediary buffering happening in your case.
There is a closed PR, https://github.com/RubenVerborgh/AsyncIterator/pull/81, that does this and shows significant improvements. It was meant to be bundled into a bigger update that I never got around to.
The UnionIterator is used in the bind join in comunica (and incremunica). During some performance tests, I saw that the UnionIterator called read() on all its sub-iterators, even if only one was readable. This PR fixes this by adding a check before calling read() on the sub-iterators. The sub-iterators need to be read to start them, that is why I added the read attribute to the _sources. To be fair, I think there is a better way to check if the sources have started and if not call read() on them.
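To illustrate the idea, here is a minimal, self-contained sketch of the readable check. It is not the library's code or the actual diff in this PR: `Source`, `CountingSource`, and `readFromUnion` are hypothetical names made up for the example, and the "start the source with a first read" case mentioned above is deliberately left out.

```typescript
// Hypothetical stand-in for a sub-iterator; NOT the AsyncIterator API.
interface Source<T> {
  readable: boolean;
  read(): T | null;
}

// Test double that counts how often read() is called on it.
class CountingSource<T> implements Source<T> {
  readable: boolean;
  readCalls = 0;
  private items: T[];

  constructor(items: T[], readable: boolean) {
    this.items = items.slice();
    this.readable = readable;
  }

  read(): T | null {
    this.readCalls++;
    const item = this.items.shift();
    // Once drained, report that nothing more can be read.
    if (this.items.length === 0) this.readable = false;
    return item === undefined ? null : item;
  }
}

// Pull up to `count` items round-robin, skipping sources that are not readable.
function readFromUnion<T>(sources: Source<T>[], count: number): T[] {
  const result: T[] = [];
  let progressed = true;
  while (result.length < count && progressed) {
    progressed = false;
    for (const source of sources) {
      if (result.length >= count) break;
      // The check under discussion: avoid read() calls that cannot yield an item.
      if (!source.readable) continue;
      const item = source.read();
      if (item !== null) {
        result.push(item);
        progressed = true;
      }
    }
  }
  return result;
}

// Usage: one idle source and one source with data.
const idle = new CountingSource<number>([], false);
const busy = new CountingSource<number>([1, 2, 3], true);
const items = readFromUnion<number>([idle, busy], 3);
```

In this toy setup the idle source is never read, while without the `readable` check it would receive one read() call per round.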