Open jeswr opened 2 years ago
Indeed, chained buffering steps might explain the slowdowns I've seen. Normally I would expect performance to degrade linearly with the number of chained transforms, but I often see slowdowns that seem closer to exponential. I had been under the impression that methods such as `map()` and `filter()` behaved as you described but, given this issue, I was either plain wrong or possibly remembering a previous version. This would be a great addition to `asynciterator` itself, but we could do it separately if better for backward compatibility.

However, I do not have hard numbers at hand. @jeswr, have you done some profiling on this?
@jeswr Do you think the slowdown is caused by the complexity of the code in `readAndTransformSimple`, or by the fact that it fills an internal buffer? If the latter, just disabling the buffer on intermediary iterators might also improve overall performance.
@rubensworks I'd suspect you're correct in that the intermediary buffers are the main overhead.
I was thinking that one way around this is to have a 'chained transform' which applies multiple map/filter/transform operations in one go, so long as each part of that chain is synchronous - is this equivalent to what you mean by disabling internal buffers?
@jacoscaz - no, I have not done profiling; it was just an observation I made while digging through the code for a PR I made earlier
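To sketch the 'chained transform' idea in isolation (illustrative code, not asynciterator's API, and the helper name is hypothetical): a run of synchronous map steps can be fused into a single function, so each item pays one call chain instead of one iterator hop per step.

```typescript
// Hypothetical sketch: fusing a chain of synchronous map steps into a single
// function, so that N chained maps cost one nested function call per item
// instead of N iterator boundaries with N intermediate buffers.
type MapFn<T> = (item: T) => T;

function fuseMaps<T>(fns: MapFn<T>[]): MapFn<T> {
  // Apply the steps left-to-right, threading the item through the chain
  return (item: T) => fns.reduce((acc, fn) => fn(acc), item);
}

const fused = fuseMaps<number>([
  x => x + 1,
  x => x * 2,
  x => x - 3,
]);

console.log(fused(5)); // (5 + 1) * 2 - 3 = 9
```

Whether this pays off in practice depends on how much of the overhead comes from buffering versus the function calls themselves, which is what the profiling below tries to establish.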
> I was thinking that one way around this is to have a 'chained transform' which applies multiple map/filter/transform operations in one go

Would this not require knowing what those transformations look like ahead of time? Unless, perhaps, one were to have all intermediate actors return a transform function and have the final actor concatenate these.
However, if all intermediate steps were to be implemented as iterators without internal buffering as you suggested, provided an initial buffered iterator for I/O queries (i.e. reading from a file), the resulting performance should not be too far from that of a pure concatenation of functions, while allowing us to keep the current internal structure.
We should come up with a couple of examples that clearly demonstrate whether there's a real issue here or not. One possible way of doing this would be to manually implement a no-buffering map iterator and profile it against the current `map()`. I'd be up for that if nobody has time to do so; there's potential for a great return on investment.
> Would this not require knowing what those transformations look like ahead of time?
I was thinking that methods like `map`, `filter` and `transform` would add to the internal set of operations that need to be done and then return the same iterator. The major caveat is that this would affect upstream reads of the iterator, if there are any.
> I'd be up for that if nobody has time to do so
That sounds great, but I'd wait for #43 to be merged as it may introduce some small performance changes
> I was thinking that methods like map, filter and transform would add to the internal set of operations that need to be done and then return the same iterator.
I suspect that the concatenation mechanism might be a non-trivial one, and it would have to be instantiated even for single-op iterators unless we were to make it a lazy one, which would make it more complicated. Nonetheless, I think there's definitely something to be said for combining operations. For example, could Comunica benefit from a `mapFilter()` iterator?
In any case, we should probably look at these as orthogonal issues:
1) on one side we have buffering, and the possibility that disabling buffering in intermediate iterators might result in gains when dealing with long-ish chains;
2) on a different side we have combining N transformations into M < N iterators, and the possibility that doing so might reduce avg chain length.
Both definitely sound worth investigating IMHO.
> I'd wait for https://github.com/RubenVerborgh/AsyncIterator/pull/43 to be merged as it may introduce some small performance changes
Seems reasonable! I may start setting up a simple benchmark and then pause and wait for that PR to be merged.
Is the `SimpleTransformIterator` being used in performance-sensitive code? My goal for it was to be used for simple cases, and definitely not for chaining. I guess I should have been more explicit about that. For chaining, you want to define your own sync or async transform iterator.
Because of this, the use of `.map`, `.filter`, `.transform`, etc. all create a new `SimpleTransformIterator`, and hence there are many places in Comunica that create new `SimpleTransformIterator`s. For instance:

Mapping: (and that's only going up to page 3/10 when searching for `.map` in https://github.com/comunica/comunica/search?p=3&q=.map)

Filtering: (plus places that use `.transform` over `.filter`)

@rubensworks Given how commonly this is used, I think it is worth making the change upstream here in this package.
Absolutely, thanks for finding this. We probably want `MapIterator`, `FilterIterator`, … and to change `SimpleTransformIterator` such that, if only `map` or `filter` are requested, one of the dedicated iterators is instantiated instead.

@jeswr could you have a look at this gist of mine? https://gist.github.com/jacoscaz/e20812991714092dbb20cfd432d1bf00
Either I got something wrong or there's virtually no performance difference between `.map()` and a dedicated mapping iterator.
To summarize the results I have in the following comments:

- `ArrayIterator` causes about a 5x speed degradation compared to the `range` iterator on 20000 elements (because, as I raised here, the V8 implementation of `Array.shift` is still very suboptimal).
- Using the `range` iterator on 20000 elements and custom `MapIterator`/`FilterIterator`s, we see approx. an 8x improvement.
- Using my PR, which avoids scheduling a few unnecessary tasks (i.e. just emitting the `end` event straight away where possible), creates a further 10% improvement.
- Using a `CompositeTransform` that reduces over all of the maps and filters causes a further 10-15% performance improvement when compared to simply creating 50 new `Map` and `Filter` iterators.
- Pre-compiling the transformation pipeline on the `CompositeTransform` yields a further 50%-100% performance gain.

Combining all of these factors can result in a speed-up of 100x or greater when doing 50 `map`/`filter`s on 20000 elements, and similar results when we do smaller numbers of `map`/`filter`s (e.g. 5).

This gist includes some experiments and results that are discussed in the following comments.
@jacoscaz - The vast majority of your time is spent on the array iterator because of https://github.com/RubenVerborgh/AsyncIterator/issues/38; I believe Node does not implement optimizations similar to Firefox's. (@RubenVerborgh for reference; my machine took about 5x longer to run the implementation @jacoscaz had using the array iterator.)
I get the following results for the code I have attached below:

```
ArrayIterator#map(): 145
MappingIterator: 19
```

If you do 50 mappings, the results become:

```
ArrayIterator#map(): 1005
MappingIterator: 143
```

Now here is where it gets more interesting: if we use the array iterator for 50 mappings, we get:

```
ArrayIterator#map(): 5321
MappingIterator: 4698
```

So this means that with a combination of optimizing the `ArrayIterator` and using the `MappingIterator`, there is a 35-40x performance increase to be had for 50 chaining operations (and for 5 chaining operations the problems of the `ArrayIterator` become even more pronounced, with a 200x difference in results).
```typescript
import { AsyncIterator, range } from 'asynciterator';

class MappingIterator<I, O> extends AsyncIterator<O> {
  constructor(source: AsyncIterator<I>, map: (item: I) => O) {
    super();
    source.on('end', () => {
      this.close();
    });
    source.on('readable', () => {
      this.emit('readable');
    });
    let item: I | null;
    this.read = (): O | null => {
      return ((item = source.read()) !== null)
        ? map(item)
        : null;
    };
  }
}

const time = (createStream: (max: number) => AsyncIterator<number>): Promise<number> => {
  const max = 200000;
  const str = createStream(max);
  let count = 0;
  return new Promise((resolve, reject) => {
    const now = Date.now();
    str.on('data', () => {
      count += 1;
    })
    .on('end', () => {
      const then = Date.now();
      if (count !== max + 1) {
        reject(new Error('Bad count'));
        return;
      }
      resolve(then - now);
    });
  });
};

const main = async () => {
  const mapMethodTime = await time((max) => {
    let iterator: AsyncIterator<number> = range(0, max);
    for (let i = 0; i < 5; i++) {
      iterator = iterator.map(item => item);
    }
    return iterator;
  });
  const mappingIteratorTime = await time((max) => {
    let iterator: AsyncIterator<number> = range(0, max);
    for (let i = 0; i < 5; i++) {
      iterator = new MappingIterator(iterator, item => item);
    }
    return iterator;
  });
  console.log(`ArrayIterator#map(): ${mapMethodTime}`);
  console.log(`MappingIterator: ${mappingIteratorTime}`);
};

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```
Environment:

- npm v8.4.1
- node v17.4.0
- Run on a Dell XPS 15 with 32GB RAM
With the PR (https://github.com/RubenVerborgh/AsyncIterator/pull/45) I have made, I get the following for the `ArrayIterator` method:

```
ArrayIterator#map(): 5532
MappingIterator: 4770
```

and for the `range` method I get:

```
ArrayIterator#map(): 944
MappingIterator: 126
```
As a summary, using the `range` iterator and doing 50 transformations we get:

| | Current Version | New PR |
|---|---|---|
| `ArrayIterator#map()` | 1005 | 944 |
| `MappingIterator` | 143 | 126 |
To do this I used the following `MappingIterator`:
```typescript
import { AsyncIterator, CLOSED } from 'asynciterator';

class MappingIterator<I, O> extends AsyncIterator<O> {
  constructor(source: AsyncIterator<I>, map: (item: I) => O) {
    super();
    source.on('end', () => {
      this.close();
    });
    source.on('readable', () => {
      this.emit('readable');
    });
    this.read = (): O | null => {
      const item = source.read();
      if (item === null) {
        // @ts-ignore -- _state and _end() are internal to AsyncIterator
        if (this._state === CLOSED)
          this._end();
        return null;
      }
      return map(item);
    };
  }
}
```
I've also implemented a `FilterIterator` against my PR (below); for 50 iterations on `range`, the results are:

```
ArrayIterator#filter(): 72
FilterIterator: 13
```
```typescript
import { AsyncIterator, CLOSED, ENDED } from 'asynciterator';

class FilterIterator<I> extends AsyncIterator<I> {
  constructor(source: AsyncIterator<I>, filter: (item: I) => boolean) {
    super();
    source.on('readable', () => {
      this.emit('readable');
    });
    this.read = (): I | null => {
      let item: I | null;
      while ((item = source.read()) !== null) {
        if (filter(item))
          return item;
      }
      // @ts-ignore -- _state is internal to AsyncIterator
      if (source._state === ENDED || source._state === CLOSED)
        this._end();
      return null;
    };
  }
}
```
Sorry to keep spamming this thread - I've also implemented a `CompositeIterator`, and it also shows non-trivial performance gains relative to just repeatedly applying the `MapIterator` and `FilterIterator` (~10-15% improvement):

```
ArrayIterator#map(): 894
MappingIterator: 105
CompositeIterator: 90
```
```typescript
import { CLOSED, ENDED } from 'asynciterator';
import { AsyncIterator, ArrayIterator, range } from './asynciterator';

type Transform = {
  type: 'filter';
  function: (elem: any) => boolean;
} | {
  type: 'map';
  function: (elem: any) => any;
};

class CompositeMapFilter<T> extends AsyncIterator<T> {
  private transforms: Transform[] = [];

  constructor(private source: AsyncIterator<T>) {
    super();
    source.on('readable', () => {
      this.emit('readable');
    });
  }

  read(): T | null {
    let item = this.source.read();
    // Run each item through the whole pipeline; whenever a filter rejects
    // an item, pull the next one from the source and restart the pipeline.
    outer: while (item !== null) {
      for (const transform of this.transforms) {
        if (transform.type === 'map') {
          item = transform.function(item);
        }
        else if (!transform.function(item)) {
          item = this.source.read();
          continue outer;
        }
      }
      break;
    }
    // @ts-ignore -- _state and _end() are internal to AsyncIterator
    if (item === null && this.source._state === ENDED) {
      this._end();
    }
    return item;
  }

  // @ts-ignore
  filter(filter: (item: T) => boolean, self?: any): CompositeMapFilter<T> {
    this.transforms.push({ type: 'filter', function: filter });
    return this;
  }

  // @ts-ignore
  map(map: (item: T) => T): CompositeMapFilter<T> {
    this.transforms.push({ type: 'map', function: map });
    return this;
  }
}

class MappingIterator<I, O> extends AsyncIterator<O> {
  constructor(source: AsyncIterator<I>, map: (item: I) => O) {
    super();
    source.on('readable', () => {
      this.emit('readable');
    });
    this.read = (): O | null => {
      const item = source.read();
      if (item === null) {
        // @ts-ignore -- _state and _end() are internal to AsyncIterator
        if (source._state === ENDED || source._state === CLOSED)
          this._end();
        return null;
      }
      return map(item);
    };
  }
}

class FilterIterator<I> extends AsyncIterator<I> {
  constructor(source: AsyncIterator<I>, filter: (item: I) => boolean) {
    super();
    source.on('readable', () => {
      this.emit('readable');
    });
    this.read = (): I | null => {
      let item: I | null;
      while ((item = source.read()) !== null) {
        if (filter(item))
          return item;
      }
      // @ts-ignore -- _state is internal to AsyncIterator
      if (source._state === ENDED || source._state === CLOSED)
        this._end();
      return null;
    };
  }
}

const generateArr = (): number[] => {
  let i = 0;
  return new Array(200000)
    .fill(true)
    .map(() => i++);
};

const time = (createStream: (arr: number[]) => AsyncIterator<number> | CompositeMapFilter<number>): Promise<number> => {
  const arr = generateArr();
  const str = createStream(arr);
  let count = 0;
  return new Promise((resolve, reject) => {
    const now = Date.now();
    str.on('data', () => {
      count += 1;
    })
    .on('end', () => {
      const then = Date.now();
      if (count !== arr.length / 2) {
        reject(new Error('Bad count'));
        return;
      }
      resolve(then - now);
    });
  });
};

const main = async () => {
  const mapMethodTime = await time((arr) => {
    let iterator: AsyncIterator<number> = range(0, arr.length - 1);
    for (let i = 0; i < 50; i++) {
      iterator = iterator.filter(item => item % 2 === 0);
      iterator = iterator.map(item => item);
    }
    return iterator;
  });
  const mappingIteratorTime = await time((arr) => {
    let iterator: AsyncIterator<number> = range(0, arr.length - 1);
    for (let i = 0; i < 50; i++) {
      iterator = new FilterIterator(iterator, item => item % 2 === 0);
      iterator = new MappingIterator(iterator, item => item);
    }
    return iterator;
  });
  const compIteratorTime = await time((arr) => {
    let iterator: CompositeMapFilter<number> = new CompositeMapFilter(range(0, arr.length - 1));
    for (let i = 0; i < 50; i++) {
      iterator = iterator.filter(item => item % 2 === 0);
      iterator = iterator.map(item => item);
    }
    return iterator;
  });
  console.log(`ArrayIterator#map(): ${mapMethodTime}`);
  console.log(`MappingIterator: ${mappingIteratorTime}`);
  console.log(`CompositeIterator: ${compIteratorTime}`);
};

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```
You can see a further 50-100% performance improvement by 'pre-compiling' the `CompositeTransform`; see https://gist.github.com/jeswr/f49a6975f1bfc088a9c32565e69761bf
> Sorry to keep spamming this thread

@jeswr if all spam were like this we'd all be much happier about spam. Seriously, awesome work.
> To summarize the results I have in the following comments
>
> - `ArrayIterator` causes about a 5x speed degradation compared to the `range` iterator on 20000 elements (because as I raised here the V8 implementation of `Array.shift` is still very suboptimal).

Although I was not expecting `shift()` to be optimized to the extent implied by @RubenVerborgh, I was also not expecting it to be slow to the point of eclipsing other differences in my tests. Unexpected and very good to know.
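As an illustration of why `shift()` dominates (a hedged sketch with hypothetical names, not the gist's actual code): `Array#shift` moves every remaining element on each call, so a read loop built on it degrades quadratically, whereas reading through an advancing cursor touches each element once.

```typescript
// Hypothetical sketch: reading from an array via an advancing index instead
// of Array#shift(), which must move the remaining elements on every call.
class FastArraySource<T> {
  private index = 0;

  constructor(private items: T[]) {}

  read(): T | null {
    if (this.index >= this.items.length) {
      this.items = []; // release the backing buffer once exhausted
      return null;
    }
    return this.items[this.index++];
  }
}

const src = new FastArraySource([10, 20, 30]);
console.log(src.read(), src.read(), src.read(), src.read()); // 10 20 30 null
```

The trade-off is that the backing array stays referenced until it is fully consumed, hence the explicit release at the end.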
The latest revision of my gist at https://gist.github.com/jacoscaz/e20812991714092dbb20cfd432d1bf00/e6cbf3caab930e88edb62f34562619603572c42b has a `RangeIterator` that appears to run 500x faster than `ArrayIterator`, which is a difference so enormous that I keep looking for mistakes on my end.
> - Using the `range` iterator on 20000 elements and using custom `MapIterator`/`FilterIterator`s we see approx. an 8x improvement.

Can confirm. When going from calling `.map()` on the `RangeIterator` to passing the `RangeIterator` as the `source` of `MappingIterator`, I, too, see an improvement of 7x - 9x.
> - Using my PR which avoids scheduling a few unnecessary tasks (i.e. just emitting the `end` event straight away where possible) creates a further 10% improvement.

Haven't tested this yet but I have no reason to object.
> - Using a `CompositeTransform` that reduces over all of the maps and filters causes a further 10-15% performance improvement when compared to simply creating 50 new `Map` and `Filter` iterators
> - Pre-compile the transformation pipeline on the `CompositeTransform` and observe a 50%-100% performance gain

I can see the benefits of doing this but I'm wary of the fact that it breaks iterator immutability, which can be a source of bugs. I would still try to reap as much as possible by offering a `MapFilteringIterator<I, O>` whose transform function is allowed to output `null`, indicating that the `I` item should be discarded.
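A rough sketch of what such a `MapFilteringIterator` could look like (my interpretation of the idea, not an implementation from the library; a minimal pull-based `Source` interface stands in for asynciterator's `AsyncIterator` here):

```typescript
// Sketch of the proposed MapFilteringIterator: one transform callback that
// either maps an item or returns null to discard it, merging map and filter
// into a single read step without mutating any iterator.
interface Source<T> {
  read(): T | null;
}

class MapFilteringIterator<I, O> implements Source<O> {
  constructor(
    private source: Source<I>,
    private transform: (item: I) => O | null, // null => discard the item
  ) {}

  read(): O | null {
    let item: I | null;
    while ((item = this.source.read()) !== null) {
      const mapped = this.transform(item);
      if (mapped !== null)
        return mapped;
      // a null result means "filtered out": pull the next item
    }
    return null; // source exhausted
  }
}

// Usage: keep even numbers, doubled
const nums = [1, 2, 3, 4];
let i = 0;
const src: Source<number> = { read: () => (i < nums.length ? nums[i++] : null) };
const it = new MapFilteringIterator<number, number>(src, n => (n % 2 === 0 ? n * 2 : null));

const out: number[] = [];
let v: number | null;
while ((v = it.read()) !== null) out.push(v);
console.log(out); // [4, 8]
```

One caveat of the null-means-discard convention is that the transform can no longer emit `null` as a legitimate value, which matches asynciterator's use of `null` as the end-of-stream sentinel.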
> Combining all of these factors can result in a speed-up of 100x or greater when doing 50 `map`/`filter`s on 20000 elements; and similar results when we do smaller numbers of `map`/`filter`s (e.g. 5).

Indeed, performance-wise this should turn out to be an extremely productive and effective investment of time.
> `RangeIterator` that appears to run 500x faster than `ArrayIterator`

Impressive - and I can't see any problems with it; though I wonder why it also seems to be a much greater improvement than using `range` (i.e. `IntegerIterator`).
> I'm wary of the fact that it breaks iterator immutability, which can be a source of bugs

Makes sense - I think there should be ways to give the feel of immutability to the API (or at the very least throw meaningful errors for devs when they misuse a mutable AsyncIterator) while using mutations like this under the hood for performance.
This is probably more appropriate in a separate package, but one option I was thinking of is as follows (sketch only; since JS cannot reassign `this`, the consumed iterator's own methods are overwritten to throw rather than swapping in a `PlaceholderIterator` instance):

```typescript
// Any use of a consumed iterator should fail loudly with a meaningful error.
const consumed = () => {
  throw new Error('You are trying to use methods on an iterator which has been consumed by another transform operation');
};

class ComposeMapFilterIterator<T> extends AsyncIterator<T> {
  private transformers: ((item: T) => T)[] = [];

  map(fn: (item: T) => T): ComposeMapFilterIterator<T> {
    // Build the successor, carrying over the accumulated transformers
    const next = new ComposeMapFilterIterator<T>();
    next.transformers = [...this.transformers, fn];
    // Invalidate this iterator: any further use throws immediately
    this.map = consumed;
    this.read = consumed;
    // ...
    return next;
  }
}
```
So this means that any potential usage that would cause bugs, such as using `.map` or `.read` on upstream iterators, immediately gets identified.

Furthermore, the `build` method that I had in the original implementation of this iterator could just be called on the first `read`.

In fact, I'm inclined to argue that such an implementation may lead to fewer bugs in some cases where devs may be trying to read data from an iterator that they shouldn't be.
@jeswr as I was investigating this, I came up with the same idea as https://github.com/RubenVerborgh/AsyncIterator/issues/38#issuecomment-1077584185 and had a go at it in https://github.com/jacoscaz/AsyncIterator/blob/42311b68988a971fc072c896100d7082b4162470/asynciterator.ts#L744-L783 . I'll make a PR later, but I just realized we've put quite a lot of things on the table and we should probably make a plan of action, including what should go into `asynciterator` and what should go into Comunica. That I can think of:

- `ArrayIterator` can be made incredibly faster by something like https://gist.github.com/jacoscaz/e20812991714092dbb20cfd432d1bf00/e6cbf3caab930e88edb62f34562619603572c42b#file-asynciteratortest-ts-L22-L38 . I guess this should be done in `asynciterator`.
- `MappingIterator` and `FilteringIterator` would significantly optimize performance whenever `.map()` and `.filter()` are used. However, nothing prevents us from adding those two custom iterators into Comunica as opposed to modifying `asynciterator`. Where should these go? I would say `asynciterator`.
- Switching `BufferedIterator` to a linked list (as opposed to the current array-based buffer) as per the links above. Where should this go? I would say `asynciterator`.
- Faster `wrap`: https://github.com/comunica/comunica/issues/965 . Where should this go? Perhaps in Comunica, as we might want to tailor it to RDF/JS `Stream` instances.
- Emitting `end` events earlier: https://github.com/RubenVerborgh/AsyncIterator/pull/45 . PR already open; this definitely belongs in `asynciterator`.

Tagging @rubensworks and @RubenVerborgh to keep everyone in the loop.
> ArrayIterator can be made incredibly faster

Indeed - but it's also probably worth tracking https://bugs.chromium.org/p/v8/issues/detail?id=12730 in case the V8 team solves this upstream for us. You would also want to delete the source buffer on the end event in your implementation.

> I'm seeing ~10% increase in performance by switching `BufferedIterator`

Nice!

> 50x - 100x increase in performance by compositing transformations

Standalone it is only approx. a ~2x improvement - IIRC the 100x figure came from compounding most of the factors above.

> but can see the point in having it upstream too. Where should this go?

I'd say this should be a new library of its own, e.g. 'asynciterator-compose.js'
> I'd say this should be a new library of its own e.g. 'asynciterator-compose.js'

Makes sense!
In theory, this should be mitigated by #46, although my testing has shown a significantly smaller performance gain than I had expected. @jeswr do you have time to do some testing of that branch on your use cases?

Still, doing away with buffering entirely would be a good idea, given the gains seen in #48 . I think we could port the source-wrapping part of `TransformIterator` into a dedicated `WrappingIterator` to keep this within `asynciterator`. How does that sound, @RubenVerborgh @jeswr ?
> @jeswr do you have time to do some testing of that branch on your use cases?

I've not done any benchmarking for `wrap` - I imagine a degenerate case would be wrapping something like the `range` iterator with a large number of iterations. Such a case, however, is a prime example of where doing away with buffering entirely would be better.
> I think we could port the source wrapping part of `TransformIterator` into a dedicated `WrappingIterator` to keep this within `asynciterator`

Sounds good to me
> I'd say this should be a new library of its own e.g. 'asynciterator-compose.js'

Happy to have it here, really. Could be a separate file though (and maybe we could also split the code into more separate files in general).
As discussed with @jacoscaz, the proposed timeline for resolving this issue is:

(1) Get full coverage on the current state of #57, perhaps minus the `SyncUnionIterator` for now, and then get the contents of that PR merged into main (likely broken across a few PRs).

(2) Merge #45, because this resolves a lot of issues I had for the `SyncUnionIterator`, as it allows the iterator to enter a done state in a timely manner (i.e. during the first read that produces a null value, rather than asynchronously afterwards). The reason I would do this after getting in the majority of performance improvements is that it does include breaking changes.

(3) Address `SyncUnionIterator`

(4) Address cloning

Pinging @RubenVerborgh
In addition to the above plan shared by @jeswr, if nobody else needs these changes to live within a single branch for testing purposes, I would suggest breaking the work done across the following PRs, to be refined and merged in order:

1. Faster synchronous transforms, which would include `SynchronousTransformIterator` and its subclasses `MappingIterator`, `FilteringIterator`, `SkippingIterator`, `LimitingIterator`, `MultiMapFilterTransformIterator`, `MultiMapFilterIterator` and `MaybeMappingIterator` from #57. IMHO, these should be evaluated together as they share a common ancestor, itself a new addition to the codebase, and they point to one another via the various `map()`, `filter()` and other commodity methods. I would caution against breaking this further into separate PRs.
2. Faster wrapping, which would include work done on `wrap()` and the new `WrappingIterator` from #57.
3. End on read, already working in #45
4. Faster (synchronous) union, already working in #54
5. Faster cloning, already started in https://github.com/jacoscaz/AsyncIterator/pull/1 (though we might change approach)

When taken all together, these changes will make for dramatic performance gains. I have one use case, which I unfortunately cannot share ATM, where things become 1200x faster simply by switching to 1 + 2 + 4.
Thanks for all the work and please proceed, with the following caveats:

- I'm not sold yet on `SynchronousTransformIterator`; I'd need to see a clean version of the code, but things like `maybeMap` sound like over-optimizations

Thanks again, @jeswr and @jacoscaz!
> I'm not sold yet on SynchronousTransformIterator; I'd need to see a clean version of the code, but things like maybeMap sound like over-optimizations

@RubenVerborgh is it the overall approach that does not convince you, or is it the fact that we might have pushed it too far with things like `maybeMap`? Asking just so I can better direct my efforts. The `SynchronousTransformIterator` class per se is just a way to centralize a bunch of code shared by `SkippingIterator`, `MappingIterator` and so on. Totally understand if you need a cleaner version of the code to answer this question.
> is it the overall approach that does not convince you or is it the fact that we might have pushed it too far with things like `maybeMap`?

At the moment, I can't see the forest for the trees; the code needs a bit of love still. And some bits might take it too far indeed.
A few comments to keep everyone in the loop and for posterity...

While working on 5. Faster cloning, I've realized that my assumptions about the relative performance of `Array#push()` and `LinkedList#push()` were wrong. Even when dealing with millions of elements, `Array#push()` consistently outperforms `LinkedList#push()`. Furthermore, `ClonedIterator#read()` already bypasses `BufferedIterator`'s internal buffering, making cloning already far more optimized than I had thought.
@jeswr has also been working on a non-breaking alternative to #54 in #65 for faster unions, which also optimizes `prepend()` and `append()` (#67).
All this said, I think in order to close this issue and reap the benefits in Comunica we're looking at the following:
Also tagging @rubensworks .
> Even when dealing with millions of elements, `Array#push()` consistently outperforms `LinkedList#push()`

That is expected; but what we need to know is the behavior for repeated pushes and pulls, cfr. https://github.com/RubenVerborgh/AsyncIterator/issues/60#issuecomment-1088528533
> That is expected; but what we need to know is the behavior for repeated pushes and pulls, cfr. https://github.com/RubenVerborgh/AsyncIterator/issues/60#issuecomment-1088528533

As far as my understanding goes, and limited to cloning, nothing is ever pulled from the backing buffer, as the entire history starting from when the first clone is created (EDIT) is maintained for future cloned iterators to go through; we only care about pushes done at https://github.com/RubenVerborgh/AsyncIterator/blob/fec14f4d19e3d594bc385b927863ebf80a792580/asynciterator.ts#L1845
Nonetheless, I've done some work on a `LinkedList`-backed `HistoryReader` here but saw no performance gains whatsoever, which eventually led me to question my assumptions on `#push()`.
> `LinkedList`-backed `HistoryReader` here but saw no performance gains whatsoever

But that's because no data is ever pulled from it.
> But that's because no data is ever pulled from it.

Yes, precisely! What I meant in https://github.com/RubenVerborgh/AsyncIterator/issues/44#issuecomment-1103902591 is that, given that `HistoryReader` only pushes to its internal buffer while never pulling from it (by design), and given the fact that `Array#push()` outperforms `LinkedList#push()` even with millions of items, there's no more reason for me to work on a `LinkedList`-backed refactoring of `HistoryReader`, hence my dropping it from the updated list of PRs. Apologies for the lack of clarity.
Update for posterity and to keep track of the general progress on this. Tagging @rubensworks @jeswr @RubenVerborgh .

Open PRs that we're working on, to be reviewed / merged / rebased in order:

- `SynchronousTransformIterator` class

We're halfway through! Faster unions are next. @RubenVerborgh @jeswr @rubensworks do you think we can get these merged by the end of August?
@jacoscaz 4 might be a significant amount of work to rebase and check for correctness, so I wouldn't bet on that by the end of August. The other two should be much less work but will depend on everyone's schedules.
My plan is to wrap up AsyncIterator v3.x soonish; the moment we start tackling #45, we're in the v4.x territory. But I guess a v4.x alpha could be possible.
@RubenVerborgh @jeswr perhaps it would make sense to do faster unions and the synchronous transform iterator first, wrapping up 3.x, and leave end on read last. Would that work for you guys?
I agree with the strategy - will try and wrap up work on the faster unions today
Update for posterity and to keep track of the general progress. Tagging @rubensworks @jeswr @RubenVerborgh .

Open PRs that we're working on, to be reviewed / merged / rebased in order:

- `SynchronousTransformIterator` class

EDITED 2022-08-26: faster unions are now breaking changes, moved to 4.x
I kinda lost track of where things are. Is the v4 PR still being worked on?
> I kinda lost track of where things are. Is the v4 PR still being worked on?

It somewhat fell off my radar for a bit; I do have most of the code for a new version locally, but there is still work to be done on the tests.
> For basic methods like `.map` and `.filter` on `AsyncIterator`, using the `SimpleTransformIterator` introduces unnecessary overhead via the `readAndTransformSimple` operator. To improve performance in applications like Comunica it may be worth creating custom iterators with simpler read methods; for instance a synchronous `MapIterator` could skip buffering entirely and just be implemented as something like […]

@jacoscaz I suspect this is the cause of the potential slowdown you were encountering when doing a large amount of chaining.
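A buffer-less synchronous map of the kind described above can be sketched as follows (illustrative, with a minimal `Source` interface standing in for `AsyncIterator`): `read()` simply pulls from the source and maps, with no intermediate buffer.

```typescript
// Illustrative buffer-less synchronous map: each read pulls one item from
// the source and transforms it in place, so chaining adds no buffers.
interface Source<T> {
  read(): T | null;
}

class SyncMapIterator<I, O> implements Source<O> {
  constructor(private source: Source<I>, private map: (item: I) => O) {}

  read(): O | null {
    const item = this.source.read();
    return item === null ? null : this.map(item);
  }
}

const data = [1, 2, 3];
let i = 0;
const src: Source<number> = { read: () => (i < data.length ? data[i++] : null) };
const doubled = new SyncMapIterator(src, n => n * 2);
console.log(doubled.read(), doubled.read(), doubled.read(), doubled.read()); // 2 4 6 null
```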