RubenVerborgh / AsyncIterator

An asynchronous iterator library for advanced object pipelines in JavaScript
https://rubenverborgh.github.io/AsyncIterator/docs/

feat: faster wrapping #63

Closed jacoscaz closed 2 years ago

jacoscaz commented 2 years ago

This PR covers step 2 of the plan described in https://github.com/RubenVerborgh/AsyncIterator/issues/44#issuecomment-1086950726 and aims to 1) lower the overhead of the wrap() function and 2) expand support to additional types of sources (namely Iterator and Iterable objects).

The rationale for this PR is to take the onus of figuring out how best to wrap common types of data sources away from users of this library. Users should be able to expect wrap() to use whichever strategy is most appropriate for any given source, as long as that source is supported.
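
To make the intent concrete, here's a rough sketch of the kind of type-based dispatch I have in mind; the helper names are placeholders, not actual functions from this PR:

// Rough sketch of the dispatch idea only; helper names are placeholders,
// not the actual functions introduced by this PR.
function wrapSketch(source, opts = {}) {
  if (source instanceof AsyncIterator && opts.letIteratorThrough)
    return source;                            // already an AsyncIterator: let it through untouched
  if (source && typeof source.then === 'function')
    return wrapPromisedSource(source, opts);  // Promise<source>: wrap now, start reading once it resolves
  if (source && typeof source[Symbol.iterator] === 'function')
    return wrapIterableSource(source);        // Iterable (arrays, generators, ...): read synchronously
  if (source && typeof source.next === 'function')
    return wrapIteratorSource(source);        // plain Iterator: pull items synchronously on demand
  return wrapReadableSource(source);          // fall back to stream/EventEmitter-style wrapping
}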

Performance-wise, this PR does two things:

1) It introduces source-specific iterators that synchronously read from their sources, skipping the internal buffering done by TransformIterator (i.e. the class currently used for wrapping); see the sketch after this list

2) It introduces a dedicated option that allows AsyncIterator instances to fall through as they are, with no wrapping applied
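
To illustrate point 1, here's a minimal sketch of a wrapper whose read() pulls directly and synchronously from its source; the actual class in this PR differs in naming and details:

// Minimal sketch of a wrapper that reads synchronously from an Iterator;
// the real implementation in this PR differs in naming and details.
import { AsyncIterator } from 'asynciterator';

class SyncIteratorWrapperSketch extends AsyncIterator {
  constructor(iterator) {
    super();
    this._iterator = iterator;
    this.readable = true;
  }
  read() {
    // Pull the next item straight from the wrapped Iterator:
    // no TransformIterator, no internal buffering step.
    const { done, value } = this._iterator.next();
    if (done) {
      this.close();
      return null;
    }
    return value;
  }
}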

The following test, admittedly naive, takes ~116 ms with the current main branch and ~8 ms with this PR (letIteratorThrough makes virtually no difference). This mostly highlights the difference made by skipping TransformIterator's internal buffering.

import { ArrayIterator, wrap } from 'asynciterator';

let i = 0;
const arr = new Array(200_000).fill(true).map(() => i++);
const source = new ArrayIterator(arr);
const opts = { letIteratorThrough: false };
const then = Date.now();
wrap(source, opts).on('data', () => {}).on('end', () => {
  console.log('elapsed', Date.now() - then, 'ms');
});

Things become much more interesting when we consider what happens with a chain established by wrap()ing:

let i = 0;
const arr = new Array(200_000).fill(true).map(() => i++);
const source = new ArrayIterator(arr);
const opts = { letIteratorThrough: false };
const then = Date.now();
wrap(wrap(wrap(wrap(source, opts), opts), opts), opts).on('data', () => {}).on('end', () => {
  console.log('elapsed', Date.now() - then, 'ms');
});

The above test takes ~250 ms with the current main branch, ~15 ms with this PR and letIteratorThrough: false, and ~8 ms with this PR and letIteratorThrough: true. I haven't seen consistent differences with or without warmups.
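
In other words, with letIteratorThrough enabled, wrapping an existing AsyncIterator is expected to be a no-op that hands back the same instance, so the chain collapses to a single iterator:

// Illustrative only: with letIteratorThrough, wrapping an existing
// AsyncIterator is expected to return the very same instance.
const inner = new ArrayIterator([1, 2, 3]);
const outer = wrap(inner, { letIteratorThrough: true });
console.log(outer === inner); // true: no extra wrapping layer is added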

My use case is the optimization of the interface between quadstore, which implements the RDF/JS Source interface, and comunica, through which the former can be queried via SPARQL: https://github.com/comunica/comunica/blob/a2f31270778730911bd0bdf8b8c7ade567153319/packages/actor-rdf-resolve-quad-pattern-rdfjs-source/lib/RdfJsQuadSource.ts#L29 .

As many algebraic operations (matching, joining, ...) happen in memory, requiring large amounts of quads to be streamed from the persistence layer into the engine, removing even just one layer of indirection/buffering can make a significant difference.

jacoscaz commented 2 years ago

I should add that this PR is not ready to be merged, as it's lacking tests and I'm sure the code can be improved. However, it is ready for a first round of feedback to better direct my efforts. Tagging @jeswr, @rubensworks and @RubenVerborgh to keep everyone in the loop. BTW, this would have to be rebased on main once #59 gets merged.

jeswr commented 2 years ago

Things become much more interesting when we consider what happens with a chain established by wrap()ing:

I've also seen this pattern in the core of Comunica (with a few synchronous transforms in between), mainly in the context of handling iterators that are wrapped in a promise. It would be good to have benchmarks on that case as well - i.e.

let i = 0;
const arr = new Array(200_000).fill(true).map(() => i++);
const source = new ArrayIterator(arr);
const opts = { letIteratorThrough: false };
const then = Date.now();
wrap(Promise.resolve(wrap(Promise.resolve(wrap(Promise.resolve(wrap(Promise.resolve(source), opts)), opts)), opts)), opts).on('data', () => {}).on('end', () => {
  console.log('elapsed', Date.now() - then, 'ms');
});
jacoscaz commented 2 years ago

@jeswr I've tweaked your test a bit to ignore the time spent establishing the chain, just in case it could make a difference (likely insignificant, but still).

let i = 0;
const arr = new Array(200_000).fill(true).map(() => i++);

const source = new ArrayIterator(arr);
const opts = { /* letIteratorThrough: true */ };

const chain = wrap(Promise.resolve(wrap(Promise.resolve(wrap(Promise.resolve(wrap(Promise.resolve(source), opts)), opts)), opts)), opts);

chain.once('readable', () => {
  const then = Date.now();
  chain.on('data', () => {}).on('end', () => {
    console.log('elapsed', Date.now() - then, 'ms');
  });
});

This test runs in ~450 ms with the current main branch and ~12 ms with this PR. letIteratorThrough makes no difference here: having to return an iterator synchronously forces the use of WrappingIterator even for AsyncIterator sources, since we still need to wait for the promise to resolve asynchronously.
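
For context: wrap() has to hand back an AsyncIterator synchronously even when the source is still a promise, so conceptually it does something along these lines (hypothetical names, not the actual code in this PR):

// Conceptual sketch only; PendingWrapper and setSource are hypothetical names.
// wrap() must return an iterator immediately, so a promised source can only
// start feeding it once the promise has resolved.
function wrapPromisedSourceSketch(promisedSource, opts) {
  const pending = new PendingWrapper(opts);   // empty wrapper, not yet readable
  promisedSource.then(
    source => pending.setSource(source),      // only from this point can items be pulled
    error => pending.emit('error', error),
  );
  return pending;                             // handed back before the source even exists
}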

jacoscaz commented 2 years ago

Not sure what's going on with Coveralls, but this is probably ready for review by @RubenVerborgh (in due time, after #59 is merged). I'll resume working on cloning.

jacoscaz commented 2 years ago

@RubenVerborgh I have just rebased this one on main. I'm not sure what is happening with the failing test on Node 10.x, but I would prefer that you take a look before I do further work on this.

RubenVerborgh commented 2 years ago

Thanks, removed the second option in 16fcfb638a97df3b26cb11aa3d4da060b120c146 and will talk to Jesse over lunch about the other one!

jacoscaz commented 2 years ago

Awesome!

jeswr commented 2 years ago

Thanks, removed the second option https://github.com/RubenVerborgh/AsyncIterator/commit/16fcfb638a97df3b26cb11aa3d4da060b120c146 and will talk to Jesse over lunch about the other one!

The request for that was in relation to performance when wrapping N3.js.

By default, wrap(store.match(...)) would end up using fromAsyncReadable, which in turn uses buffering within the WrappingIterator (all results get buffered into an array as a consequence), as opposed to just using the IterableWrapper, which is much faster because it pulls out quads on demand. IIRC we cannot change the default prioritization order for backwards compatibility reasons.

The main consequence of removing this is that downstream clients like Comunica will sometimes need to write

if (isIterable(source) || isIterator(source))
  return fromIterable<T>(source);
else
  return wrap<T>(source);

rather than

return wrap(source, { prioritizeIterable: true })
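
To make the buffering difference concrete, here is roughly what the two paths look like for an N3.js store (sketch only; wrap and fromIterable as in the snippets above, and the result of store.match() can be consumed both as a stream and as an iterable):

// Sketch of the two strategies discussed above for an N3.js store.
import { Store } from 'n3';

const store = new Store();
const quads = store.match(null, null, null);

// Default path: treated as an async Readable, so quads get buffered
// internally before being re-emitted downstream.
const viaReadable = wrap(quads);

// Iterable path: quads are pulled out one at a time, only when requested,
// with no intermediate buffering.
const viaIterable = fromIterable(quads);
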
RubenVerborgh commented 2 years ago

Thanks @jeswr; I'll take a look at the backwards compat thing too then. But we should def put performance-friendly options first. (Also this will be one of the last changes before semver.major, so then we can happily break.)

jacoscaz commented 2 years ago

Thanks @jeswr; I'll take a look at the backwards compat thing too then. But we should def put performance-friendly options first. (Also this will be one of the last changes before semver.major, so then we can happily break.)

Thanks @jeswr for the refresher. I hadn't realized we already have a use case for the prioritizeIterable flag. Long-term, I suspect more such flags might follow (e.g. for native async iteration). Perhaps an explicit ordering might be better here? Something like priority: ['iterable', 'asynciterable', 'array', ...]?
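
Purely as an illustration of what I mean (hypothetical API, not something in this PR), such an ordering could drive the dispatch directly:

// Hypothetical sketch, not part of this PR: try the requested wrapping
// strategies in order and use the first one whose check matches the source.
function wrapWithPriority(source, priority, strategies) {
  for (const kind of priority) {
    const strategy = strategies[kind];        // e.g. strategies.iterable, strategies.array, ...
    if (strategy && strategy.matches(source)) // hypothetical per-strategy type check
      return strategy.wrap(source);
  }
  throw new TypeError('Unsupported source type');
}

// e.g. wrapWithPriority(source, ['iterable', 'asynciterable', 'array'], strategies)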

RubenVerborgh commented 2 years ago

@jacoscaz @jeswr Done my in-depth review—what do you think?

jacoscaz commented 2 years ago

I like where this ended up! Thanks @RubenVerborgh, +1 from me to merge, and I'll try to rebase #78 post-merge as soon as possible so that it can also be merged, completing the work on wrapping.

RubenVerborgh commented 2 years ago

A major thank you to @jacoscaz as well as @jeswr for your efforts and patience.

I realize it has taken me a lot of time to review and complete this PR, so I definitely wanted to emphasize my appreciation for helping out with open source.

Because of its precise and high-performance nature, AsyncIterator is a special project in the sense that every PR tends to require one uninterrupted day for me to work on it. There's a lot of thinking work in finding which code makes it easiest to reason about correct behavior in a wide array of edge cases. So if you find me replacing or rewriting large chunks of code you have contributed, it's not you—it's me! Basically, the hard work is finding the initial code that works and covering it with an appropriate set of extensive tests, which is what you've done. Then it's my job to re-interpret the code that satisfies those tests, and that's what tends to take me quite some time to get right.

Thanks for bearing with me, and I look forward to getting to the next PRs!

RubenVerborgh commented 2 years ago

Published as v3.6.0 🥳

jacoscaz commented 2 years ago

Because of its precise and high-performance nature, AsyncIterator is a special project in the sense that every PR tends to require one uninterrupted day for me to work on it.

@RubenVerborgh thank you for this library in the first place and all the effort that goes into maintaining it! I enjoy your reviews as they always lead to objectively better code from a correctness / performance standpoint. I also enjoy seeing the same code rewritten in a style that is not the one I started out with - there's always a great deal to learn.