sindresorhus / p-map

Map over promises concurrently
MIT License
1.28k stars 58 forks source link

Possible inefficiency in `pMapIterable` #78

Open dmitri-gb opened 1 month ago

dmitri-gb commented 1 month ago

pMapIterable can waste some concurrency slots when the source async iterable produces items slowly.

Here's an example:

import { setTimeout } from 'timers/promises';
import { pMapIterable } from 'p-map';

async function* source() {
  for (let i = 1; i <= 2; ++i) {
    console.log('Producing item', i);
    await setTimeout(2000);
    yield i;
  }
}

const target = pMapIterable(source(), async n => {
  console.log('Running with', n);
  await setTimeout(2000);
  console.log('Finished running with', n);
}, { concurrency: 1 });

for await (const _ of target) { }

The above program outputs:

Producing item 1
Running with 1
Finished running with 1
Producing item 2
Running with 2
Finished running with 2

The source async iterable only starts producing the second item after the mapper function has handled the first one, meaning there's a 2 second pause between the two mapper invocations. The total running time of the above code is ~8 seconds.

There is no reason though, why the producing of the second item couldn't start sooner, so that by the time the first mapper invocation finishes, the second one wouldn't have to be delayed by so long. In other words, the producing of the next item and the mapping of the current item(s) could overlap. The running time of the above example could be reduced to ~6 seconds.

The effect of this issue becomes more pronounced when creating a pipeline of pMapIterable steps. Following is an example where 10 items are piped through 3 different mappings (with concurrency 1).

import { setTimeout } from 'timers/promises';
import { pMapIterable } from 'p-map';

async function* source() {
  for (let i = 1; i <= 10; ++i) {
    console.log('Producing item', i);
    await setTimeout(1000);
    yield i;
  }
}

const stepA = pMapIterable(source(), async n => {
  console.log('Begin A with', n);
  await setTimeout(1000);
  console.log('End A with with', n);
  return n;
}, { concurrency: 1 });

const stepB = pMapIterable(stepA, async n => {
  console.log('Begin B with', n);
  await setTimeout(1000);
  console.log('End B with', n);
  return n;
}, { concurrency: 1 });

const stepC = pMapIterable(stepB, async n => {
  console.log('Begin C with', n);
  await setTimeout(1000);
  console.log('End C with', n);
  return n;
}, { concurrency: 1 });

for await (const _ of stepC) { }

It takes ~22 seconds with the current pMapIterable implementation. This could be reduced to ~13 seconds, which is the required minimum time (10s to produce 10 items + 3s to map the last item).

FYI @Richienb

Richienb commented 1 month ago

I think having two backpressures, one for function input and one for function output might be the way to go.

https://github.com/sindresorhus/p-map/pull/77#issuecomment-2290862306