RubenVerborgh / AsyncIterator

An asynchronous iterator library for advanced object pipelines in JavaScript
https://rubenverborgh.github.io/AsyncIterator/docs/

Asynchronous iterators for JavaScript


AsyncIterator is a lightweight JavaScript implementation of demand-driven object streams, and an alternative to the two-way flow-controlled Node.js Stream. Unlike with Stream, you cannot push anything into an AsyncIterator; instead, an iterator pulls things from another iterator. This eliminates the need for expensive, complex flow control.

Read the full API documentation.

Data streams that only generate what you need

AsyncIterator allows functions to return multiple asynchronously and lazily created values. This adds a missing piece to JavaScript, which natively supports returning a single value synchronously and asynchronously (through Promise), but multiple values only synchronously (through Iterable):

               single value           multiple values
synchronous    T getValue()           Iterable<T> getValues()
asynchronous   Promise<T> getValue()  AsyncIterator<T> getValues()

Like Iterable, an AsyncIterator only generates items when you ask it to. This contrasts with patterns such as Observable, which are data-driven and don't wait for consumers to process items.
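
The pull-based behavior can be illustrated with a plain synchronous generator, the built-in Iterable counterpart. This is only a sketch: naturals and the generated counter are names made up for this illustration and are not part of the AsyncIterator library.

```javascript
// Count how many items the producer actually creates (illustrative only)
let generated = 0;

// A lazy Iterable of natural numbers: the body runs only when a consumer pulls
function* naturals() {
  for (let n = 0; ; n++) {
    generated++;
    yield n;
  }
}

// Pull exactly three items; nothing beyond them is ever generated
const firstThree = [];
for (const n of naturals()) {
  firstThree.push(n);
  if (firstThree.length === 3)
    break;
}

console.log(firstThree); // [ 0, 1, 2 ]
console.log(generated);  // 3
```

Even though the sequence is infinite, the producer did exactly as much work as the consumer asked for. A data-driven producer would have no such natural stopping point.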

The asynchronous iterator interface

An asynchronous iterator is an object that exposes a series of data items by:
- implementing EventEmitter
- returning an item when you call read (or null when none is available at the moment)
- signaling that new items might be available through the readable event
- signaling that no more items will become available through the end event

Any object that conforms to the above conditions can be used with the AsyncIterator library (this includes Node.js Streams). The AsyncIterator interface additionally exposes several other methods and properties.
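
As a rough sketch of what such a conforming object could look like, the class below wraps an array in an EventEmitter with a read method that returns the next item or null, and it emits readable and end events. ArrayBackedSource is a hypothetical name, not a class of the library, and the ended field is bookkeeping for this sketch only. The timing of the events (queued as microtasks) is likewise an assumption made to keep the example small.

```javascript
import { EventEmitter } from 'events';

// Hypothetical minimal source: exposes array items through read(),
// signals availability with 'readable' and completion with 'end'
class ArrayBackedSource extends EventEmitter {
  constructor(items) {
    super();
    this.items = [...items];
    this.ended = false;
    // Announce asynchronously so consumers can attach listeners first
    queueMicrotask(() => this.emit('readable'));
  }

  // Return the next item, or null when none is available
  read() {
    if (this.items.length > 0)
      return this.items.shift();
    if (!this.ended) {
      this.ended = true;
      // Emit 'end' once the consumer has read past the last item
      queueMicrotask(() => this.emit('end'));
    }
    return null;
  }
}

// Consume on demand: read until null, then wait for the next event
const source = new ArrayBackedSource(['a', 'b', 'c']);
const received = [];
source.on('readable', () => {
  let item;
  while ((item = source.read()) !== null)
    received.push(item);
});
const finished = new Promise(resolve => source.on('end', () => resolve(received)));
finished.then(items => console.log(items)); // [ 'a', 'b', 'c' ]
```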

Example: fetching Wikipedia links related to natural numbers

In the example below, we create an iterator of links found on Wikipedia pages for natural numbers.

import https from 'https';
import { resolve } from 'url';
import { IntegerIterator } from 'asynciterator';

// Iterate over the natural numbers
const numbers = new IntegerIterator({ start: 0, end: Infinity });
// Transform these numbers into Wikipedia URLs
const urls = numbers.map(n => `https://en.wikipedia.org/wiki/${n}`);
// Fetch each corresponding Wikipedia page
const pages = urls.transform((url, done, push) => {
  https.get(url, response => {
    let page = '';
    response.on('data', data => { page += data; });
    response.on('end',  () => { push(page); done(); });
  });
});
// Extract the links from each page
const links = pages.transform((page, done, push) => {
  let search = /href="([^"]+)"/g, match;
  while (match = search.exec(page))
    push(resolve('https://en.wikipedia.org/', match[1]));
  done();
});

We could display a link every 0.1 seconds:

setInterval(() => {
  const link = links.read();
  if (link)
    console.log(link);
}, 100);

Or we can get the first 30 links and display them:

links.take(30).on('data', console.log);

In both cases, pages from Wikipedia will only be fetched when needed—the data consumer is in control. This is what makes AsyncIterator lazy.

If we had implemented this using the Observable pattern, an entire flow of unnecessary pages would be fetched, because it is controlled by the data publisher instead.

Usage

AsyncIterator implements the EventEmitter interface and a superset of the Stream interface.

Consuming an AsyncIterator in on-demand mode

By default, an AsyncIterator is in on-demand mode, meaning it only generates items when asked to.

The read method returns the next item, or null when no item is available.

const numbers = new IntegerIterator({ start: 1, end: 2 });
console.log(numbers.read()); // 1
console.log(numbers.read()); // 2
console.log(numbers.read()); // null

If you receive null, you should wait until the next readable event before reading again. This event is not a guarantee that an item will be available.

links.on('readable', () => {
  let link;
  while (link = links.read())
    console.log(link);
});

The end event is emitted after you have read the last item from the iterator.

Consuming an AsyncIterator in flow mode

An AsyncIterator can be switched to flow mode by listening to the data event. In flow mode, iterators generate items as fast as possible.

const numbers = new IntegerIterator({ start: 1, end: 100 });
numbers.on('data', number => console.log('number', number));
numbers.on('end',  () => console.log('all done!'));

To switch back to on-demand mode, simply remove all data listeners.

Setting and reading properties

An AsyncIterator can have custom properties assigned to it, which are preserved when the iterator is cloned. This is useful to pass around metadata about the iterator.

const numbers = new IntegerIterator();
numbers.setProperty('rate', 1234);
console.log(numbers.getProperty('rate')); // 1234

const clone = numbers.clone();
console.log(clone.getProperty('rate'));   // 1234

numbers.setProperty('rate', 4567);
console.log(clone.getProperty('rate'));   // 4567

You can also attach a callback that will be called as soon as the property is set:

const numbers = new IntegerIterator();
numbers.getProperty('later', console.log);
numbers.setProperty('later', 'value');
// 'value'
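
To make the callback behavior concrete, here is a minimal sketch of how such a property store could work. It is not the library's implementation: PropertyStore is a hypothetical class, and it invokes waiting callbacks synchronously for simplicity, whereas the library may schedule them differently.

```javascript
// Hypothetical property store illustrating getProperty with a callback
class PropertyStore {
  constructor() {
    this.values = new Map();   // property name -> value
    this.pending = new Map();  // property name -> callbacks awaiting a value
  }

  // Without a callback, return the current value (undefined if unset);
  // with a callback, call it now if the value exists, or once it is set
  getProperty(name, callback) {
    if (!callback)
      return this.values.get(name);
    if (this.values.has(name))
      callback(this.values.get(name));
    else
      this.pending.set(name, [...(this.pending.get(name) ?? []), callback]);
    return undefined;
  }

  // Store the value and flush any callbacks that were waiting for it
  setProperty(name, value) {
    this.values.set(name, value);
    const callbacks = this.pending.get(name) ?? [];
    this.pending.delete(name);
    for (const callback of callbacks)
      callback(value);
  }
}

const store = new PropertyStore();
const seen = [];
store.getProperty('later', value => seen.push(value)); // nothing to report yet
store.setProperty('later', 'value');                   // waiting callback fires
console.log(seen);                       // [ 'value' ]
console.log(store.getProperty('later')); // value
```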

Consuming an AsyncIterator as an ECMAScript AsyncIterator

Because ECMAScript's native AsyncIterator comes with the for await syntax, our iterators can also be consumed that way. This method of consumption is not recommended when high performance over large iterators is required.

const numbers = new IntegerIterator({ start: 1, end: 100 });

for await (const number of numbers)
  console.log('number', number);
console.log('all done!');

Error events emitted within the iterator can be caught by wrapping the for-await-block in a try-catch.
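
The try-catch pattern can be sketched without the library by using a native async generator as a stand-in for an iterator that fails midway. failingNumbers and consume are hypothetical names for this illustration; with the library, the failure would originate from an error event instead of a thrown exception.

```javascript
// Stand-in for an iterator that fails midway (a native async generator,
// not an AsyncIterator from the library)
async function* failingNumbers() {
  yield 1;
  yield 2;
  throw new Error('source failed');
}

// Collect items until the source fails, then report what was seen
async function consume(source) {
  const seen = [];
  try {
    for await (const number of source)
      seen.push(number);
  }
  catch (error) {
    return { seen, error: error.message };
  }
  return { seen, error: null };
}

const outcome = consume(failingNumbers());
outcome.then(result => console.log(result));
// { seen: [ 1, 2 ], error: 'source failed' }
```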

In cases where the returned ECMAScript AsyncIterator will not be fully consumed, it is recommended to manually listen for error events on the main AsyncIterator to avoid uncaught error messages.

License

The asynciterator library is copyrighted by Ruben Verborgh and released under the MIT License.