oskardudycz opened this issue 3 years ago
The four options are:

```ts
// Option 1
await client.readStreamToArray('my-stream');
```

```ts
// Option 2
await client.readStream('my-stream').toArray();
```

```ts
// Option 3
import { streamToArray } from '@eventstore/db-client';

const eventStream = client.readStream('my-stream');
const eventArray = await streamToArray(eventStream);
```

```ts
// Option 4
import { asyncToArray } from 'iter-tools';

const events = client.readStream('my-stream');
const eventArray = await asyncToArray(events);
```
1: We need to be careful to name it something that won't break with future versions of JS (so not `toArray`). https://github.com/tc39/proposal-iterator-helpers#toarray
Proposal:

Extend streaming read with `collect`:

```ts
export interface StreamingRead<E> extends Readable {
  // ...
  collect(): Promise<E[]>;
  collect<T>(
    reducer: (acc: T, event: E, i: number, self: this) => T,
    initialValue: T
  ): Promise<T>;
}
```
This allows you to:

Easily collect to an array:

```ts
const eventArray = await client.readStream("my-stream").collect();
```

Collect to something other than an array:

```ts
const eventSet = await client
  .readStream("my-stream")
  .collect((acc, event) => acc.add(event), new Set());
```

Lazy map:

```ts
const ids = await client
  .readStream("my-stream")
  .collect((acc, { event }) => [...acc, event.id], []);
```

Lazy filter:

```ts
const wanted = await client.readStream("my-stream").collect((acc, { event }) => {
  if (event.data.isWanted) {
    acc.push(event);
  }
  return acc;
}, []);
```
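For reference, a minimal sketch of how `collect` could be implemented on top of the stream's async iterator (the standalone helper shape is illustrative, not the actual client internals):

```ts
// Sketch only: relies on Node's Readable being async-iterable.
async function collect<E, T>(
  stream: AsyncIterable<E>,
  reducer: (acc: T, event: E, i: number) => T,
  initialValue: T
): Promise<T> {
  let acc = initialValue;
  let i = 0;
  for await (const event of stream) {
    acc = reducer(acc, event, i++);
  }
  return acc;
}
```

The zero-argument overload then falls out naturally as `collect((acc, e) => (acc.push(e), acc), [])`.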
I'm good with the `collect` method proposed above 👍

I would suggest trying to leverage `Array.from`, since it is meant to be used for such cases, if possible. Otherwise, I would rather read a `toArray` that takes no arguments.
Also, add `.reduce` as well if you want to give people the opportunity to reduce the events. It feels odd to reuse `collect`, or to use the word `collect` at all, for such a use case. But `Array.from` already allows you to do such a thing too, if possible.
Also, it seems there was some interest at some point in `.toArray` being a thing in JavaScript (Stage 2).
Hi @yordis,
Thanks for the input, we're still discussing this so any feedback is helpful / appreciated.
> I would suggest trying to leverage `Array.from` since it is meant to be used for such cases if possible.

As far as I am aware, `Array.from` is synchronous only, so it takes either something with a `length` attribute or something with an `@@iterator` method. `readStream` returns a Node.js `Readable` stream, which provides an `asyncIterator`, so it won't work with `Array.from`. Unless I am missing something?
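To illustrate the mismatch (a minimal sketch; `client` as in the examples above):

```ts
// Array.from consumes sync iterables or array-likes with a length:
const numbers = Array.from([1, 2, 3]); // [1, 2, 3]

// A Node.js Readable only exposes Symbol.asyncIterator, so Array.from
// can't drain it; the events have to be awaited one by one instead:
for await (const resolvedEvent of client.readStream('my-stream')) {
  // handle each event as it arrives
}
```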
> Feels odd to [..] use the word collect [..]

> [..] `.toArray` being a thing in JavaScript (Stage 2)

This was the reason for suggesting `collect` as a term, as mentioned in the footnote of this comment: I don't want to clash with any APIs that end up being added to `ReadableStream` at some point. `collect` is maybe a bit of a Rust-ism; perhaps `fold` would be a better choice.
If I am not mistaken, there have been a lot of conversations lately around making everything async; the information from TC39 is all over the place sometimes, so excuse me if I don't link enough info.

Collect does the work; honestly, I don't mind `collect`.

They are opening up the APIs and extending them just so people align around them, so it could be worth seeing what TC39 is up to and maybe following the guidelines so far; it is Stage 2.
After changing the `readStream` and `readAll` methods to return a Node.js stream instead of an array, it's impossible to get an array of events without a handcrafted helper method or an external library (e.g. https://www.npmjs.com/package/iter-tools). Currently you have to iterate through it manually.
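A minimal sketch of that iteration (illustrative; assumes the v2 client API used elsewhere in this issue):

```ts
const events = client.readStream('my-stream');

const eventArray = [];
for await (const resolvedEvent of events) {
  eventArray.push(resolvedEvent);
}
```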
It's impossible to get the array immediately, or to use methods like `map` and other transformations, without reaching for e.g. Node.js stream transformations. I think that we should add a helper method, named `toArray` or `collect` or something else, that wraps the iteration above (or does it in a smarter way).
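A minimal sketch of such a wrapper (the name and standalone shape are illustrative):

```ts
// Hypothetical helper: drain any async iterable into an array.
async function toArray<T>(source: AsyncIterable<T>): Promise<T[]> {
  const result: T[] = [];
  for await (const item of source) {
    result.push(item);
  }
  return result;
}

const eventArray = await toArray(client.readStream('my-stream'));
```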
The other option is to have a `take` method with the number of events we'd like to take, but I don't see that being useful. From my experience, 99% of the time you just want to read the whole stream to rebuild the state (see the sketch below), and you won't know how many events the stream contains. Plus, we already removed this need by making `maxCount` optional (as in the other clients). If we add the limitation, people will still need to write their own helpers to page through the event streams.

I understand that we should have a memory-efficient option to fulfil the needs of people with longer streams. However, I also believe we should enable people with properly short streams to do this easily. Of course, other packages do it more efficiently; we can mention them in the docs or use them internally. I think we should not optimise our API for edge cases. The entry point to using EventStoreDB is already high enough; we should lower it.
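To make that 99% case concrete, a sketch of rebuilding state from the whole stream (`applyEvent` and `initialState` are hypothetical stand-ins for a domain fold and its seed):

```ts
// Hypothetical: fold the full stream into the current state.
const events = await client.readStream('invoice-123').collect();
const state = events.reduce(applyEvent, initialState);
```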
This feature request is not a breaking change: people can keep using the Node.js streams as they do now in v2. It's just about making the usage more accessible.