oskardudycz opened this issue 3 years ago
The four options are:

```ts
// Option 1
await client.readStreamToArray('my-stream');
```

```ts
// Option 2
await client.readStream('my-stream').toArray();
```

```ts
// Option 3
import { streamToArray } from '@eventstore/db-client';

const eventStream = client.readStream('my-stream');
const eventArray = await streamToArray(eventStream);
```

```ts
// Option 4
import { asyncToArray } from 'iter-tools';

const events = client.readStream('my-stream');
const eventArray = await asyncToArray(events);
```
1: We need to be careful to name it something that won't break with future versions of JS (so not `toArray`). https://github.com/tc39/proposal-iterator-helpers#toarray
Proposal:

Extend streaming read with `collect`:

```ts
export interface StreamingRead<E> extends Readable {
  // ...
  collect(): Promise<E[]>;
  collect<T>(
    reducer: (acc: T, event: E, i: number, self: this) => T,
    initialValue: T
  ): Promise<T>;
}
```
This allows you to:

Easily collect to an array:

```ts
const eventArray = await client.readStream("my-stream").collect();
```

Collect to something other than an array:

```ts
const eventSet = await client
  .readStream("my-stream")
  .collect((acc, event) => acc.add(event), new Set());
```

Lazy map:

```ts
const ids = await client
  .readStream("my-stream")
  .collect((acc, { event }) => [...acc, event.id], []);
```

Lazy filter:

```ts
const wanted = await client.readStream("my-stream").collect((acc, { event }) => {
  if (event.data.isWanted) {
    acc.push(event);
  }
  return acc;
}, []);
```
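For reference, a minimal sketch of how `collect` could be implemented on top of the stream's async iterator (the standalone helper shape is illustrative, not the actual client internals):

```ts
// Sketch only: relies on Node's Readable being async-iterable.
async function collect<E, T>(
  stream: AsyncIterable<E>,
  reducer: (acc: T, event: E, i: number) => T,
  initialValue: T
): Promise<T> {
  let acc = initialValue;
  let i = 0;
  for await (const event of stream) {
    acc = reducer(acc, event, i++);
  }
  return acc;
}
```

The zero-argument overload then falls out naturally as `collect((acc, e) => (acc.push(e), acc), [])`.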
I'm good with the `collect` method proposed above 👍

I would suggest trying to leverage `Array.from`, since it is meant to be used for such cases, if possible. Otherwise, I would rather read a `toArray` that takes no arguments.
Also, add `.reduce` as well if you want to give people the opportunity to reduce the events. It feels odd to reuse `collect`, or to use the word `collect` at all, for such a use case. But `Array.from` already allows you to do such a thing too, if possible.
Also, it seems there was some interest at some point in `.toArray` being a thing in JavaScript (Stage 2).
Hi @yordis,
Thanks for the input, we're still discussing this so any feedback is helpful / appreciated.
> I would suggest trying to leverage `Array.from` since it is meant to be used for such cases if possible.

As far as I am aware, `Array.from` is synchronous only, so it takes either something with a `length` attribute or something with an `@@iterator` method. `readStream` returns a Node.js `Readable` stream, which provides an `asyncIterator`, so it won't work with `Array.from`. Unless I am missing something?
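To illustrate the mismatch (a minimal sketch; `client` as in the examples above):

```ts
// Array.from consumes sync iterables or array-likes with a length:
const numbers = Array.from([1, 2, 3]); // [1, 2, 3]

// A Node.js Readable only exposes Symbol.asyncIterator, so Array.from
// can't drain it; the events have to be awaited one by one instead:
for await (const resolvedEvent of client.readStream('my-stream')) {
  // handle each event as it arrives
}
```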
> Feels odd to [..] use the word collect [..]

> [..] `.toArray` being a thing in JavaScript (Stage 2)

This was the reason for suggesting `collect` as a term, as mentioned in the footnote of this comment: I don't want to clash with any APIs that end up being added to `ReadableStream` at some point. `collect` is maybe a bit of a Rust-ism; perhaps `fold` would be a better choice.
If I am not mistaken, there have been a lot of conversations lately around making everything async; the information from TC39 is all over the place sometimes, so excuse me if I don't link enough info.

Collect does the work; honestly, I don't mind `collect`.

They are opening up the APIs and extending them just so people align around them, so it could be worth seeing what TC39 is up to and maybe following the guidelines so far; it is Stage 2.
After changing the `readStream` and `readAll` methods to return a Node.js stream instead of an array, it's impossible to get an array of events without a handcrafted helper method or an external library (e.g. https://www.npmjs.com/package/iter-tools). Currently you have to iterate through it manually.
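A minimal sketch of that iteration (illustrative; assumes the v2 client API used elsewhere in this issue):

```ts
const events = client.readStream('my-stream');

const eventArray = [];
for await (const resolvedEvent of events) {
  eventArray.push(resolvedEvent);
}
```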
It's impossible to get the array immediately, or to use methods like `map` and other transformations, without reaching for e.g. Node.js stream transformations. I think that we should add a helper method, named `toArray` or `collect` or something else, that wraps the iteration above (or does it in a smarter way).
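A minimal sketch of such a wrapper (the name and standalone shape are illustrative):

```ts
// Hypothetical helper: drain any async iterable into an array.
async function toArray<T>(source: AsyncIterable<T>): Promise<T[]> {
  const result: T[] = [];
  for await (const item of source) {
    result.push(item);
  }
  return result;
}

const eventArray = await toArray(client.readStream('my-stream'));
```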
The other option is to have a `take` method with the number of events we'd like to take, but I don't see that being useful. From my experience, 99% of the time you just want to read the whole stream to rebuild the state (see the sketch below), and you won't know how many events the stream contains. Plus, we already removed this need by making `maxCount` optional (as in the other clients). If we add the limitation, people will still need to write their own helpers to page through the event streams.

I understand that we should have a memory-efficient option to fulfil the needs of people with longer streams. However, I also believe we should enable people with properly short streams to do this easily. Of course, other packages do it more efficiently; we can mention them in the docs or use them internally. I think we should not optimise our API for edge cases. The entry point to using EventStoreDB is already high enough; we should lower it.
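To make that 99% case concrete, a sketch of rebuilding state from the whole stream (`applyEvent` and `initialState` are hypothetical stand-ins for a domain fold and its seed):

```ts
// Hypothetical: fold the full stream into the current state.
const events = await client.readStream('invoice-123').collect();
const state = events.reduce(applyEvent, initialState);
```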
This feature request is not a breaking change: people can keep using the Node.js streams as they do now in v2. It's just about making the usage more accessible.