waku-org / js-waku

JavaScript implementation of Waku v2
https://js.waku.org
Apache License 2.0

Have a field to fetch last-n messages from Waku-Store in QueryOptions. #307

Closed ritvij14 closed 3 years ago

ritvij14 commented 3 years ago
This is a change request/support request.

Problem

Right now, when fetching messages from waku-store using js-waku, loading takes quite a while if the number of messages is high.

Proposed Solutions

So can we have an extra query option which only fetches the last-N messages from the DB?

Something like:

waku.store.queryHistory([ContentTopic], { count: 20 });
// Fetches only the last 20 messages; if there are fewer than 20, all are fetched.

D4nte commented 3 years ago

Thanks for your interest in building with js-waku!

I realized that the documentation on WakuStore.queryHistory is not being rendered on the doc website, which makes the API difficult to understand. I will fix this with https://github.com/status-im/js-waku/issues/306.

Regarding the problem you are facing, you mentioned that "it takes a while to load the messages".

If you retrieve messages by processing the returned value of WakuStore.queryHistory:


const messages = await waku.store.queryHistory([contentTopic]);
processMessages(messages);

// Or

waku.store.queryHistory([contentTopic]).then((messages) => {
  processMessages(messages);
});

Then indeed, processMessages (your custom function) will only be called once all messages have been retrieved. That only happens after several queries to the Waku store server, because the store uses pagination and queryHistory walks through and retrieves all pages.

First, be sure to pass a content topic to your query so that you only retrieve messages relevant to your app/function. Also be sure to use a content topic that is specific to your app. I would not recommend using existing content topics such as /toy-chat/2/huilong/proto. Happy to help if you have questions about content topics. Please check this guide: https://github.com/status-im/js-waku/blob/main/guides/choose-content-topic.md

There are two ways to retrieve messages faster. Both methods can be used separately or together:

1. Use the callback parameter.

Instead of passing the result of WakuStore.queryHistory to your custom hook (processMessages), you can pass the hook as the callback parameter:

waku.store.queryHistory([ContentTopic], { callback: processMessages });

Now, processMessages will be called for each retrieved page of results. This means it will be called sooner (as soon as the first page of results is received) and multiple times (once for each received page).

Note that you can also specify the size of the pages with the pageSize property. It defaults to 10.

You can try decreasing the page size to get the first result (and hence the first call to processMessages) sooner, or increasing it to process more messages per processMessages call.

e.g.:

waku.store.queryHistory([ContentTopic], { callback: processMessages, pageSize: 5 });
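
To make the paging behaviour concrete, here is a minimal sketch that does not talk to a real store node. `fakeQueryHistory` and `countCallbackCalls` are hypothetical names, not part of js-waku. The fake simply serves an in-memory array page by page, assuming (as described above) that the callback is invoked once per retrieved page:

```javascript
// Hypothetical stand-in for waku.store.queryHistory: serves `stored` page by page.
// It is NOT the js-waku implementation, only an illustration of the paging behaviour.
async function fakeQueryHistory(stored, { callback, pageSize = 10 } = {}) {
  const all = [];
  for (let start = 0; start < stored.length; start += pageSize) {
    const page = stored.slice(start, start + pageSize);
    if (callback) callback(page); // called once per retrieved page
    all.push(...page);
  }
  return all; // the full result is still returned at the end
}

// Count how many times the callback fires for 24 stored messages.
async function countCallbackCalls(pageSize) {
  const stored = Array.from({ length: 24 }, (_, i) => `message ${i}`);
  let calls = 0;
  await fakeQueryHistory(stored, { callback: () => { calls += 1; }, pageSize });
  return calls;
}

countCallbackCalls(10).then((calls) => console.log(calls)); // 3 (pages of 10, 10, 4)
countCallbackCalls(5).then((calls) => console.log(calls)); // 5 (pages of 5, 5, 5, 5, 4)
```

A smaller page size means more, smaller callback invocations; a larger one means fewer, bigger ones.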

2. Use time window filtering

The timeFilter option lets you filter results by message send time. This makes it easier to retrieve only the last hour or day of messages, and then run further queries if your app needs to fetch further into the past.

This option takes JavaScript Dates as input.

E.g.:


const endTime = new Date();
const startTime = new Date();
// Set `startTime` to 24 hours in the past: 24 hours * 60 minutes * 60 seconds * 1000 milliseconds
startTime.setTime(startTime.getTime() - 24 * 60 * 60 * 1000);

// Retrieve messages for the past day
const messages = await waku.store.queryHistory([contentTopic], {
    timeFilter: { startTime, endTime },
});
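
If the app queries several windows (last hour first, then last day), the date arithmetic can be wrapped in a small helper. This is only a sketch: `lastHours` is a hypothetical function, not part of js-waku, and it just returns an object in the `{ startTime, endTime }` shape used above.

```javascript
// Hypothetical helper (not part of js-waku): build a { startTime, endTime }
// pair covering the last `hours` hours, ending at `now`.
function lastHours(hours, now = new Date()) {
  const endTime = new Date(now.getTime());
  // hours * 60 minutes * 60 seconds * 1000 milliseconds
  const startTime = new Date(now.getTime() - hours * 60 * 60 * 1000);
  return { startTime, endTime };
}

const lastDay = lastHours(24);
console.log(lastDay.endTime - lastDay.startTime); // 86400000 (24 hours in ms)

// Intended use, assuming the timeFilter option described above:
// waku.store.queryHistory([contentTopic], { timeFilter: lastHours(1) });
```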

Now, regarding the API you suggested. Is it still needed considering the options I described above?

I am just not sure that asking for a given number of messages is going to be that helpful without making the API more complex. Indeed, if you ask for the 10 most recent messages and the message you are looking for is not among them, then what next?

  1. Would you need an API to access the 11th to 20th most recent messages?
  2. Would you extract the time of the 10th message and query for messages older than that?

If the methods I proposed in the first section do not help, can you please describe your use case a bit more, so we can see if we can design an API that better fits your needs?

Cheers

ritvij14 commented 3 years ago

@D4nte thanks for the explanation, there are 2 things I would like to discuss:

  1. Regarding the callback, it might make things faster for me, but I think I am making a mistake here, so it would be great if you could point it out. I am using this function as the callback, but the array of messages I get inside .then() is not the processed one.

    const processMessages = (messages) => {
      messages.map((msg) => {
        if (!msg.payload) return;

        const { timestamp, text, sender } = proto.SimpleChatMessage.decode(
          msg.payload
        );

        const time = new Date();
        time.setTime(timestamp);

        const utf8Text = Buffer.from(text).toString("utf-8");

        msg = { timestamp, text: utf8Text, sender };
      });
    };
  2. Regarding your last point, I am working on the chat feature of a live-streaming platform. We don't want the user to wait too long while the messages are fetched, and we know that the frequency at which messages are sent on livestreams depends on the streamer's popularity.

So in a scenario where a viewer joins a little late and misses, let's say, 200 messages, the platform will need time to fetch those 200 messages. But we know that they usually aren't relevant, so just the last 20 messages would be more than enough. That is why I asked for a last-N sort of query.

D4nte commented 3 years ago

Regarding 1:

"I am using this function as callback, but the array of messages I get inside .then() is not the processed one."

Sorry, I don't understand. If you are using the callback option, then you should not need a then.

It should just be:

waku.store.queryHistory([ContentTopic], { callback: processMessages });

But the processMessages function currently does not do anything with msg. You would need to save it in a state or something.

Note that the callback option does not alter the messages returned by queryHistory. The callback option gives you the opportunity to process messages as they are retrieved.
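
As a rough illustration of "save it in a state or something", the sketch below pushes each decoded message into an array owned by the caller. The names `makeProcessMessages`, `fakeDecode` and `received` are hypothetical; `decode` stands in for whatever decoder the app uses (such as proto.SimpleChatMessage.decode in the snippet above), and a JSON decoder is used here only so the example runs on its own in Node.js.

```javascript
// Sketch only: `decode` is a placeholder for the app's real payload decoder,
// and `store` is any array-like state the app wants the messages saved into.
function makeProcessMessages(decode, store) {
  return (messages) => {
    for (const msg of messages) {
      if (!msg.payload) continue; // skip messages without a payload
      const { timestamp, text, sender } = decode(msg.payload);
      store.push({
        timestamp: new Date(timestamp),
        text: Buffer.from(text).toString("utf-8"),
        sender,
      });
    }
  };
}

// Usage with a fake JSON decoder standing in for the protobuf one.
const received = [];
const fakeDecode = (payload) => JSON.parse(Buffer.from(payload).toString("utf-8"));
const processMessages = makeProcessMessages(fakeDecode, received);

processMessages([
  { payload: Buffer.from(JSON.stringify({ timestamp: 0, text: "hi", sender: "alice" })) },
  { payload: undefined }, // ignored: no payload
]);
console.log(received.length); // 1
```

The resulting function can then be passed as the callback option, and the app reads the processed messages from `received` rather than from the return value of queryHistory.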

D4nte commented 3 years ago

Please check 4d44cff (#308); hopefully it helps.

D4nte commented 3 years ago

"Regarding your last point, actually I am working on the chat feature of a live-streaming platform, so now the thing is that we don't want the user to wait for too long while the messages are fetched, and also we know that the frequency at which messages are sent on livestreams depends on the streamer's popularity."

Please have a go at my proposal above in combination with timeFilter and let me know if this is still needed.

D4nte commented 3 years ago

@ritvij14 how is it going?

I am actually facing a situation where I may need an API similar to what you need. What about changing the callback type to (messages: WakuMessage[]) => void|boolean and if callback returns true we stop querying pages?

You can then keep a counter on your side to stop after you have retrieved N messages.
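
A minimal sketch of that proposal, run against an in-memory array rather than a real store node. `fakePagedQuery` and `collectUpTo` are hypothetical names used only for illustration; the early-stop rule (stop when the callback returns true) follows the idea described above and is not taken from the js-waku source.

```javascript
// Hypothetical paged query that stops as soon as the callback returns true.
// This mirrors the proposal above; it is not the js-waku implementation.
async function fakePagedQuery(stored, { callback, pageSize = 10 }) {
  for (let start = 0; start < stored.length; start += pageSize) {
    const stop = callback(stored.slice(start, start + pageSize));
    if (stop === true) return; // no further pages are requested
  }
}

// Build a callback that collects messages and asks to stop after `limit` of them.
function collectUpTo(limit, sink) {
  return (page) => {
    sink.push(...page);
    return sink.length >= limit; // counter kept on the caller's side
  };
}

const collected = [];
fakePagedQuery(
  Array.from({ length: 200 }, (_, i) => i),
  { callback: collectUpTo(20, collected), pageSize: 10 }
).then(() => console.log(collected.length)); // 20: two pages of 10, then stop
```

Because whole pages are delivered, the caller may end up with slightly more than N messages when N is not a multiple of the page size, so the callback compares with `>=` rather than `===`.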

ritvij14 commented 3 years ago

@D4nte so the callback works great right now, and the fetching is faster. But sometimes not all messages are fetched and sometimes they are, so the user needs to reload to get them.

Also yes, your idea sounds good. In our case, during a livestream there was a chance that 1-2 messages would be missed when fetching messages by send time, and we would also have no control over the number of messages being fetched, so this modification to the callback might actually be better.

D4nte commented 3 years ago

"@D4nte so the callback works great right now, and the fetching is faster. But sometimes not all messages are fetched and sometimes they are, so the user needs to reload to get them."

Feel free to report the errors you see when messages do not get fetched.

D4nte commented 3 years ago

See https://github.com/status-im/js-waku/pull/310

The test demonstrates how to query a limited number of messages.

ritvij14 commented 3 years ago

@D4nte thanks! One question though, this is my current code for the callback function:

const processWakuHistory = (retrievedMessages) => {
  const messages = retrievedMessages
    .map((msg) => processMsgs(msg))
    .filter(Boolean);

  setWakuMsgs((waku) => {
    return messages.concat(waku);
  });

  if (wakuMsgs.length === 20) return true;
};

But on one of my content topics, I sent 24 messages and all of them got fetched. Am I doing something wrong?

D4nte commented 3 years ago

Are you using js-waku master? Because I haven't released a new version of js-waku containing the code change.

The messages are still retrieved per page. By default, 1 page contains 10 messages. You can change that with the pageSize parameter.

You use .filter(Boolean), which means that messages may contain fewer than 10 messages at a time.

Let's say that on the first page you filter 1 message out, so wakuMsgs.length is 9. On the second page you filter 0 messages out, so wakuMsgs.length is 19. The check wakuMsgs.length === 20 has not been true yet, so the third page is retrieved (24 messages fit in 3 pages). Hence, you have retrieved all messages.
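
The arithmetic can be checked with a few lines. The per-page counts below are an assumption matching the example above (24 messages, pageSize 10, exactly one message filtered out of the first page), and `runningTotals` is a hypothetical helper:

```javascript
// Running totals after each page, given how many messages were kept per page.
function runningTotals(keptPerPage) {
  let total = 0;
  return keptPerPage.map((kept) => (total += kept));
}

// Pages of 10, 10 and 4 messages, with one message dropped from the first page.
const totals = runningTotals([9, 10, 4]);
console.log(totals); // [ 9, 19, 23 ]
console.log(totals.some((t) => t === 20)); // false: the strict check never fires
console.log(totals.findIndex((t) => t >= 20)); // 2: ">=" first fires on the third page
```

With these numbers the total jumps from 19 to 23, so a strict equality check against 20 is skipped over, while a `>=` comparison would still trigger.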

D4nte commented 3 years ago

The change is now released in 0.14.0. I see you also fixed your code. Closing, please re-open or open a new issue if you face any other problem.