linagora / esn-frontend-inbox

Webmail SPA for the OpenPaaS Suite - https://open-paas.org
Other
11 stars 19 forks source link

Cache strategy to reduce calls to the people API #447

Open vincent-zurczak opened 3 years ago

vincent-zurczak commented 3 years ago

With Inbox, all the email authors are searched through the people API. It occurs when the page opens up or when we scroll and retrieve older emails. This API is used to find people names and avatars.

It is not clear at the moment if the people API will remain and if there will be common services. But the short-term deadlines require to improve the current implementation, and using the browser's local storage to store information would save some requests. When there are thousands of users, saving invocations makes a change.

The idea is that inbox should save a map with user names and their avatar's address. If the author of an email is not found locally, then it will be asked to the people API and added into the cache. Even users that are not known by the people API (i.e. they are not part of the domain, work for another company, ...) should be added in the local cache.

Cache entries should have a time-to-live (3 days seems reasonable).

chamerling commented 3 years ago

Some comments:

The ESN already provides a Cache service, but it caches objects in memory with a TTL. When the item is not available in cache or when cache expires, it will call a function which resolve with a resource (the person in our case).

I suggest that the ESN provides a new Cache storage based on some browser storage, which means that we can keep the same Cache API but choose to use the in memory one, or the storage based one.

There are pretty good libraries which abstracts storage and support queries, one of them is dexie (IndexedDB wrapper, I already used it in a side project, pretty cool to use). You can use whatever you want, but please do not write your own serializer (ie do not JSON.stringify stuff yourself, this is time consuming and useless), use a library which does things for you.

chibenwa commented 3 years ago

Please specify the calls that we want to cache...

Anyway we should plan a way to review the usefulness of such a cache, and closely monitor hit rates. I imagine we need applicative metrics to monitor this...

Also, a cache partially solves the problem upon search. Suppose that I search bt that auto-completes in btellier and btallendier. Now imagine that only btellier is cached, and not btallendier. Reads to the People API will be needed to get the missing search results. The only thing that can be done is optimistically return the hits we know for sure first. Which is not possible with an HTTP transport mechanism.

To be fair, a good idea would be streaming the results (eg over webSocket). Each sources of the people API would stream its search results at its own pace, and we would only need to add a search result provider for the PEOPLE CACHE. That way, we would no longer be impacted by the slowest result

I suggest that the ESN provides a new Cache storage based on some browser storage, which means that we can keep the same Cache API but choose to use the in memory one, or the storage based one.

:+1:

Arsnael commented 3 years ago

@chibenwa : I think initially @vincent-zurczak seemed quite clear on the scope of the issue. By search, he meant the retrieve of people's profiles to match for example the author email with his name and avatar that you can see when in your message list in inbox, this part: screenshot-openpaas linagora com-2021 05 04-10_18_26

He is not talking about the search search function :)

Which means we have the mail author, we know what user it is, it should be easy to cache and retrieve instead of constantly calling the API, which is what Inbox seems to do for now.

chibenwa commented 3 years ago

True but I remember the audit pin-pointed also the people search API

Arsnael commented 3 years ago

True but I remember the audit pin-pointed also the people search API

That should be an other issue I believe then, as it is much more complex as you explained above

vincent-zurczak commented 3 years ago

There are several actions undertaken to improve the people API. We even discussed its removal or replacement. Fore sure, a streaming approach would be better and a final target. But the deadline is not so far and there are lots of things to do. The biggest issue is who is responsible for this API? Inbox? Calendar?

Right now, it is used by both. And LinShare and Twake could also be users of this API. Streaming goes along with a (server-side) cache approach, because even with streaming, we would not handle the load IMO. That would be for feature team stamped "common services".

Like @Arsnael said, this issue is about improving user information display (users are retrieved one by one), not searching users by general criteria. We are not redesigning this API at this stage. :)

chamerling commented 3 years ago

OK, we do not speak about search here, we speak about the People resolve endpoint (https://host:port/api/people/resolve/emailaddress/someone@linagora.com) which is called for each element of the email list. Such call can be easily cached, because we do not care if the user changed its avatar and if we update its avatar in the email list some hours/days after.

chibenwa commented 3 years ago

Then I completly buy

I suggest that the ESN provides a new Cache storage based on some browser storage, which means that we can keep the same Cache API but choose to use the in memory one, or the storage based one.

There are pretty good libraries which abstracts storage and support queries, one of them is dexie (IndexedDB wrapper, I already used it in a side project, pretty cool to use). You can use whatever you want, but please do not write your own serializer (ie do not JSON.stringify stuff yourself, this is time consuming and useless), use a library which does things for you.

Sorry for mixing topics @chamerling , I only saw the other ticket after...