coralproject / talk

A better commenting experience from Vox Media
https://coralproject.net

Caching strategy doesn't work well with many comments (no pagination) #4613

Closed ShittyAdvice closed 3 months ago

ShittyAdvice commented 3 months ago

Currently we have an article that's always open and gets updated daily; because of this, the article has accumulated over 33k comments.

Expected behavior: An article that is cached should still do some form of pagination. A cached article with many comments should not simply load and render every single comment. Preferably the caching strategy itself would implement some form of pagination; that way the client doesn't need any changes and can reuse the Load more comments button to trigger this functionality.

I'd expect at least some client-side pagination even when every comment is served straight from the cache, but preferably the pagination happens before that, so that the comments are stored in smaller 'buckets' and we don't waste so much bandwidth (a rough sketch follows below).

Actual behavior: When an article is cached, all comments are always loaded. In my example that means the browser ends up rendering all 33k comments when the embed is loaded. This makes the request to the client slow due to the amount of bandwidth needed, but it also puts a heavy strain on our Redis instance. Even though we have more than enough capacity and are not being throttled on memory or CPU, pulling all 33k comments out of Redis every time someone loads the article generates so much throughput and bandwidth that the Redis instance's average response time spikes to multiple minutes.
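To illustrate the kind of pagination I mean: suppose the cached comments for a story lived in a Redis sorted set scored by creation time (the key name and page size below are made up, not Coral's actual layout). The Load more comments button would then map to a cheap offset read instead of pulling all 33k members on every page view:

```ts
import Redis from "ioredis";

const redis = new Redis();
const PAGE_SIZE = 20; // comments per request (assumed tuning knob)

// Hypothetical key layout: one sorted set per story, scored by createdAt.
const commentsKey = (storyID: string) => `story:${storyID}:comments`;

// Fetch one page of cached comment IDs, newest first; the client's
// "Load more comments" button just asks for page + 1.
async function loadCommentPage(storyID: string, page: number): Promise<string[]> {
  const start = page * PAGE_SIZE;
  const stop = start + PAGE_SIZE - 1;
  return redis.zrevrange(commentsKey(storyID), start, stop);
}
```

With something like this, Redis only ever streams out PAGE_SIZE entries per request, no matter how large the story grows.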

Related Issues: No related issues found

Versions:

tessalt commented 3 months ago

Coral isn't really designed to support this many comments on a single story; the largest comment streams we test the software on are about 5k comments, and we expect longer load times at that point. I would recommend using the API to close and modify the story record (call updateStory to change the story URL to something else); Coral will then automatically create a new story for the original URL and you can start fresh. You could do this every day, or every week, or however long it takes to accumulate 2-5k comments.
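A rough sketch of that rotation as a small script against Coral's GraphQL API; check the exact updateStory input fields against your Coral version's schema, since the field names and URLs below are assumptions:

```ts
// Sketch only: verify the mutation shape against your instance's schema.
const CORAL_URL = "https://coral.example.com/api/graphql"; // your instance (assumed)
const ADMIN_TOKEN = process.env.CORAL_ADMIN_TOKEN; // admin API token

async function rotateStory(storyID: string): Promise<void> {
  const res = await fetch(CORAL_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${ADMIN_TOKEN}`,
    },
    body: JSON.stringify({
      query: `
        mutation RotateStory($input: UpdateStoryInput!) {
          updateStory(input: $input) {
            story { id url }
          }
        }`,
      variables: {
        input: {
          id: storyID,
          // Point the old story at a dated URL; the next embed load of the
          // original URL should make Coral create a fresh, empty story.
          story: { url: `https://news.example.com/live?archived=${Date.now()}` },
          clientMutationId: "rotate-story",
        },
      },
    }),
  });
  const { data, errors } = await res.json();
  if (errors) throw new Error(JSON.stringify(errors));
  console.log("old story now points at", data.updateStory.story.url);
}
```

You could run this on a daily cron so the live article starts with an empty stream each morning.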

ShittyAdvice commented 3 months ago

That does explain why you won't change the strategy or methods, but what about client-side pagination? You say you test with up to 5k comments, but loading and rendering 5k comments all at once is still both a lot of data and too much for many (slower) devices to handle.

ShittyAdvice commented 3 months ago

Regardless, I do believe it's worth investing in something like that in the long term. Is there a specific reason you chose, as a platform, not to support larger comment threads at all?

We also have many articles that don't exceed these limits and stay within a few hundred comments, and we are not caching the larger ones, yet we see Redis's average request times and throughput increase drastically the more capacity we use, despite still having plenty of memory and CPU available. It's almost as if it always loads everything and therefore gets slower the more it has stored.

Assuming I find the time to dive deeper into these issues and refactor this myself, doing something along the lines of storing comments in smaller buckets coupled with the "load more comments" button (or whatever works best with the current implementation), would you consider merging it if I opened a PR for it?
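To make that concrete, here's a minimal sketch of the bucket layout I have in mind (key names and bucket size are my own assumptions, not the current implementation):

```ts
import Redis from "ioredis";

const redis = new Redis();
const BUCKET_SIZE = 100; // comments per bucket (assumed)

// Hypothetical keys: a per-story counter plus fixed-size list buckets,
// e.g. story:{id}:comments:bucket:0, :1, :2, ...
const countKey = (storyID: string) => `story:${storyID}:comment-count`;
const bucketKey = (storyID: string, n: number) =>
  `story:${storyID}:comments:bucket:${n}`;

// On write, append the serialized comment to whichever bucket the
// running counter lands in.
async function cacheComment(storyID: string, commentJSON: string): Promise<void> {
  const count = await redis.incr(countKey(storyID));
  const bucket = Math.floor((count - 1) / BUCKET_SIZE);
  await redis.rpush(bucketKey(storyID, bucket), commentJSON);
}

// On read, "load more comments" pulls exactly one bucket at a time, so
// Redis throughput stays flat no matter how big the story gets.
async function loadBucket(storyID: string, bucket: number): Promise<string[]> {
  return redis.lrange(bucketKey(storyID, bucket), 0, -1);
}
```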

That would benefit your application both on lower-spec devices and at larger scale, while helping us keep our internal fork cleaner and a lot easier to maintain :)

tessalt commented 3 months ago

Unfortunately we don't have much time to dedicate to this project at the moment, so I can't guarantee we'll be able to review PRs in a timely manner, especially if core functionality is modified, but we will do our best. The caching and pagination strategies we've developed are based on the demands of the publications we support; they may not be the best strategy for all use cases.