cyrildewit / eloquent-viewable

Associate views with Eloquent models in Laravel
MIT License
828 stars 105 forks source link

Delete records from views table with millions of records #300

Open Rattone opened 3 months ago

Rattone commented 3 months ago

In one project, +6 million views were reached in less than 2 years, resulting in a table over 1GB. What is the best way to store, for example in some table field, visits older than a year and then delete them?

cyrildewit commented 3 months ago

Hey @Rattone,

Wow that's an impressive amount of views. Approximately 8.000 views per day (if not mistaken).

When I created this package I didn't had real use-case and therefore never experienced performance issues on a large scale. I did think about this problem for some time, but never persuaded with implementing strategies to decrease the views table after a period of time. As you already suggest, after a period of time we could remove the views and store the total number for the desired resolutions: per day, per month etc. This would of course decrease the precision of the views counts, but that's the accepted trade off.

This package does not provide a solution out of the box, but you could defintely implement a pruning strategy yourself. The complexity will probably lay in making sure that the views count still make sense after combining the archived views count and the actual count.

What ideas do you have on your mind? You know your application context better than I do. If you are not using the views count query options extensively, then the solution to your problem could be easier than what I'm thinking 😄

Rattone commented 3 months ago

@cyrildewit I did some checks today, many of the accesses are bots, about half I think. In the last period, the number of records tracked compared to those marked on Google Analytics has increased. I haven't investigated these bots in depth but there are days when in a few minutes I receive hundreds of visits on different posts from the same user, with even 10 posts opened in the same second.

As for the prune strategy, I solved with a UPDATE SELECT query to update an "archived_views" field with COUNT(DISTINCT(visitor)) and then a DELETE query. I only keep the details of the last 6 months, so I might need some statistics. For total_unique_views I have a scheduled task that adds archived_views and the number of unique visits from the views table. This way the table remains under 1GB :-D

Rattone commented 3 months ago

an example image