Problem statement
When the event cache is full, we should drop the newest (incoming) event and keep the events that are already cached. Currently, we drop the earliest event, which causes two problems:
1. It hurts performance during page load, when compute resources are extremely limited. For example, if there are 300 events against a default cache limit of 200 during the initial PutRumEvents batch, then RUM executes roughly 100x200=20,000 operations shifting out the earliest events, when those resources should instead go towards loading the page.
2. RUM users miss important telemetry, because earlier events tend to be more relevant than later ones. By keeping early events, we are more likely to capture events that contribute to page load performance, such as LCP resources.
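The difference between the two eviction policies can be sketched as follows. This is a minimal illustration, not the actual cache implementation: BoundedEventCache, RumEvent, record, ids, and approxMoves are hypothetical names, and the move counter is a simplified cost model (one unit of work per cached element on each eviction) that JavaScript engines may beat in practice.

```typescript
// Hypothetical sketch; names and cost model are assumptions, not the library's API.
interface RumEvent {
    id: number;
    type: string;
}

class BoundedEventCache {
    private events: RumEvent[] = [];
    private limit: number;
    private dropNewest: boolean;
    // Approximate work spent re-indexing the array on evictions.
    approxMoves = 0;

    constructor(limit: number, dropNewest: boolean) {
        this.limit = limit;
        this.dropNewest = dropNewest;
    }

    record(event: RumEvent): void {
        if (this.events.length < this.limit) {
            this.events.push(event);
            return;
        }
        if (this.dropNewest) {
            // Proposed behavior: the cache is full, so discard the incoming event.
            return;
        }
        // Previous behavior: evict the earliest event to make room.
        // shift() re-indexes the remaining elements, so model it as O(length).
        this.approxMoves += this.events.length;
        this.events.shift();
        this.events.push(event);
    }

    ids(): number[] {
        return this.events.map((e) => e.id);
    }
}

// Record 300 events against a limit of 200, as in the example above.
const oldCache = new BoundedEventCache(200, false);
const newCache = new BoundedEventCache(200, true);
for (let i = 0; i < 300; i++) {
    oldCache.record({ id: i, type: 'example' });
    newCache.record({ id: i, type: 'example' });
}

console.log(oldCache.ids()[0], oldCache.approxMoves);
console.log(newCache.ids()[0], newCache.approxMoves);
```

Under this model, dropping the earliest event keeps events 100 to 299 and pays for 100 evictions of about 200 elements each, while dropping the newest keeps events 0 to 199 and does no shifting at all.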
Follow up
To ensure a good experience in the future, we need additional design work to prioritize critical events in the cache, such as web vitals and session start.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.