neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0

Use `Timeline::get_vectored` for doing timestamp lookups in SLRUs #7755

Open VladLazar opened 3 months ago

VladLazar commented 3 months ago

Issue

Mapping timestamps to LSNs is slow for tenants with large SLRU block counts.

The slowness comes from repeatedly querying SLRUs without using vectored get. On a tenant with 190 SLRU blocks, each round of the binary search took roughly 4 seconds.

Timeline::map_all_timestamps should use Timeline::get_vectored instead of fanning out one Timeline::get per block. Make sure to batch the calls into get_vectored up to the key limit (32 or 64, I don't recall which).
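The batching part can be sketched in isolation. A minimal sketch, assuming a chunk limit of 32; `MAX_GET_VECTORED_KEYS` and `batch_keys` are illustrative stand-ins, not the real pageserver constant or API:

```rust
// Assumed limit on keys per get_vectored call; the real value lives in
// the pageserver and may be 32 or 64.
const MAX_GET_VECTORED_KEYS: usize = 32;

/// Split a flat list of block keys into chunks no larger than the
/// vectored-get limit, so each chunk maps to one get_vectored call.
fn batch_keys<K: Clone>(keys: &[K], max_batch: usize) -> Vec<Vec<K>> {
    keys.chunks(max_batch).map(|c| c.to_vec()).collect()
}

fn main() {
    // 190 SLRU blocks, as in the repro below.
    let keys: Vec<u32> = (0..190).collect();
    let batches = batch_keys(&keys, MAX_GET_VECTORED_KEYS);
    // 190 keys in chunks of 32: 5 full batches plus one partial of 30.
    assert_eq!(batches.len(), 6);
    assert_eq!(batches.last().unwrap().len(), 30);
}
```

With this shape, one round of the binary search issues 6 vectored reads instead of 190 individual ones.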

Steps to reproduce

Import tenant 751897498d4ed3c54a6a28fcbce888b8 on a throwaway EC2 instance (I used i3en.3xlarge) and run:

$ time curl 127.0.0.1:9898/v1/tenant/751897498d4ed3c54a6a28fcbce888b8/timeline/4e51c228ffc33537db5cb42c7c3ebbcb/get_lsn_by_timestamp?timestamp=2024-05-10T00:00:00Z

real    1m24.795s

Now apply this diff to see how much time is spent looking up SLRUs:

diff --git a/pageserver/src/pgdatadir_mapping.rs b/pageserver/src/pgdatadir_mapping.rs
index 1092d64d3..e8e0a28fb 100644
--- a/pageserver/src/pgdatadir_mapping.rs
+++ b/pageserver/src/pgdatadir_mapping.rs
@@ -33,6 +33,7 @@ use serde::{Deserialize, Serialize};
 use std::collections::{hash_map, HashMap, HashSet};
 use std::ops::ControlFlow;
 use std::ops::Range;
+use std::time::Instant;
 use strum::IntoEnumIterator;
 use tokio_util::sync::CancellationToken;
 use tracing::{debug, trace, warn};
@@ -422,6 +423,8 @@ impl Timeline {
             // cannot overflow, high and low are both smaller than u64::MAX / 2
             let mid = (high + low) / 2;

+            let timer = Instant::now();
+
             let cmp = self
                 .is_latest_commit_timestamp_ge_than(
                     search_timestamp,
@@ -432,6 +435,15 @@ impl Timeline {
                 )
                 .await?;

+            let elapsed = timer.elapsed();
+            tracing::info!(
+                "FIND_LSN: high={} low={} mid={} slru_search_duration={}ms",
+                high,
+                low,
+                mid,
+                elapsed.as_millis()
+            );
+
             if cmp {
                 high = mid;
             } else {

In my repro I saw:

...
2024-05-14T13:31:09.649748Z  INFO request{method=GET path=/v1/tenant/751897498d4ed3c54a6a28fcbce888b8/timeline/4e51c228ffc33537db5cb42c7c3ebbcb/get_lsn_by_timestamp request_id=7bb1d9c9-b743-4035-a7d8-e5e043f2890a}: FIND_LSN: high=261830080 low=261829569 mid=261829824 slru_search_duration=2871ms
2024-05-14T13:31:13.825731Z  INFO request{method=GET path=/v1/tenant/751897498d4ed3c54a6a28fcbce888b8/timeline/4e51c228ffc33537db5cb42c7c3ebbcb/get_lsn_by_timestamp request_id=7bb1d9c9-b743-4035-a7d8-e5e043f2890a}: FIND_LSN: high=261829824 low=261829569 mid=261829696 slru_search_duration=4175ms
2024-05-14T13:31:17.967303Z  INFO request{method=GET path=/v1/tenant/751897498d4ed3c54a6a28fcbce888b8/timeline/4e51c228ffc33537db5cb42c7c3ebbcb/get_lsn_by_timestamp request_id=7bb1d9c9-b743-4035-a7d8-e5e043f2890a}: FIND_LSN: high=261829824 low=261829697 mid=261829760 slru_search_duration=4141ms
...

Note: the run above used the non-vectored read path. With the get_impl="vectored" config, the same request takes about 13s.

Related:

jcsp commented 3 months ago

@VladLazar is this issue still relevant now that we're using vectored get by default everywhere, or is the LSN->timestamp lookup still doing its own thing?

VladLazar commented 3 months ago

> @VladLazar is this issue still relevant now that we're using vectored get by default everywhere, or is the LSN->timestamp lookup still doing its own thing?

Yes, it is. As per the ticket, using the vectored read path gave a roughly 10x improvement in the repro, but we are still fetching each key individually. The get_vectored interface should be used instead.