neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0
13.68k stars 385 forks source link

pageserver: avoidable Arc::<PersistentLayerDesc>::clone #8306

Open koivunej opened 2 weeks ago

koivunej commented 2 weeks ago

Within the LayerMap, we store Arc<PersistentLayerDesc>. In all places, we iterate the existing layers by calling: https://github.com/neondatabase/neon/blob/27fe7f8963e5227d24cdd56aab419fa973dba369/pageserver/src/tenant/layer_map/historic_layer_coverage.rs#L501-L510

This results in incrementing the reference counters needlessly, because we rarely hold on to a collection of Arc<PersistentLayerDesc> instead the descriptor is used to call LayerManager::get_from_desc, which would be just as fine to do with &Arc<PersistentLayerDesc>.

Since rdps stores all entries in a strong-count-only triomphe::Arc, it might be worthwhile to see if the Arc<PersistentLayerDesc> is needed. Perhaps this could be verified using a counting allocator with existing LayerMap benchmarks and clearing the layer map in the test by removing the lowest Lsn values the first (worst case for rebuilds).

koivunej commented 2 weeks ago

Since rdps stores all entries in a strong-count-only triomphe::Arc, it might be worthwhile to see if the Arc is needed.

Additionally noted that often times it is completely unnecessary to iterate over the LayerMap just to do a HashMap lookup on "LayerFileManager". The reason why we have the two is long gone because of struct Layer and duplicate L1s, we should look into removing the hashmap.

It might not be possible to remove LayerMap requirement of storing something clonable for the rebuilding requirement. As such, using struct Layer directly in LayerMap could side-step the issue of Arc<PersistentLayerDesc>.