citusdata / pg_shard

ATTENTION: pg_shard is superseded by Citus, its more powerful replacement
https://github.com/citusdata/citus

Cache SPI plans to restore performance #153

Closed: jasonmp85 closed 8 years ago

jasonmp85 commented 8 years ago

Version v1.2 introduced the use of SPI in the metadata system to better abstract differences between pg_shard and CitusDB representations of distribution metadata. We didn't initially use cached SPI plans because cursory tests showed limited wins given the added complexity.

Well, additional tests uncovered severe performance regressions when many writers are involved. This change caches each metadata function's SPI plan in a function-local static variable on first execution, using the cached plans on subsequent executions. After this change, we can restore performance to near v1.1 levels (SPI does inevitably add some overhead, no matter what).
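For reference, a minimal sketch of the pattern this change describes, using PostgreSQL's SPI API. The function name, query, and metadata table here are illustrative assumptions, not the actual pg_shard code: the plan is prepared once, moved out of SPI's per-call memory with SPI_keepplan, and stashed in a function-local static so later calls skip planning entirely.

```c
#include "postgres.h"
#include "executor/spi.h"
#include "catalog/pg_type.h"

/* hypothetical metadata lookup showing the cached-plan pattern */
static int64
LoadShardCount(Datum relationIdDatum)
{
    /* survives across calls: prepared on first execution, reused after */
    static SPIPlanPtr shardCountPlan = NULL;
    int64 shardCount = 0;

    if (SPI_connect() != SPI_OK_CONNECT)
        elog(ERROR, "could not connect to SPI");

    if (shardCountPlan == NULL)
    {
        Oid argTypes[1] = { OIDOID };
        SPIPlanPtr plan = SPI_prepare("SELECT count(*) FROM pgs_distribution_metadata.shard "
                                      "WHERE relation_id = $1", 1, argTypes);
        if (plan == NULL)
            elog(ERROR, "could not prepare shard count query");

        /* move the plan out of the SPI procedure context so it outlives this call */
        if (SPI_keepplan(plan) != 0)
            elog(ERROR, "could not save shard count plan");

        shardCountPlan = plan;
    }

    {
        Datum argValues[1] = { relationIdDatum };

        /* read_only = false so each execution sees the latest snapshot */
        if (SPI_execute_plan(shardCountPlan, argValues, NULL, false, 1) != SPI_OK_SELECT)
            elog(ERROR, "shard count query failed");

        if (SPI_processed > 0)
        {
            bool isNull = false;
            Datum countDatum = SPI_getbinval(SPI_tuptable->vals[0],
                                             SPI_tuptable->tupdesc, 1, &isNull);

            if (!isNull)
                shardCount = DatumGetInt64(countDatum);
        }
    }

    SPI_finish();

    return shardCount;
}
```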

jasonmp85 commented 8 years ago

Oh, and please don't hit the merge button here. Merging this means creating a release, and I have a local command that merges it to master and develop, then tags and signs the release. So I'll run that after getting some eyes on this.

marcocitus commented 8 years ago

Makes sense, although the benefit seems very marginal given that we only allow single-statement transactions at the moment. :shipit:

jasonmp85 commented 8 years ago

I'm going to merge this but want to be clear about my meaning:

If a concurrent transaction adds or removes shard metadata, we want to see it right away. With read_only set to true, all the methods would use the snapshot taken at the beginning of the transaction, possibly returning stale data. With read_only set to false, you might read the partition column and shard list, and then have a racing transaction mark a placement as unhealthy right before you read it. Under READ COMMITTED you'd see that unhealthiness right then, even though the commit occurred after you'd started your work (i.e. after you'd already read the partition column and shard metadata). In the other case, you'd still see the placement as healthy (as under REPEATABLE READ or SERIALIZABLE). The locking system was designed with this immediacy and concurrency in mind, so it's best for the data layer to expose metadata at this isolation level.
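To make the distinction concrete, here is a hedged sketch of how the read_only flag to SPI_execute_plan changes snapshot behavior. The placementPlan plan and the wrapper function are hypothetical, not pg_shard's actual code:

```c
#include "postgres.h"
#include "executor/spi.h"

/* assumed to be a previously prepared plan, saved with SPI_keepplan */
extern SPIPlanPtr placementPlan;

static void
CheckPlacementVisibility(Datum *argValues)
{
    /*
     * read_only = false: under READ COMMITTED, each execution takes a
     * fresh snapshot, so an "unhealthy" mark committed by a concurrent
     * transaction an instant ago is visible here immediately.
     */
    SPI_execute_plan(placementPlan, argValues, NULL, false, 0);

    /*
     * read_only = true: the execution reuses the snapshot already active
     * in the surrounding transaction, so that same concurrent commit
     * stays invisible, much as it would under REPEATABLE READ or
     * SERIALIZABLE.
     */
    SPI_execute_plan(placementPlan, argValues, NULL, true, 0);
}
```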

I guess there should be a doc about this stuff, huh? But I think we're on track to switch to caches and invalidation in the medium term anyhow. Ideally you'd eventually read from a cache and only refresh it upon invalidation (and not to the latest snapshot, but to what's actually committed in the table at that moment), which has nice performance benefits as well.