ApsaraDB / PolarDB-for-PostgreSQL

A cloud-native database based on PostgreSQL developed by Alibaba Cloud.
https://apsaradb.github.io/PolarDB-for-PostgreSQL/zh/
Apache License 2.0

[Question] Why is the VM buffer never read into memory on read-only nodes, not even once? #509

Closed by srinathv2 2 months ago

mrdrivingduck commented 2 months ago

@srinathv2 I guess you are asking why VM pages are not used during an IndexOnlyScan on a replica node. In the current version, the primary can flush the newest version of a VM page to disk. As a result, a replica node might read a "future" page, one containing changes it has not yet replayed, which could lead to incorrect IndexOnlyScan results if visibility were decided from that page. Therefore, IndexOnlyScan on replica nodes does not use VM bits and instead reads the heap table directly for visibility checks. In the latest version, however, we have removed this limitation, and under certain conditions replica nodes can also use VM pages to access data quickly.

srinathv2 commented 2 months ago

@mrdrivingduck AFAIK, VM pages on a replica are only used for index-only scans, because the vacuum process won't run there. If you are not using them, why do you need the buffertagequals filters in the redo APIs, given that VM pages are never loaded into memory anyway? Please correct me if I am wrong.

Ccxikka commented 2 months ago

@srinathv2 If a replica reads a VM page that is a "future" page, we do not use it, to ensure correctness. However, if the page_lsn of the VM page read by the replica is less than the replica's replay_lsn, the page is not a future page and the replica can use it normally. Moreover, a replica might be promoted to become the new primary, so the xlog of VM pages must still be replayed to keep the replica's VM data consistent with the primary's.
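
To make the rule above concrete, here is a minimal Python sketch of the decision as I understand it from this thread. It is not PolarDB code: `VMPage`, `is_future_page`, and `needs_heap_fetch` are hypothetical names invented for illustration, LSNs are modeled as plain integers, and the strict "less than" comparison is taken directly from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class VMPage:
    """Toy model of a visibility map page (hypothetical, for illustration)."""
    page_lsn: int                                   # LSN stamped on the page
    all_visible: set = field(default_factory=set)   # heap blocks marked all-visible


def is_future_page(page_lsn: int, replay_lsn: int) -> bool:
    # Per the thread, the page is usable only when page_lsn < replay_lsn,
    # so anything else is treated as a "future" page.
    return not (page_lsn < replay_lsn)


def needs_heap_fetch(vm_page: VMPage, heap_blk: int, replay_lsn: int) -> bool:
    """Return True if an index-only scan must visit the heap for this block."""
    if is_future_page(vm_page.page_lsn, replay_lsn):
        # The VM bits may reflect changes the replica has not replayed yet,
        # so ignore them and fall back to a heap visibility check.
        return True
    # The page is safe to trust: skip the heap only if the bit is set.
    return heap_blk not in vm_page.all_visible


vm = VMPage(page_lsn=100, all_visible={7})
print(needs_heap_fetch(vm, 7, replay_lsn=150))  # replayed past the page
print(needs_heap_fetch(vm, 7, replay_lsn=50))   # future page
```

Falling back to the heap is always safe because the heap tuple itself carries the visibility information; the VM bit is only a shortcut. This sketch covers the read path only. Replaying the xlog for VM pages is a separate concern that still matters for promotion, as noted above.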