neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0
14.81k stars 431 forks source link

GiST sorted build does not work correctly (PG14) #3169

Open MMeent opened 1 year ago

MMeent commented 1 year ago

Steps to reproduce

Build a GiST index in PG14 using the sorted index build method

Expected result

Index is immediately useable

Actual result

The Index may fail

Environment

PG14 (maybe: PG15 too)

Logs, links

There is no SetLastWrittenLSNForBlock()-call for the non-metapage pages in the sorted build path:

https://github.com/neondatabase/postgres/blob/4c869e4dda320758f52e28914e8361a09a80ba4a/src/backend/access/gist/gistbuild.c#L260-L283

https://github.com/neondatabase/postgres/blob/4c869e4dda320758f52e28914e8361a09a80ba4a/src/backend/access/gist/gistbuild.c#L406-L496

Pages are logged here:

https://github.com/neondatabase/postgres/blob/4c869e4dda320758f52e28914e8361a09a80ba4a/src/backend/access/gist/gistbuild.c#L592-L623

knizhnik commented 1 year ago

Can you elaborate please a little bit more why do you think that last written lsn is not updated for such pages: The pages are written using smgrextend: https://github.com/neondatabase/postgres/blob/4c869e4dda320758f52e28914e8361a09a80ba4a/src/backend/access/gist/gistbuild.c#L592-L623 and neon_extend is calling neon_wallog_page which in turn calls SetLastWrittenLSNForBlock

MMeent commented 1 year ago

Yes it updates the LwLsnCache, but it sets the LwLSN to the value of the page's Lsn, which is GistBuildLSN (= 1) and not a semantically correct value, which then will result in bogus getpage@lsn-requests (Lsn=1 implies "at any point in time past the first WAL byte", which is obviously false)

knizhnik commented 1 year ago

I see. How do you plan to fix it? Looks like the simplest way is to perform wal logging before relation extension and use current WAL LSN as page LSN, i.e:

    if (RelationNeedsWAL(state->indexrel))
        log_newpages(&state->indexrel->rd_node, MAIN_FORKNUM, state->ready_num_pages,
                     state->ready_blknos, state->ready_pages, true);
        for (int i = 0; i < state->ready_num_pages; i++)
    {
        Page        page = state->ready_pages[i];
        BlockNumber blkno = state->ready_blknos[i];

        /* Currently, the blocks must be buffered in order. */
        if (blkno != state->pages_written)
            elog(ERROR, "unexpected block number to flush GiST sorting build");

        PageSetLSN(page, GetInsertRecPtr());
        PageSetChecksumInplace(page, blkno);
        smgrextend(state->indexrel->rd_smgr, MAIN_FORKNUM, blkno, page, true);

        state->pages_written++;
    }