neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0
13.18k stars 367 forks source link
database postgres postgresql rust serverless

Neon

Neon

Neon is a serverless open-source alternative to AWS Aurora Postgres. It separates storage and compute and substitutes the PostgreSQL storage layer by redistributing data across a cluster of nodes.

Quick start

Try the Neon Free Tier to create a serverless Postgres instance. Then connect to it with your preferred Postgres client (psql, dbeaver, etc) or use the online SQL Editor. See Connect from any application for connection instructions.

Alternatively, compile and run the project locally.

Architecture overview

A Neon installation consists of compute nodes and the Neon storage engine. Compute nodes are stateless PostgreSQL nodes backed by the Neon storage engine.

The Neon storage engine consists of two major components:

See developer documentation in SUMMARY.md for more information.

Running local installation

Installing dependencies on Linux

  1. Install build dependencies and other applicable packages

Building Neon requires 3.15+ version of protoc (protobuf-compiler). If your distribution provides an older version, you can install a newer version from here.

  1. Install Rust
    # recommended approach from https://www.rust-lang.org/tools/install
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Installing dependencies on macOS (12.3.1)

  1. Install XCode and dependencies
    
    xcode-select --install
    brew install protobuf openssl flex bison icu4c pkg-config

add openssl to PATH, required for ed25519 keys generation in neon_local

echo 'export PATH="$(brew --prefix openssl)/bin:$PATH"' >> ~/.zshrc


2. [Install Rust](https://www.rust-lang.org/tools/install)

recommended approach from https://www.rust-lang.org/tools/install

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh


3. Install PostgreSQL Client

from https://stackoverflow.com/questions/44654216/correct-way-to-install-psql-without-full-postgres-on-macos

brew install libpq brew link --force libpq


#### Rustc version

The project uses [rust toolchain file](./rust-toolchain.toml) to define the version it's built with in CI for testing and local builds.

This file is automatically picked up by [`rustup`](https://rust-lang.github.io/rustup/overrides.html#the-toolchain-file) that installs (if absent) and uses the toolchain version pinned in the file.

rustup users who want to build with another toolchain can use the [`rustup override`](https://rust-lang.github.io/rustup/overrides.html#directory-overrides) command to set a specific toolchain for the project's directory.

non-rustup users most probably are not getting the same toolchain automatically from the file, so are responsible to manually verify that their toolchain matches the version in the file.
Newer rustc versions most probably will work fine, yet older ones might not be supported due to some new features used by the project or the crates.

#### Building on Linux

1. Build neon and patched postgres

Note: The path to the neon sources can not contain a space.

git clone --recursive https://github.com/neondatabase/neon.git cd neon

The preferred and default is to make a debug build. This will create a

demonstrably slower build than a release build. For a release build,

use "BUILD_TYPE=release make -jnproc -s"

Remove -s for the verbose build log

make -jnproc -s


#### Building on OSX

1. Build neon and patched postgres

Note: The path to the neon sources can not contain a space.

git clone --recursive https://github.com/neondatabase/neon.git cd neon

The preferred and default is to make a debug build. This will create a

demonstrably slower build than a release build. For a release build,

use "BUILD_TYPE=release make -jsysctl -n hw.logicalcpu -s"

Remove -s for the verbose build log

make -jsysctl -n hw.logicalcpu -s


#### Dependency installation notes
To run the `psql` client, install the `postgresql-client` package or modify `PATH` and `LD_LIBRARY_PATH` to include `pg_install/bin` and `pg_install/lib`, respectively.

To run the integration tests or Python scripts (not required to use the code), install
Python (3.9 or higher), and install the python3 packages using `./scripts/pysync` (requires [poetry>=1.3](https://python-poetry.org/)) in the project directory.

#### Running neon database
1. Start pageserver and postgres on top of it (should be called from repo root):
```sh
# Create repository in .neon with proper paths to binaries and data
# Later that would be responsibility of a package install script
> cargo neon init
Initializing pageserver node 1 at '127.0.0.1:64000' in ".neon"

# start pageserver, safekeeper, and broker for their intercommunication
> cargo neon start
Starting neon broker at 127.0.0.1:50051.
storage_broker started, pid: 2918372
Starting pageserver node 1 at '127.0.0.1:64000' in ".neon".
pageserver started, pid: 2918386
Starting safekeeper at '127.0.0.1:5454' in '.neon/safekeepers/sk1'.
safekeeper 1 started, pid: 2918437

# create initial tenant and use it as a default for every future neon_local invocation
> cargo neon tenant create --set-default
tenant 9ef87a5bf0d92544f6fafeeb3239695c successfully created on the pageserver
Created an initial timeline 'de200bd42b49cc1814412c7e592dd6e9' at Lsn 0/16B5A50 for tenant: 9ef87a5bf0d92544f6fafeeb3239695c
Setting tenant 9ef87a5bf0d92544f6fafeeb3239695c as a default one

# create postgres compute node
> cargo neon endpoint create main

# start postgres compute node
> cargo neon endpoint start main
Starting new endpoint main (PostgreSQL v14) on timeline de200bd42b49cc1814412c7e592dd6e9 ...
Starting postgres at 'postgresql://cloud_admin@127.0.0.1:55432/postgres'

# check list of running postgres instances
> cargo neon endpoint list
 ENDPOINT  ADDRESS          TIMELINE                          BRANCH NAME  LSN        STATUS
 main      127.0.0.1:55432  de200bd42b49cc1814412c7e592dd6e9  main         0/16B5BA8  running
  1. Now, it is possible to connect to postgres and run some queries:

    > psql -p 55432 -h 127.0.0.1 -U cloud_admin postgres
    postgres=# CREATE TABLE t(key int primary key, value text);
    CREATE TABLE
    postgres=# insert into t values(1,1);
    INSERT 0 1
    postgres=# select * from t;
    key | value
    -----+-------
    1 | 1
    (1 row)
  2. And create branches and run postgres on them:

    
    # create branch named migration_check
    > cargo neon timeline branch --branch-name migration_check
    Created timeline 'b3b863fa45fa9e57e615f9f2d944e601' at Lsn 0/16F9A00 for tenant: 9ef87a5bf0d92544f6fafeeb3239695c. Ancestor timeline: 'main'

check branches tree

cargo neon timeline list (L) main [de200bd42b49cc1814412c7e592dd6e9] (L) ┗━ @0/16F9A00: migration_check [b3b863fa45fa9e57e615f9f2d944e601]

create postgres on that branch

cargo neon endpoint create migration_check --branch-name migration_check

start postgres on that branch

cargo neon endpoint start migration_check Starting new endpoint migration_check (PostgreSQL v14) on timeline b3b863fa45fa9e57e615f9f2d944e601 ... Starting postgres at 'postgresql://cloud_admin@127.0.0.1:55434/postgres'

check the new list of running postgres instances

cargo neon endpoint list ENDPOINT ADDRESS TIMELINE BRANCH NAME LSN STATUS main 127.0.0.1:55432 de200bd42b49cc1814412c7e592dd6e9 main 0/16F9A38 running migration_check 127.0.0.1:55434 b3b863fa45fa9e57e615f9f2d944e601 migration_check 0/16F9A70 running

this new postgres instance will have all the data from 'main' postgres,

but all modifications would not affect data in original postgres

psql -p 55434 -h 127.0.0.1 -U cloud_admin postgres postgres=# select * from t; key | value -----+------- 1 | 1 (1 row)

postgres=# insert into t values(2,2); INSERT 0 1

check that the new change doesn't affect the 'main' postgres

psql -p 55432 -h 127.0.0.1 -U cloud_admin postgres postgres=# select * from t; key | value -----+------- 1 | 1 (1 row)

  1. If you want to run tests afterwards (see below), you must stop all the running pageserver, safekeeper, and postgres instances you have just started. You can terminate them all with one command:
    > cargo neon stop

More advanced usages can be found at Control Plane and Neon Local.

Handling build failures

If you encounter errors during setting up the initial tenant, it's best to stop everything (cargo neon stop) and remove the .neon directory. Then fix the problems, and start the setup again.

Running tests

Rust unit tests

We are using cargo-nextest to run the tests in Github Workflows. Some crates do not support running plain cargo test anymore, prefer cargo nextest run instead. You can install cargo-nextest with cargo install cargo-nextest.

Integration tests

Ensure your dependencies are installed as described here.

git clone --recursive https://github.com/neondatabase/neon.git

CARGO_BUILD_FLAGS="--features=testing" make

./scripts/pytest

By default, this runs both debug and release modes, and all supported postgres versions. When testing locally, it is convenient to run just one set of permutations, like this:

DEFAULT_PG_VERSION=15 BUILD_TYPE=release ./scripts/pytest

Flamegraphs

You may find yourself in need of flamegraphs for software in this repository. You can use flamegraph-rs or the original flamegraph.pl. Your choice!

[!IMPORTANT] If you're using lld or mold, you need the --no-rosegment linker argument. It's a general thing with Rust / lld / mold, not specific to this repository. See this PR for further instructions.

Cleanup

For cleaning up the source tree from build artifacts, run make clean in the source directory.

For removing every artifact from build and configure steps, run make distclean, and also consider removing the cargo binaries in the target directory, as well as the database in the .neon directory. Note that removing the .neon directory will remove your database, with all data in it. You have been warned!

Documentation

docs Contains a top-level overview of all available markdown documentation.

To view your rustdoc documentation in a browser, try running cargo doc --no-deps --open

See also README files in some source directories, and rustdoc style documentation comments.

Other resources:

Postgres-specific terms

Due to Neon's very close relation with PostgreSQL internals, numerous specific terms are used. The same applies to certain spelling: i.e. we use MB to denote 1024 * 1024 bytes, while MiB would be technically more correct, it's inconsistent with what PostgreSQL code and its documentation use.

To get more familiar with this aspect, refer to:

Join the development