projectnessie / nessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics
https://projectnessie.org
Apache License 2.0

Project Nessie

Project Nessie is a Transactional Catalog for Data Lakes with Git-like semantics.

More information can be found at projectnessie.org.

Nessie supports Iceberg Tables/Views. Additionally, Nessie is focused on working with the widest range of tools possible, which can be seen in the feature matrix.

Using Nessie

You can quickly get started with Nessie by using our small, fast docker image.

IMPORTANT NOTE: Nessie has moved away from docker.io to GitHub's container registry ghcr.io, and also to quay.io. Recent releases are available only on ghcr.io and quay.io. Please update references to projectnessie/nessie in your code to either ghcr.io/projectnessie/nessie or quay.io/projectnessie/nessie.

docker pull ghcr.io/projectnessie/nessie
docker run -p 19120:19120 ghcr.io/projectnessie/nessie

To try the Nessie image with different configuration options, refer to the templates under the docker module.

A local Web UI will be available at this point.
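
Before moving on, it can help to confirm that the container is accepting connections. The following is a minimal sketch in Python, assuming the port mapping from the docker run command above (19120 on localhost). The helper name `is_port_open` is hypothetical, and the check only shows that something is listening on the port, not that it is Nessie.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 19120 is the port published by the docker run command above.
    print("Nessie reachable:", is_port_open("localhost", 19120))
```

If the probe reports False right after starting the container, the server may still be booting; retrying after a short pause is usually enough.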

Then install the Nessie CLI tool (to learn more about the CLI tool and how to use it, see the Nessie CLI documentation).

pip install pynessie

From there, you can use one of our technology integrations. To learn more about all supported integrations and tools, see projectnessie.org.

Have fun! We have a Google Group and a Slack channel we use for both developers and users. Check them out here.

Authentication

By default, Nessie servers run with authentication disabled and all requests are processed under the "anonymous" user identity.

Nessie supports bearer tokens and uses OpenID Connect for validating them.

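Authentication can be enabled by setting the following Quarkus properties:

nessie.server.authentication.enabled=true
quarkus.oidc.auth-server-url=<OpenID Server URL>
quarkus.oidc.client-id=<Client ID>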

Experimenting with Nessie Authentication in Docker

You can start the ghcr.io/projectnessie/nessie Docker image in authenticated mode by setting the properties mentioned above via Docker environment variables. For example:

docker run -p 19120:19120 \
  -e QUARKUS_OIDC_CLIENT_ID=<Client ID> \
  -e QUARKUS_OIDC_AUTH_SERVER_URL=<OpenID Server URL> \
  -e NESSIE_SERVER_AUTHENTICATION_ENABLED=true \
  --network host \
  ghcr.io/projectnessie/nessie
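
Once the server runs in authenticated mode, clients have to send a bearer token with each request. The following is a minimal Python sketch of how such a request could be built with the standard library. The helper name `build_nessie_request` is hypothetical, and the `/api/v2/config` path is an assumption about the server's REST layout; adjust it to your deployment. The token itself must come from the OpenID provider configured above.

```python
import urllib.request


def build_nessie_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the server configuration with a bearer token."""
    # Assumption: the server exposes its REST API under /api/v2 and a
    # `config` resource there; adjust the path to your deployment.
    url = base_url.rstrip("/") + "/api/v2/config"
    request = urllib.request.Request(url, method="GET")
    # Bearer tokens are sent in the standard Authorization header.
    request.add_header("Authorization", f"Bearer {token}")
    return request


if __name__ == "__main__":
    req = build_nessie_request("http://localhost:19120", "<access token>")
    # urllib.request.urlopen(req) would send it; that needs a running server
    # and a token issued by the configured OpenID provider.
    print(req.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the headers without a running server.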

Building and Developing Nessie

Requirements

Installation

Clone this repository:

git clone https://github.com/projectnessie/nessie
cd nessie

Then open the project in IntelliJ or Eclipse, or use the IDE itself to clone this GitHub repository.

Refer to CONTRIBUTING for build instructions.

Compatibility

Nessie's Iceberg integration is compatible with the versions listed in the following table:

Nessie version  Iceberg version  Spark version (Scala 2.12+2.13)  Hive version  Flink version           Presto version                       Trino version
0.100.2         1.5.0            3.3.x, 3.4.x, 3.5.x              n/a           1.16.x, 1.17.x, 1.18.x  0.277, 0.278.x, 0.279, 0.280, 0.281  419

Distribution

To run a locally built server:

  1. Adjust the configuration in servers/quarkus-server/src/main/resources/application.properties
  2. Execute ./gradlew :nessie-quarkus:assemble && java -jar servers/quarkus-server/build/quarkus-app/quarkus-run.jar
  3. Open http://localhost:19120

UI

Nessie UI sources have moved to their own repository: https://github.com/projectnessie/nessie-ui.

Docker image

Official Nessie images are built as multi-platform images. To quickly build a Docker image for testing purposes, run the following commands:

./gradlew :nessie-quarkus:clean :nessie-quarkus:quarkusBuild
docker build -f ./tools/dockerbuild/docker/Dockerfile-server -t nessie-unstable:latest ./servers/quarkus-server 

Check that your image is available locally:

docker images

You should see something like this:

REPOSITORY       TAG     IMAGE ID       CREATED          SIZE
nessie-unstable  latest  24bb4c7bd696   15 seconds ago   555MB

Once this is done, you can run your image with docker run -p 19120:19120 nessie-unstable:latest, passing the relevant environment variables, if any. Environment variable names must follow MicroProfile Config's mapping rules.
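
The mapping rule is mechanical: every character in the property name that is neither alphanumeric nor an underscore becomes an underscore, and the result is upper-cased. The following minimal Python sketch illustrates this. The helper name `to_env_var_name` is hypothetical, and the property names in the demo are inferred from the environment variables used in the authentication example earlier in this document.

```python
import re


def to_env_var_name(property_name: str) -> str:
    """Map a config property name to its upper-case environment variable form."""
    # Replace every character that is not alphanumeric or "_" with "_",
    # then convert the result to upper case.
    return re.sub(r"[^A-Za-z0-9_]", "_", property_name).upper()


if __name__ == "__main__":
    for name in (
        "nessie.server.authentication.enabled",
        "quarkus.oidc.auth-server-url",
        "quarkus.oidc.client-id",
    ):
        print(f"{name} -> {to_env_var_name(name)}")
```

Note that both dots and dashes collapse to underscores, so the mapping cannot be reversed unambiguously.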

Nessie related repositories

Contributing

Code Style

The Nessie project uses the Google Java Code Style, scalafmt and pep8. See CONTRIBUTING.md for more information.

Acknowledgements

See ACKNOWLEDGEMENTS.md