fybrik / airbyte-module

A FybrikModule based on Airbyte
Apache License 2.0

airbyte-module

The airbyte-module (ABM) for Fybrik is a FybrikModule that uses Airbyte connectors.

ABM runs both an Apache Arrow Flight server and an HTTP server.

What is Airbyte?

Airbyte is a data integration tool that focuses on extracting and loading data.

Airbyte has a large catalog of connectors that support dozens of data sources and data destinations. These connectors run in Docker containers and are built in accordance with the Airbyte specification.
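As an illustration of what "built in accordance with the Airbyte specification" means in practice, here is a minimal Python sketch of how a source connector image can be driven from the command line. The four commands (spec, check, discover, read) and the JSON-lines message format come from the Airbyte protocol. The helper names, the `/secrets` mount point, and the file names are assumptions made for this example, and this is not necessarily how ABM itself invokes connectors.

```python
import json
from typing import Iterable, List, Optional

# Commands that an Airbyte source connector accepts, per the Airbyte protocol.
AIRBYTE_COMMANDS = ("spec", "check", "discover", "read")


def connector_argv(image: str, command: str,
                   host_dir: Optional[str] = None,
                   config: Optional[str] = None,
                   catalog: Optional[str] = None) -> List[str]:
    """Build a `docker run` argument list for an Airbyte connector image.

    `host_dir` is mounted at /secrets (an assumed mount point) so that the
    connector can read the config and catalog files stored in it.
    """
    if command not in AIRBYTE_COMMANDS:
        raise ValueError(f"unknown Airbyte command: {command}")
    argv = ["docker", "run", "--rm", "-i"]
    if host_dir is not None:
        argv += ["-v", f"{host_dir}:/secrets"]
    argv += [image, command]
    if config is not None:
        argv += ["--config", f"/secrets/{config}"]
    if catalog is not None:
        argv += ["--catalog", f"/secrets/{catalog}"]
    return argv


def records_from_output(lines: Iterable[str]) -> List[dict]:
    """Extract the data of RECORD messages from a connector's stdout lines."""
    records = []
    for line in lines:
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip any output line that is not valid JSON
        if isinstance(message, dict) and message.get("type") == "RECORD":
            records.append(message["record"]["data"])
    return records
```

The list returned by `connector_argv` can be passed to `subprocess.run(..., capture_output=True, text=True)`, and the captured stdout split into lines and fed to `records_from_output`.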

What is the Airbyte Module?

ABM is an Arrow Flight server that enables applications to consume tabular data from a wide range of data sources.

Since Airbyte connectors are implemented as Docker images and run as Docker containers, the Airbyte Module does not require Airbyte as a prerequisite. To run the Airbyte Module, only Docker is required.
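To give a rough idea of what consuming data from an Arrow Flight server looks like, here is a minimal client sketch using pyarrow. The request shape (`{"asset": ...}`), the asset name, and the endpoint address are assumptions made for illustration only. The sample folder documents the request format that ABM actually expects.

```python
import json


def build_read_command(asset_name: str) -> bytes:
    """Encode a read request as JSON bytes for a Flight command descriptor.

    The {"asset": ...} shape is an assumption for this sketch.
    """
    return json.dumps({"asset": asset_name}).encode("utf-8")


def read_asset(endpoint: str, asset_name: str):
    """Fetch all endpoints of a flight and return them as one pyarrow.Table."""
    # Imported lazily so that build_read_command works without pyarrow.
    import pyarrow as pa
    import pyarrow.flight as fl

    client = fl.connect(endpoint)
    descriptor = fl.FlightDescriptor.for_command(build_read_command(asset_name))
    info = client.get_flight_info(descriptor)
    # Each endpoint carries a ticket that is redeemed with do_get.
    tables = [client.do_get(ep.ticket).read_all() for ep in info.endpoints]
    return pa.concat_tables(tables)
```

With a server listening locally, a call might look like `read_asset("grpc://localhost:8080", "userdata")`, where both the address and the asset name are placeholders.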

How to run the Airbyte Module server locally

Follow the instructions in the sample folder.

How to deploy the Airbyte Module to Kubernetes using Helm

Follow the instructions in the helm folder.

How a Fybrik Application can access a dataset using an Airbyte FybrikModule

If you would like to run a use case where the application has unrestricted access to a dataset, follow the instructions here.

However, if you are interested in a use case where the governance policies mandate that some of the dataset columns must be redacted, follow the instructions here. In this scenario, both the Airbyte Module and the arrow-flight-module are deployed. The Airbyte Module reads the dataset, whereas the arrow-flight-module transforms the dataset based on the governance policies.
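To make the redaction scenario concrete, here is a small Python sketch of what a column-redaction transformation does to tabular rows. It is a conceptual illustration only: the function name, the mask value, and the row-of-dicts representation are assumptions, and the arrow-flight-module applies its transformations to Arrow data according to the policies it receives from Fybrik.

```python
from typing import Dict, Iterable, List


def redact_columns(rows: Iterable[Dict[str, object]],
                   columns: Iterable[str],
                   mask: str = "XXXXX") -> List[Dict[str, object]]:
    """Return copies of `rows` with the values of `columns` replaced by `mask`."""
    to_redact = set(columns)
    return [
        {name: (mask if name in to_redact else value)
         for name, value in row.items()}
        for row in rows
    ]


# Hypothetical rows, as they might look after the Airbyte Module reads a dataset.
rows = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@example.com"},
]
redacted = redact_columns(rows, ["email"])
```

The input rows are left unchanged, so the reading step and the transformation step stay independent, mirroring the split between the two modules.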