singer-io / getting-started

This repository is a getting started guide to Singer.
https://singer.io

Credentials Provider as part of the Spec #93

Open RenanBasilio opened 1 month ago

RenanBasilio commented 1 month ago

I've been looking into Singer for implementing some ETL pipelines, and one issue that I've encountered rather consistently is that there's no way to configure tap or target credentials programmatically.

Access credentials are often stored externally, where they can be secured and rotated. Because Singer reads them from local config files, any rotation or other credential change often means repackaging and redeploying the pipelines.

This also makes services that use OpenAuth or other authentication frameworks tricky to manage: re-authenticating often generates a new token, which then has to be re-bundled with every Singer pipeline.

To address this, I'd like to propose adding credential providers to the spec. A provider would be a callable interface that individual taps and targets can call to retrieve, in a common format, the credentials they need to establish a connection.

Individual providers could be installed into the environment alongside the tap or target. When running the tap or target, the user would pass the provider name as an additional parameter or in the config file. Rather than reading credentials from the config file, the tap or target would call a generic get_credentials method to retrieve them, and would expect a predetermined set of keys in the returned object.
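One way to picture the provider interface is as a plain callable that takes provider-specific arguments and returns a mapping with agreed-upon keys. The following is a minimal sketch under that assumption; `CredentialsProvider`, `env_provider`, and `check_keys` are hypothetical names used for illustration and are not part of Singer or singer-python.

```python
import os
from typing import Callable, Dict, Sequence

# Hypothetical type: a provider is any callable returning a dict of credentials.
CredentialsProvider = Callable[..., Dict[str, str]]


def env_provider(prefix: str) -> Dict[str, str]:
    """Illustrative provider: reads <PREFIX>_USERNAME and <PREFIX>_PASSWORD."""
    return {
        "username": os.environ[f"{prefix}_USERNAME"],
        "password": os.environ[f"{prefix}_PASSWORD"],
    }


def check_keys(
    creds: Dict[str, str],
    required: Sequence[str] = ("username", "password"),
) -> Dict[str, str]:
    """Verify that the provider returned every key the tap expects."""
    missing = [key for key in required if key not in creds]
    if missing:
        raise KeyError(f"provider did not return: {missing}")
    return creds
```

A tap could then treat any provider the same way, for example `check_keys(env_provider("MY_SERVICE"))`, without knowing where the values come from.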

Example Use Case

As an example use case, consider a username and password for a service, stored in AWS Secrets Manager under the secret ID my-service-secret. The config file could look like this:

{
  "credentials_provider": "aws_secrets_manager",
  "credentials_provider_args": {
    "secret_id": "my-service-secret"
  }
}

A tap for my-service that is initialized with that configuration would then begin like so:

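from singer import utils
from my_service import connection  # hypothetical client library for my-service

# Proposed helper: resolves the provider named in the config and calls it.
creds = utils.get_credentials()
conn = connection.connect(username=creds['username'], password=creds['password'])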

The get_credentials() function would then be responsible for loading the requested provider from the environment (raising an error if it does not exist), calling it with the parameters from the config, and returning its output to the calling application.
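The lookup flow described above might be sketched as follows. This is only an illustration, not part of singer-python: the `PROVIDERS` registry, the `register` decorator, and the stand-in `aws_secrets_manager` provider (which returns canned values instead of calling AWS) are all hypothetical. For simplicity, `get_credentials` here takes the parsed config as an argument, whereas the proposal calls it with no arguments.

```python
import json
from typing import Callable, Dict

# Hypothetical registry mapping provider names to callables.
PROVIDERS: Dict[str, Callable[..., Dict[str, str]]] = {}


def register(name: str):
    """Register a provider callable under the given name."""
    def decorator(func):
        PROVIDERS[name] = func
        return func
    return decorator


@register("aws_secrets_manager")
def fake_secrets_manager(secret_id: str) -> Dict[str, str]:
    # Stand-in only: a real provider would fetch the secret from AWS here.
    return {"username": f"user-for-{secret_id}", "password": "placeholder"}


def get_credentials(config: dict) -> Dict[str, str]:
    """Load the configured provider, call it with its args, return its output."""
    name = config["credentials_provider"]
    if name not in PROVIDERS:
        raise LookupError(f"credentials provider not installed: {name}")
    args = config.get("credentials_provider_args", {})
    return PROVIDERS[name](**args)


# The example config from above, parsed from JSON.
config = json.loads("""
{
  "credentials_provider": "aws_secrets_manager",
  "credentials_provider_args": {"secret_id": "my-service-secret"}
}
""")
creds = get_credentials(config)
```

An in-process dictionary keeps the sketch short. A real implementation would more likely discover providers through an installation-time mechanism such as package entry points, so that a provider installed alongside the tap is found without extra wiring.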