datasette / datasette-enrichments

Tools for running enrichments against data stored in Datasette
https://enrichments.datasette.io
Apache License 2.0

Document API key patterns #45

Closed: simonw closed this issue 5 months ago

simonw commented 5 months ago

Two patterns to document:

Plus the point that the API key shouldn't be stored in the database, in case it leaks by accident.

datasette-enrichments-gpt and datasette-enrichments-opencage both implement that pattern already.

New documentation should go here: https://enrichments.datasette.io/en/stable/developing.html

simonw commented 5 months ago

Here's how datasette-enrichments-gpt does that at the moment: https://github.com/datasette/datasette-enrichments-gpt/blob/1353a46f08f73e9b18a58115d74730fdab7427ad/datasette_enrichments_gpt/__init__.py#L32-L116

    async def get_config_form(self, datasette, db, table):
        class ConfigForm(Form):
            # Select box to pick model from gpt-3.5-turbo or gpt-4-turbo
            model = SelectField(
                "Model",
                choices=[
                    ("gpt-3.5-turbo", "gpt-3.5-turbo"),
                    ("gpt-4-turbo", "gpt-4-turbo"),
                    ("gpt-4-vision", "gpt-4-turbo vision"),
                ],
                default="gpt-3.5-turbo",
            )
            # ... more stuff like that

        def stash_api_key(form, field):
            if not (field.data or "").startswith("sk-"):
                raise ValidationError("API key must start with sk-")
            if not hasattr(datasette, "_enrichments_gpt_stashed_keys"):
                datasette._enrichments_gpt_stashed_keys = {}
            key = secrets.token_urlsafe(16)
            datasette._enrichments_gpt_stashed_keys[key] = field.data
            field.data = key

        class ConfigFormWithKey(ConfigForm):
            api_key = PasswordField(
                "API key",
                description="Your OpenAI API key",
                validators=[
                    DataRequired(message="API key is required."),
                    stash_api_key,
                ],
                render_kw={"autocomplete": "off"},
            )

        plugin_config = datasette.plugin_config("datasette-enrichments-gpt") or {}
        api_key = plugin_config.get("api_key")

        return ConfigForm if api_key else ConfigFormWithKey
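
As a rough illustration of what that validator does (a sketch, not code from the plugin), the snippet below runs the same logic against stand-in objects. The `SimpleNamespace` objects stand in for the Datasette instance and the WTForms field, and `ValueError` replaces `ValidationError` so the example only needs the standard library. All of those substitutions are assumptions made for the demo.

```python
import secrets
from types import SimpleNamespace

# Stand-ins for the Datasette instance and a WTForms field (assumptions for
# illustration; the real plugin receives these from Datasette and WTForms).
datasette = SimpleNamespace()
field = SimpleNamespace(data="sk-example-not-a-real-key")


def stash_api_key(form, field):
    # Same logic as the validator above, with ValueError standing in for
    # wtforms.validators.ValidationError to keep this sketch stdlib-only.
    if not (field.data or "").startswith("sk-"):
        raise ValueError("API key must start with sk-")
    if not hasattr(datasette, "_enrichments_gpt_stashed_keys"):
        datasette._enrichments_gpt_stashed_keys = {}
    key = secrets.token_urlsafe(16)
    datasette._enrichments_gpt_stashed_keys[key] = field.data
    field.data = key


original = field.data
stash_api_key(None, field)

# After validation the field holds a random reference token; the real key
# lives only in the in-memory dictionary attached to the datasette object.
token = field.data
stashed = datasette._enrichments_gpt_stashed_keys[token]
```

A config built from the validated form data would therefore hold the token rather than the key itself. Since the stash is just a dictionary in memory, it would not survive a server restart.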

Then it accesses that stashed key here:

https://github.com/datasette/datasette-enrichments-gpt/blob/1353a46f08f73e9b18a58115d74730fdab7427ad/datasette_enrichments_gpt/__init__.py#L229-L244

    def resolve_api_key(datasette, config):
        plugin_config = datasette.plugin_config("datasette-enrichments-gpt") or {}
        api_key = plugin_config.get("api_key")
        if api_key:
            return api_key
        # Look for it in config
        api_key_name = config.get("api_key")
        if not api_key_name:
            raise ApiKeyError("No API key reference found in config")
        # Look it up in the stash
        if not hasattr(datasette, "_enrichments_gpt_stashed_keys"):
            raise ApiKeyError("No API key stash found")
        stashed_keys = datasette._enrichments_gpt_stashed_keys
        if api_key_name not in stashed_keys:
            raise ApiKeyError("No API key found in stash for {}".format(api_key_name))
        return stashed_keys[api_key_name]

A utility function to make this pattern easier to implement would be a good idea.
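
One possible shape for such a utility, as a minimal sketch only: the names `stash_secret`, `resolve_secret`, `SecretNotFound` and `FakeDatasette` are hypothetical and do not come from datasette-enrichments. The only Datasette call assumed is `datasette.plugin_config(name)`, which the code above already uses.

```python
import secrets


class SecretNotFound(Exception):
    """Raised when no API key can be resolved (hypothetical name)."""


def stash_secret(datasette, stash_attr, value):
    """Keep value in memory on the datasette object and return a reference token."""
    stash = getattr(datasette, stash_attr, None)
    if stash is None:
        stash = {}
        setattr(datasette, stash_attr, stash)
    token = secrets.token_urlsafe(16)
    stash[token] = value
    return token


def resolve_secret(datasette, plugin_name, stash_attr, config, key="api_key"):
    """Prefer a key set in plugin configuration, else look up the stashed one."""
    plugin_config = datasette.plugin_config(plugin_name) or {}
    configured = plugin_config.get(key)
    if configured:
        return configured
    token = config.get(key)
    if not token:
        raise SecretNotFound("No API key reference found in config")
    stash = getattr(datasette, stash_attr, {})
    if token not in stash:
        raise SecretNotFound("No API key found in stash for {}".format(token))
    return stash[token]


class FakeDatasette:
    """Minimal stand-in exposing only plugin_config(), for demonstration."""

    def __init__(self, plugin_configs=None):
        self._plugin_configs = plugin_configs or {}

    def plugin_config(self, name):
        return self._plugin_configs.get(name)


# Case 1: no key in plugin configuration, so the user-supplied key is stashed
# and only the token travels in the enrichment config.
ds = FakeDatasette()
token = stash_secret(ds, "_demo_stashed_keys", "sk-user-supplied")
from_stash = resolve_secret(ds, "demo-plugin", "_demo_stashed_keys", {"api_key": token})

# Case 2: a key in plugin configuration takes precedence and no stash is needed.
ds_configured = FakeDatasette({"demo-plugin": {"api_key": "sk-from-config"}})
from_config = resolve_secret(ds_configured, "demo-plugin", "_demo_stashed_keys", {})
```

A plugin's form validator could then call `stash_secret` and assign the returned token to `field.data`, and its enrichment code could call `resolve_secret` with the stored config.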

simonw commented 5 months ago

This is already handled in: