Closed simonw closed 1 year ago
Documentation: https://platform.openai.com/docs/plugins/introduction
Mainly needs to provide a /.well-known/ai-plugin.json
that points to a OpenAPI specification.
I got ChatGPT to build me this OpenAPI specification:
openapi: 3.0.0
info:
title: Datasette API
version: 1.0.0
paths:
/fixtures.json:
get:
summary: Retrieve data from the facetable
description: This endpoint allows users to execute SQL queries on the facetable and retrieve the results as an array of JSON objects.
parameters:
- name: sql
in: query
description: The SQL query to be executed on the facetable.
required: true
schema:
type: string
- name: _shape
in: query
description: The shape of the response data. Set to "array" to receive an array of JSON objects.
required: true
schema:
type: string
enum:
- array
responses:
'200':
description: Successful operation
content:
application/json:
schema:
type: array
items:
type: object
'400':
description: Bad request
'500':
description: Internal server error
From this prompt:
Write an OpenAPI schema explaining the https://latest.datasette.io/fixtures.json?sql=select+*+from+facetable&_shape=array GET API which accepts SQL and returns an array of JSON objects
Initial prototype (in plugins/chatgpt.py
on Glitch):
from datasette import hookimpl, Response
SCHEMA = """
openapi: 3.0.0
info:
title: Datasette API
version: 1.0.0
paths:
/fixtures.json:
get:
summary: Execute SQL against the fixtures database
description: This endpoint allows users to execute SQLite SQL queries and retrieve the results as an array of JSON objects.
parameters:
- name: sql
in: query
description: The SQL query to be executed
required: true
schema:
type: string
- name: _shape
in: query
description: The shape of the response data. Must be "array"
required: true
schema:
type: string
enum:
- array
responses:
'200':
description: Successful SQL results
content:
application/json:
schema:
type: array
items:
type: object
'400':
description: Bad request
'500':
description: Internal server error
"""
AI_PLUGIN = {
"schema_version": "v1",
"name_for_model": "sql-fixtures",
"name_for_human": "SQL against fixtures",
"description_for_model": "Run SQL queries against the fixtures database.",
"description_for_human": "Run SQL against Datasette.",
"api": {
"type": "openapi",
"url": "https://cedar-antique-cosmos.glitch.me/openapi-schema.yml",
"has_user_authentication": False
},
"contact_email": "hello@contact.com",
"legal_info_url": "hello@legal.com"
}
async def ai_plugin(request):
return Response.json(AI_PLUGIN)
async def openapi_schema(request):
return Response.text(SCHEMA)
@hookimpl
def register_routes():
return [
(r"^/\.well-known/ai-plugin\.json$", ai_plugin),
(r"^/openapi-schema.yml$", openapi_schema),
]
Fixed it:
AI_PLUGIN = {
"schema_version": "v1",
"name_for_model": "sql_fixtures",
"name_for_human": "SQL against fixtures",
"description_for_model": "Run SQL queries against the fixtures database.",
"description_for_human": "Run SQL against Datasette.",
"auth": {
"type": "user_http",
"authorization_type": "bearer"
},
"api": {
"type": "openapi",
"url": "https://cedar-antique-cosmos.glitch.me/openapi-schema.yml",
"has_user_authentication": False
},
"logo_url": "https://avatars.githubusercontent.com/u/126964132?s=400&u=08b2ed680144a4feb421308f09e5f3cc5876211a&v=4",
"contact_email": "hello@contact.com",
"legal_info_url": "hello@legal.com"
}
I'd like to tell it it doesn't need an auth token.
I can't figure out why I'm getting that UnrecognizedFunctionError
. The server logs don't show any hits to my API.
Found more documentation: https://platform.openai.com/docs/plugins/getting-started/plugin-manifest - visible on desktop but not I think on mobile.
I'm trying to include the schema, but I think I broke it:
from datasette import hookimpl, Response
import textwrap
PROMPT = """
Run SQLite queries against the content SQL database.
Use 'select * from sqlite_master' to see the list of tables.
Use 'select * from pragma_table_info("name_of_table")' to see the columns in a table
NEVER use 'PRAGMA table_info' or any other PRAGMA - always use 'pragma_table_info()' and suchlike instead
"""
def make_openapi_schema(prompt):
return """
openapi: 3.0.1
info:
title: Datasette fixtures plugin
description: A plugin that allows ChatGPT to run SQL queries against a Datasette SQLite database
version: 'v1'
servers:
- url: https://cedar-antique-cosmos.glitch.me
paths:
/content.json:
get:
operationId: query
summary: Execute a SQLite SQL query against the content database
description: |-
{}
parameters:
- name: sql
in: query
description: The SQL query to be executed
required: true
schema:
type: string
- name: _shape
in: query
description: The shape of the response data. Must be "array"
required: true
schema:
type: string
enum:
- array
responses:
'200':
description: Successful SQL results
content:
application/json:
schema:
type: array
items:
type: object
'400':
description: Bad request
'500':
description: Internal server error
""".format(textwrap.indent(prompt, ' '))
def ai_plugin_schema(extra_prompt):
return {
"schema_version": "v1",
"name_for_model": "datasette_content",
"name_for_human": "SQL against content",
"description_for_model": PROMPT,
"description_for_human": "Run SQL against content database in Datasette.",
"auth": {
"type": "none"
},
"api": {
"type": "openapi",
"url": "https://cedar-antique-cosmos.glitch.me/openapi-schema.yml",
"has_user_authentication": False
},
"logo_url": "https://avatars.githubusercontent.com/u/126964132?s=400&u=08b2ed680144a4feb421308f09e5f3cc5876211a&v=4",
"contact_email": "hello@contact.com",
"legal_info_url": "hello@legal.com"
}
async def ai_plugin():
return Response.json(ai_plugin_schema(""))
async def build_schema(datasette):
database = [value for key, value in datasette.databases.items() if not key.startswith("_")][0]
return (await database.execute("select group_concat(sql, ';') from sqlite_master")).single_value()
async def openapi_schema(datasette):
return Response.text(make_openapi_schema(await build_schema(datasette)))
@hookimpl
def register_routes():
return [
(r"^/\.well-known/ai-plugin\.json$", ai_plugin),
(r"^/openapi-schema.yml$", openapi_schema),
]
Tried uninstalling and reinstalling the plugin and got this:
Oh dear it hallucinated!
Despite that query returning the correct results:
My current working plugin is:
from datasette import hookimpl, Response
import textwrap
PROMPT = """
Run SQLite queries against the content SQL database.
Use 'select * from sqlite_master' to see the list of tables.
Use 'select * from pragma_table_info("name_of_table")' to see the columns in a table
NEVER use 'PRAGMA table_info' or any other PRAGMA - always use 'pragma_table_info()' and suchlike instead
"""
def make_openapi_schema(prompt):
return """
openapi: 3.0.1
info:
title: Datasette fixtures plugin
description: A plugin that allows ChatGPT to run SQL queries against a Datasette SQLite database
version: 'v1'
servers:
- url: https://cedar-antique-cosmos.glitch.me
paths:
/content.json:
get:
operationId: query
summary: Execute a SQLite SQL query against the content database
description: |-
{}
parameters:
- name: sql
in: query
description: The SQL query to be executed
required: true
schema:
type: string
- name: _shape
in: query
description: The shape of the response data. Must be "array"
required: true
schema:
type: string
enum:
- array
responses:
'200':
description: Successful SQL results
content:
application/json:
schema:
type: array
items:
type: object
'400':
description: Bad request
'500':
description: Internal server error
""".format(textwrap.indent(prompt, ' '))
def ai_plugin_schema(extra_prompt):
return {
"schema_version": "v1",
"name_for_model": "datasette_content",
"name_for_human": "SQL against content",
"description_for_model": PROMPT + ' ' + extra_prompt,
"description_for_human": "Run SQL against content database in Datasette.",
"auth": {
"type": "none"
},
"api": {
"type": "openapi",
"url": "https://cedar-antique-cosmos.glitch.me/openapi-schema.yml",
"has_user_authentication": False
},
"logo_url": "https://avatars.githubusercontent.com/u/126964132?s=400&u=08b2ed680144a4feb421308f09e5f3cc5876211a&v=4",
"contact_email": "hello@contact.com",
"legal_info_url": "hello@legal.com"
}
async def ai_plugin(datasette):
return Response.json(ai_plugin_schema(await available_tables(datasette)))
async def build_schema(datasette):
database = [value for key, value in datasette.databases.items() if not key.startswith("_")][0]
return (await database.execute("select group_concat(sql, ';') from sqlite_master")).single_value()
async def available_tables(datasette):
database = [value for key, value in datasette.databases.items() if not key.startswith("_")][0]
return (await database.execute("select group_concat(name, ', ') from sqlite_master where type = 'table'")).single_value()
async def openapi_schema():
return Response.text(make_openapi_schema("Use SQLite syntax"))
@hookimpl
def register_routes():
return [
(r"^/\.well-known/ai-plugin\.json$", ai_plugin),
(r"^/openapi-schema.yml$", openapi_schema),
]
The schema proved too large so I'm sending a list of tables instead.
OK, this prototype is working well enough that I'm going to release it as an alpha.
Schema discovery still isn't great - it keeps trying to run those PRAGMA commands. Maybe I should give it an explicit API for that, or perhaps enable PRAGMA table_info since it's so keen to use that.
https://platform.openai.com/docs/plugins/getting-started/plugin-manifest limits:
- 50 character max for
name_for_human
- 50 character max for
name_for_model
- 120 character max for
description_for_human
- 8000 character max just for
description_for_model
(will decrease over time)
It's not clear to me how important name_for_model
is. Since this Datasette plugin could be installed on many instances I don't think it should be the same for each one, and it looks like hyphens are not allowed but underscores are.
So maybe datasette_latest_datasette_io
would be a good name, using the server hostname with underscores?
What if that's longer than 50 characters?
>>> len("datasette_global_power_plants_datasettes_com")
44
>>> len("datasette_congress_legislators_datasettes.com")
45
So I'm already getting close with some of my longer named examples.
I'm tempted to go with datasette_as_much_as_possible_hexhash
- ending in a 6 character hash of the whole value. That should give me good uniqueness without busting the length limit.
Removing this code since I'm not calling it:
async def build_schema(datasette):
database = [
value for key, value in datasette.databases.items() if not key.startswith("_")
][0]
return (
await database.execute("select group_concat(sql, ';') from sqlite_master")
).single_value()
async def available_tables(datasette):
database = [
value for key, value in datasette.databases.items() if not key.startswith("_")
][0]
return (
await database.execute(
"select group_concat(name, ', ') from sqlite_master where type = 'table'"
)
).single_value()
I don't see OpenAI checking for updates to my ai-plugin.json
file - so I'm going to try a fresh deploy with a new URL, since I also can't see how to uninstall and reinstall a plugin to trigger that refresh.
Tried deploying a new Glitch instance and got this:
But I caught it doing a fetch()
to an API that failed to return my new plugin:
Figured that out:
You need to hit "Develop your own plugin" and not "Install an unverified plugin".
OK! The latest code deployed fresh does indeed seem to work - and doesn't try to run the PRAGMA table_info
things any more so it seems to be obeying the latest prompt.
Blogged it here: https://simonwillison.net/2023/Mar/24/datasette-chatgpt-plugin/
Goal is to have a plugin that turns any Datasette instance into a ChatGPT plugin - providing instructions for ChatGPT to execute SQL queries against it.