simonw / airtable-export

Export Airtable data to YAML, JSON or SQLite files on disk
https://datasette.io/tools/airtable-export
Apache License 2.0
109 stars 15 forks source link

Make stuff easier to view in Datasette #26

Open simonw opened 10 months ago

simonw commented 10 months ago

The exported SQLite database opened in Datasette looks like this:

CleanShot 2023-09-17 at 08 25 28@2x

Originally I was going to try to turn these into many-to-many table records, but Datasette isn't actually great at displaying those yet (and Airtable really does favour many-to-many over single-to-many).

But I think there are some tricks I can pull off with the render_cell plugin hook which can get most of what I want here with a minimal amount of work.

I'm going to play around with this and see if I can get it to work.

simonw commented 10 months ago

Useful SQL query:

with tables_with_airtable_id as (
select
  sqlite_master.name
from
  pragma_table_xinfo(sqlite_master.name) cols,
  sqlite_master
where
  sqlite_master.type = 'table'
  and cols.name = 'airtable_id'
)
select tables_with_airtable_id.name, json_group_array(cols.name)
from tables_with_airtable_id, pragma_table_xinfo(tables_with_airtable_id.name) cols
group by tables_with_airtable_id.name
name json_group_array(cols.name)
Country ["airtable_id","Name","Organisations","Flag","Other names","airtable_createdTime","People","Visited","Interactions","Tags","Opportunities","Events"]
simonw commented 10 months ago

Tried this, which creates a table which knows about all the Airtable tables across the DBs:

from datasette import hookimpl
import json

AIRTABLE_REFS_CREATE_SQL = """
create table airtable_refs (
    database text,
    table_name text,
    columns text,
    primary key (database, table_name)
)
"""

AIRTABLE_TABLES_SQL = """
with tables_with_airtable_id as (
  select
    sqlite_master.name
  from
    pragma_table_xinfo(sqlite_master.name) cols,
    sqlite_master
  where
    sqlite_master.type = 'table'
    and cols.name = 'airtable_id'
)
select
  tables_with_airtable_id.name,
  json_group_array(cols.name) as columns
from
  tables_with_airtable_id,
  pragma_table_xinfo(tables_with_airtable_id.name) cols
group by
  tables_with_airtable_id.name
"""

@hookimpl
def startup(datasette):
    # On startup, create an in-memory DB for resolving Airtable record IDs
    airtable_refs_db = datasette.add_memory_database("airtable_refs")
    async def inner():
        await airtable_refs_db.execute_write(AIRTABLE_REFS_CREATE_SQL)
        # For each database in Datasette, fetch details of all tables that
        # have an airtable_id column
        for db_name, db in datasette.databases.items():
            if db_name == "airtable_refs":
                continue
            tables = await db.execute(AIRTABLE_TABLES_SQL)
            for table_name, columns in tables.rows:
                await airtable_refs_db.execute_write(
                    "insert or replace into airtable_refs values (:db, :table, :columns)",
                    {
                        "db": db_name,
                        "table": table_name,
                        "columns": columns,
                    }
                )

    return inner

But that's not what I actually want. I want a table like this:

airtable_id database table label
rec2345 pottery Glazes Red mist

For the label I'll pull from the Name column first, then the Notes column, otherwise I guess I'll set it to the airtable_id for lack of a better option.