simonw / datasette-socrata

Import data from Socrata into Datasette
Apache License 2.0

Don't hard-code the database / support multiple databases #5

Closed simonw closed 2 years ago

simonw commented 2 years ago

Fix this TODO: https://github.com/simonw/datasette-socrata/blob/e203c181371563e15256cb8b7d7b9df2a09e9216/datasette_socrata/__init__.py#L124

I'm going to have a "datasette-socrata": {"database": "..."} plugin configuration option for this. If it's not set then the first mutable database that's not _internal will be used.

simonw commented 2 years ago

Actually, if you don't specify a database then the plugin will let you choose which database you want interactively.
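
A rough sketch of that selection logic, written against plain Python objects rather than Datasette itself. The `Database` dataclass, `candidate_databases()` and `resolve_database()` are hypothetical names for illustration; only the `"database"` config key and the exclusion of `_internal` and immutable databases come from the comments above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Database:
    # Hypothetical stand-in for a Datasette database object
    name: str
    is_mutable: bool


def candidate_databases(databases):
    """Mutable databases, excluding the _internal database."""
    return [db for db in databases if db.is_mutable and db.name != "_internal"]


def resolve_database(databases, config: Optional[dict]):
    """Return (chosen, choices).

    If the plugin config names a database, that one is chosen outright.
    Otherwise nothing is chosen and the caller offers `choices` to the user.
    """
    candidates = candidate_databases(databases)
    configured = (config or {}).get("database")
    if configured is not None:
        for db in candidates:
            if db.name == configured:
                return db, [db]
        raise ValueError(f"Configured database {configured!r} is not available")
    return None, candidates
```

With no configuration, `resolve_database()` returns `None` plus the list of candidates, which is what an interactive chooser would render.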

simonw commented 2 years ago

If you import data into more than one database, you'll get multiple socrata_imports tables, one per database. I think that's OK.

Problem: that table is created on startup at the moment. I'll change the code to create it on-demand when you run an import if it doesn't exist yet.
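
A minimal sketch of the on-demand approach using the standard library `sqlite3` module. The column list is an assumption, since the real `socrata_imports` schema is not shown in this thread, and `ensure_imports_table()` and `record_import()` are hypothetical helper names.

```python
import sqlite3

# Hypothetical schema: the real socrata_imports columns are not shown here
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS socrata_imports (
    id TEXT PRIMARY KEY,
    name TEXT,
    import_started TEXT
)
"""


def ensure_imports_table(conn: sqlite3.Connection) -> None:
    """Create socrata_imports if it is missing; safe to call repeatedly."""
    conn.execute(CREATE_SQL)


def record_import(conn: sqlite3.Connection, dataset_id: str, name: str) -> None:
    # The table is created here, at import time, rather than at startup
    ensure_imports_table(conn)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO socrata_imports (id, name, import_started) "
            "VALUES (?, ?, datetime('now'))",
            (dataset_id, name),
        )
```

Because the statement uses `IF NOT EXISTS`, each database only gets the table once an import actually targets it.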

simonw commented 2 years ago

From testing in #1 it looks like this plugin only works against databases running in WAL mode. Rather than switch the DB into WAL mode automatically, I'm going to show an error if the selected database is NOT in WAL mode.
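
One way to express that check, sketched with the standard library `sqlite3` module. `PRAGMA journal_mode` is SQLite's own way of reporting the mode; `NotWalError` and `require_wal()` are hypothetical names, not anything from the plugin.

```python
import sqlite3


class NotWalError(Exception):
    """Hypothetical error raised when a database is not in WAL mode."""


def journal_mode(conn: sqlite3.Connection) -> str:
    # Returns e.g. "delete" or "wal" for the main database
    return conn.execute("PRAGMA journal_mode").fetchone()[0].lower()


def require_wal(conn: sqlite3.Connection) -> None:
    """Raise instead of silently switching the database into WAL mode."""
    mode = journal_mode(conn)
    if mode != "wal":
        raise NotWalError(
            f"Database is in {mode!r} journal mode; "
            "run sqlite-utils enable-wal on it first"
        )
```

The error message points at `sqlite-utils enable-wal`, the same command used later in this thread, so switching modes stays an explicit decision by the user.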

simonw commented 2 years ago

I already had this in the HTML, copied from datasette-import-tables:

https://github.com/simonw/datasette-socrata/blob/74b75fb3b82b26c8c51ca1f8ae8b0723a8e829d0/datasette_socrata/templates/datasette_socrata.html#L35-L41

simonw commented 2 years ago

I'm getting a WEIRD error when I try to run this against two WAL databases at once:

(datasette-socrata) datasette-socrata % echo '{"id": 1, "name": "Cleo"}' | sqlite-utils insert data2.db creatures -
(datasette-socrata) datasette-socrata % echo '{"id": 1, "name": "Cleo"}' | sqlite-utils insert data.db creatures - 
(datasette-socrata) datasette-socrata % sqlite-utils enable-wal data.db data2.db
(datasette-socrata) datasette-socrata % sqlite-utils rows data.db creatures
[{"id": 1, "name": "Cleo"}]
(datasette-socrata) datasette-socrata % sqlite-utils rows data2.db creatures
[{"id": 1, "name": "Cleo"}]
(datasette-socrata) datasette-socrata % datasette data.db data2.db --root
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/bin/datasette", line 8, in <module>
    sys.exit(cli())
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/datasette/cli.py", line 585, in serve
    asyncio.get_event_loop().run_until_complete(check_databases(ds))
  File "/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/datasette/cli.py", line 630, in check_databases
    await database.execute_fn(check_connection)
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/datasette/database.py", line 188, in execute_fn
    return await asyncio.get_event_loop().run_in_executor(
  File "/Users/simon/.pyenv/versions/3.10.3/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/datasette/database.py", line 186, in in_thread
    return fn(conn)
  File "/Users/simon/.local/share/virtualenvs/datasette-socrata-ocvQiLza/lib/python3.10/site-packages/datasette/utils/__init__.py", line 923, in check_connection
    for r in conn.execute(
sqlite3.OperationalError: unable to open database file

simonw commented 2 years ago

sqlite-utils query can query from two WAL databases at once:

% sqlite-utils data.db 'select * from creatures union all select * from data2.creatures' --attach data2 data2.db
[{"id": 1, "name": "Cleo"},
 {"id": 1, "name": "Cleo"}]
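
For comparison, a sketch of the same cross-database query from Python using the standard library `sqlite3` module and `ATTACH DATABASE`. The function name `creatures_from_both()` is made up for illustration; the SQL and the `data2` alias mirror the command above.

```python
import sqlite3


def creatures_from_both(main_path: str, other_path: str):
    """Query creatures from two database files over one connection."""
    conn = sqlite3.connect(main_path)
    conn.row_factory = sqlite3.Row
    try:
        # Make the second file visible under the schema name data2
        conn.execute("ATTACH DATABASE ? AS data2", (other_path,))
        rows = conn.execute(
            "select * from creatures union all select * from data2.creatures"
        ).fetchall()
        return [dict(row) for row in rows]
    finally:
        conn.close()
```

Calling `creatures_from_both("data.db", "data2.db")` against the two files created earlier should return one row from each.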

simonw commented 2 years ago

Weird: moving the create table if not exists statement from the startup() hook to the import_socrata() view function fixed that sqlite3.OperationalError: unable to open database file error.

simonw commented 2 years ago

sqlite3.OperationalError: unable to open database file

Found a fix for that described in this comment:

Checking journal mode on a read-only connection seems to trigger that error, so check with execute_write_fn() instead.
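
A sketch of what that could look like. `execute_write_fn()` is the Datasette method named in the quoted comment; this assumes it is awaited with a function that receives the write connection and that it returns that function's result. `FakeDatabase` is a hypothetical stand-in so the example runs without Datasette installed, and `is_wal()` is an illustrative name.

```python
import sqlite3


def _journal_mode(conn):
    # Runs against whatever connection the caller hands us
    return conn.execute("PRAGMA journal_mode").fetchone()[0]


async def is_wal(db) -> bool:
    """Check the journal mode on the write connection, not a read-only one.

    `db` is assumed to expose `await db.execute_write_fn(fn)`.
    """
    mode = await db.execute_write_fn(_journal_mode)
    return mode.lower() == "wal"


class FakeDatabase:
    """Hypothetical stand-in that mimics only execute_write_fn()."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)

    async def execute_write_fn(self, fn):
        return fn(self._conn)
```

In a real plugin the `db` argument would be the Datasette database object for the selected database rather than `FakeDatabase`.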