simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0
9.08k stars 649 forks

datasette package --spatialite throws error during build #1900

Open rdmurphy opened 1 year ago

rdmurphy commented 1 year ago

Hello! Attempting to use datasette package to bundle up a SpatiaLite DB and I'm getting this error during the docker build:

sqlite3.OperationalError: /usr/lib/x86_64-linux-gnu/mod_spatialite.so.so: cannot open shared object file: No such file or directory

Seems to be throwing when this step is run:

ERROR [6/6] RUN datasette inspect results.db --inspect-file inspect-data.json

This is with v0.63.1.

simonw commented 1 year ago

Which Docker image are you using here? It looks like it's missing SpatiaLite from the image.

simonw commented 1 year ago

Oh this is with datasette package? That should work. Will investigate.

simonw commented 1 year ago

Trying this out locally with this 69MB SpatiaLite file I happened to have lying around (from testing shapefile-to-sqlite a while ago): https://static.simonwillison.net/static/2022/nps-spatialite.db

% datasette package nps-spatialite.db --spatialite
...
 => [2/6] COPY . /app                                                                                                                    0.4s
 => [3/6] WORKDIR /app                                                                                                                   0.0s
 => [4/6] RUN apt-get update &&     apt-get install -y python3-dev gcc libsqlite3-mod-spatialite &&     rm -rf /var/lib/apt/lists/*     29.6s
 => [5/6] RUN pip install -U datasette                                                                                                  12.0s
 => [6/6] RUN datasette inspect nps-spatialite.db --inspect-file inspect-data.json                                                       2.6s 
 => exporting to image                                                                                                                   3.0s 
 => => exporting layers                                                                                                                  3.0s 
 => => writing image sha256:4dfef1c373c5c057ef7ac22344f834d522acef24313a1b25d2eba9e500066b8f                                             0.0s 

And then:

docker run -p 8001:8001 4dfef1c373c5

This worked fine for me. I ran datasette package using Datasette 0.63.1.

simonw commented 1 year ago

Did you use the --spatialite option?

I just tried this:

datasette package nps-spatialite.db

It built the image OK (I didn't see the error you reported), but running the container failed with an error:

/tmp % docker run -p 8001:8001 7298e8e6bbfb
Usage: datasette serve [OPTIONS] [FILES]...
Try 'datasette serve --help' for help.

Error: It looks like you're trying to load a SpatiaLite database without first loading the SpatiaLite module.

Read more: https://docs.datasette.io/en/stable/spatialite.html

simonw commented 1 year ago

Could you provide full steps to reproduce plus a SpatiaLite database file that triggered this for you? I'm not able to recreate the problem.

rdmurphy commented 1 year ago

Interesting! So I tried this locally using your copy of nps-spatialite.db and I got the same error. 🤔

❯ datasette package nps-spatialite.db --spatialite
[+] Building 27.5s (10/10) FINISHED                                                                                                                                                        
 => [internal] load build definition from Dockerfile                                                                                                                                  0.0s
 => => transferring dockerfile: 622B                                                                                                                                                  0.0s
 => [internal] load .dockerignore                                                                                                                                                     0.0s
 => => transferring context: 2B                                                                                                                                                       0.0s
 => [internal] load metadata for docker.io/library/python:3.11.0-slim-bullseye                                                                                                        0.9s
 => [internal] load build context                                                                                                                                                     2.3s
 => => transferring context: 72.38MB                                                                                                                                                  2.3s
 => CACHED [1/6] FROM docker.io/library/python:3.11.0-slim-bullseye@sha256:1cd45c5dad845af18d71745c017325725dc979571c1bbe625b67e6051533716c                                           0.0s
 => [2/6] COPY . /app                                                                                                                                                                 0.1s
 => [3/6] WORKDIR /app                                                                                                                                                                0.0s
 => [4/6] RUN apt-get update &&     apt-get install -y python3-dev gcc libsqlite3-mod-spatialite &&     rm -rf /var/lib/apt/lists/*                                                  18.5s
 => [5/6] RUN pip install -U datasette                                                                                                                                                4.9s
 => ERROR [6/6] RUN datasette inspect nps-spatialite.db --inspect-file inspect-data.json                                                                                              0.7s 
------                                                                                                                                                                                     
 > [6/6] RUN datasette inspect nps-spatialite.db --inspect-file inspect-data.json:                                                                                                         
#10 0.681 Traceback (most recent call last):                                                                                                                                               
#10 0.681   File "/usr/local/bin/datasette", line 8, in <module>                                                                                                                           
#10 0.681     sys.exit(cli())                                                                                                                                                              
#10 0.681              ^^^^^
#10 0.681   File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
#10 0.682     return self.main(*args, **kwargs)
#10 0.682            ^^^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.682   File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1055, in main
#10 0.682     rv = self.invoke(ctx)
#10 0.682          ^^^^^^^^^^^^^^^^
#10 0.682   File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
#10 0.682     return _process_result(sub_ctx.command.invoke(sub_ctx))
#10 0.682                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.682   File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
#10 0.682     return ctx.invoke(self.callback, **ctx.params)
#10 0.682            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.682   File "/usr/local/lib/python3.11/site-packages/click/core.py", line 760, in invoke
#10 0.682     return __callback(*args, **kwargs)
#10 0.682            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.683   File "/usr/local/lib/python3.11/site-packages/datasette/cli.py", line 164, in inspect
#10 0.683     inspect_data = loop.run_until_complete(inspect_(files, sqlite_extensions))
#10 0.683                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.683   File "/usr/local/lib/python3.11/asyncio/base_events.py", line 650, in run_until_complete
#10 0.683     return future.result()
#10 0.683            ^^^^^^^^^^^^^^^
#10 0.683   File "/usr/local/lib/python3.11/site-packages/datasette/cli.py", line 179, in inspect_
#10 0.683     counts = await database.table_counts(limit=3600 * 1000)
#10 0.683              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.683   File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 304, in table_counts
#10 0.683     for table in await self.table_names():
#10 0.683                  ^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.683   File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 342, in table_names
#10 0.683     results = await self.execute(
#10 0.683               ^^^^^^^^^^^^^^^^^^^
#10 0.683   File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 267, in execute
#10 0.683     results = await self.execute_fn(sql_operation_in_thread)
#10 0.683               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.683   File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 213, in execute_fn
#10 0.683     return await asyncio.get_event_loop().run_in_executor(
#10 0.683            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.683   File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
#10 0.683     result = self.fn(*self.args, **self.kwargs)
#10 0.683              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#10 0.683   File "/usr/local/lib/python3.11/site-packages/datasette/database.py", line 209, in in_thread
#10 0.683     self.ds._prepare_connection(conn, self.name)
#10 0.683   File "/usr/local/lib/python3.11/site-packages/datasette/app.py", line 593, in _prepare_connection
#10 0.683     conn.execute("SELECT load_extension(?)", [extension])
#10 0.683 sqlite3.OperationalError: /usr/lib/x86_64-linux-gnu/mod_spatialite.so.so: cannot open shared object file: No such file or directory
------
executor failed running [/bin/sh -c datasette inspect nps-spatialite.db --inspect-file inspect-data.json]: exit code: 1

simonw commented 1 year ago

This is so weird! What version of Datasette do you get from datasette --version there - and what's your Docker version / operating system version?

simonw commented 1 year ago

You get:

 => [internal] load metadata for docker.io/library/python:3.11.0-slim-bullseye                                                                                                        0.9s
 => [internal] load build context                                                                                                                                                     2.3s
 => => transferring context: 72.38MB                                                                                                                                                  2.3s
 => CACHED [1/6] FROM docker.io/library/python:3.11.0-slim-bullseye@sha256:1cd45c5dad845af18d71745c017325725dc979571c1bbe625b67e6051533716c                                           0.0s

I get:

 => [internal] load metadata for docker.io/library/python:3.11.0-slim-bullseye                                                                                                                         1.0s
 => [internal] load build context                                                                                                                                                                      0.0s
 => => transferring context: 705B                                                                                                                                                                      0.0s
 => CACHED [1/6] FROM docker.io/library/python:3.11.0-slim-bullseye@sha256:1cd45c5dad845af18d71745c017325725dc979571c1bbe625b67e6051533716c                                                            0.0s

Both the image name and the hash are exactly the same. So why are you getting an error while mine works OK?

For my machine:

~ % docker --version
Docker version 20.10.12, build e91ed57

~ % uname -a
Darwin Simons-MacBook-Pro-2.local 22.1.0 Darwin Kernel Version 22.1.0: Sun Oct  9 20:14:54 PDT 2022; root:xnu-8792.41.9~2/RELEASE_X86_64 x86_64

rdmurphy commented 1 year ago

Is it, uh, possible we are on different architectures? 😅 I'm using an Apple M1 Pro.

I jumped into a bash shell of an unmodified python:3.11.0-slim-bullseye container and manually ran apt-get update and installed libsqlite3-mod-spatialite. I don't end up with mod_spatialite.so in /usr/lib/x86_64-linux-gnu/; mine is in /usr/lib/aarch64-linux-gnu/.

I swapped that directory in here in a local copy of datasette and poof — it worked!
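
To illustrate the directory difference described above, here is a minimal sketch of deriving the Debian library path from the machine architecture instead of hard-coding x86_64. The function name debian_spatialite_path is hypothetical and is not part of Datasette. It assumes the library directory follows the /usr/lib/<machine>-linux-gnu/ pattern seen in this thread for x86_64 and aarch64; other architectures may use different directory names.

```python
import platform


def debian_spatialite_path(machine=None):
    """Return the assumed mod_spatialite.so path for a Debian-style image.

    Hypothetical helper: only the x86_64 and aarch64 directory names
    are confirmed by this thread; other architectures may differ.
    """
    # platform.machine() reports e.g. "x86_64" or "aarch64" inside a Linux container
    machine = machine or platform.machine()
    return f"/usr/lib/{machine}-linux-gnu/mod_spatialite.so"


print(debian_spatialite_path("x86_64"))
print(debian_spatialite_path("aarch64"))
```

Passing the architecture explicitly makes the helper easy to check on any machine; called with no argument, it uses whatever platform.machine() reports for the running interpreter.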

rdmurphy commented 1 year ago

Can confirm that my uname -a returns something different at the end:

root:xnu-8792.41.9~2/RELEASE_ARM64_T6000 arm64

I'm in arm64 land, you're in x86_64. I am admittedly very fuzzy on how this factors into Docker these days. Honestly thought this was one of the things Docker was supposed to help address. 🤔

rdmurphy commented 1 year ago

Okay, my final observations for the night! I've been pushing and pulling the various levers in utils/__init__.py to see what makes this work without hard-coding in something for arm64 and it seems that if I change /usr/lib/x86_64-linux-gnu/mod_spatialite.so here to just mod_spatialite it's happy.

Unfortunately cannot audit that for x86_64, but maybe that's a solution that'd be cross-arch compatible? It seems like it's the hard-coding of that path that's tripping it up.

(It was actually this comment from back in 2018 in an entirely unrelated repo that nudged me to try this, ha.)
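
The change described above (using the bare name mod_spatialite instead of a full path) can be sketched with Python's standard sqlite3 module. This is an illustration, not Datasette's actual code: the helper name load_first_available is hypothetical. It relies on the assumption that SQLite hands a bare name to the operating system's library loader, which searches its standard directories, and retries with a platform suffix such as .so appended. That retry would also explain the doubled ".so.so" in the error message.

```python
import sqlite3


def load_first_available(conn, candidates):
    """Try each extension name or path in order.

    Returns the first candidate that loads, or None if none do.
    Hypothetical helper for illustration only.
    """
    # Some Python builds are compiled without extension loading support
    if not hasattr(conn, "enable_load_extension"):
        return None
    conn.enable_load_extension(True)
    try:
        for candidate in candidates:
            try:
                conn.load_extension(candidate)
                return candidate
            except sqlite3.OperationalError:
                # Not found or not loadable; try the next candidate
                continue
    finally:
        conn.enable_load_extension(False)
    return None


conn = sqlite3.connect(":memory:")
# The bare name goes first so the system loader can resolve the directory
loaded = load_first_available(
    conn,
    ["mod_spatialite", "/usr/lib/x86_64-linux-gnu/mod_spatialite.so"],
)
print(loaded)
```

On a machine without SpatiaLite installed this prints None. Whether the bare name resolves on x86_64 images is untested here, as noted in the comment above.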

rdmurphy commented 3 months ago

If you can believe it I just hit this all over again. 😅