simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0

Datasette row URLs break for binary data as a primary key #2419

Open simonw opened 2 weeks ago

simonw commented 2 weeks ago

Spotted this while browsing an FTS table:

[Screenshot: CleanShot 2024-09-03 at 14 33 52@2x]

Schema is:

CREATE TABLE 'releases_fts_idx'(
  segid,
  term,
  pgno,
  PRIMARY KEY(segid, term)
) WITHOUT ROWID;

Got a 404 error when I clicked the link to e.g. /content/releases_fts_idx/3,b~270thei~27
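
A minimal sketch of what appears to be going on, using only the standard library. The `~27` sequences in the URL look like tilde-encoded single quotes, so the path component seems to be the Python repr of a bytes value rather than the bytes themselves. `simple_tilde_decode` is a simplified stand-in written for this example, not Datasette's own decoder, and the inserted row is made-up data.

```python
import sqlite3
from urllib.parse import unquote


def simple_tilde_decode(component: str) -> str:
    # Simplified stand-in for illustration: treat ~XX like %XX.
    return unquote(component.replace("~", "%"))


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE releases_fts_idx(
      segid,
      term,
      pgno,
      PRIMARY KEY(segid, term)
    ) WITHOUT ROWID
    """
)
# Made-up row: the term column holds a BLOB, as in an FTS index table.
conn.execute("INSERT INTO releases_fts_idx VALUES (?, ?, ?)", (3, b"0thei", 1))

# The second path component from the broken link decodes to a repr string.
decoded = simple_tilde_decode("b~270thei~27")

query = "SELECT count(*) FROM releases_fts_idx WHERE segid = ? AND term = ?"
# Looking up by the decoded text finds nothing: TEXT never equals a BLOB.
by_text = conn.execute(query, (3, decoded)).fetchone()[0]
# Looking up by the actual bytes finds the row.
by_blob = conn.execute(query, (3, b"0thei")).fetchone()[0]

print(repr(decoded), by_text, by_blob)
```

If that reading is right, the 404 follows from comparing a TEXT value against a BLOB column, which SQLite never treats as equal.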

simonw commented 2 weeks ago

I think the bigger problem here is that a row page with a binary primary key cannot even be linked to.

Maybe we need some way of representing hex() in the URL to a row page?
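
A rough sketch of that idea, assuming the binary key component is written into the URL as hex digits and matched with SQLite's `hex()` function. The helper names `blob_to_path` and `path_to_blob` are hypothetical, and this does not address how a URL would tell a hex-encoded BLOB apart from an ordinary text key.

```python
import sqlite3


def blob_to_path(value: bytes) -> str:
    # Hypothetical helper: uppercase hex, matching the output of SQLite's hex().
    return value.hex().upper()


def path_to_blob(component: str) -> bytes:
    # Hypothetical helper: inverse of blob_to_path.
    return bytes.fromhex(component)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE releases_fts_idx("
    "segid, term, pgno, PRIMARY KEY(segid, term)) WITHOUT ROWID"
)
# Made-up row for illustration.
conn.execute("INSERT INTO releases_fts_idx VALUES (?, ?, ?)", (3, b"0thei", 7))

component = blob_to_path(b"0thei")

# Option 1: compare against hex(term) in SQL.
row_via_hex = conn.execute(
    "SELECT pgno FROM releases_fts_idx WHERE segid = ? AND hex(term) = ?",
    (3, component),
).fetchone()

# Option 2: decode in Python and bind real bytes, so the primary key
# can be compared directly.
row_via_bytes = conn.execute(
    "SELECT pgno FROM releases_fts_idx WHERE segid = ? AND term = ?",
    (3, path_to_blob(component)),
).fetchone()

print(component, row_via_hex, row_via_bytes)
```

Decoding the hex in Python and binding bytes keeps the comparison on the primary key columns themselves, whereas wrapping the column in `hex()` means SQLite has to compute the function for each candidate row.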

simonw commented 2 weeks ago

Relevant code:

https://github.com/simonw/datasette/blob/2170269258d1de38f4e518aa3e55e6b3ed202841/datasette/views/row.py#L17-L21

Which calls:

https://github.com/simonw/datasette/blob/2170269258d1de38f4e518aa3e55e6b3ed202841/datasette/app.py#L1643-L1651

Which calls:

https://github.com/simonw/datasette/blob/2170269258d1de38f4e518aa3e55e6b3ed202841/datasette/utils/__init__.py#L1234-L1246

asg017 commented 2 weeks ago

Sometimes people store UUIDs or ULIDs as compact BLOBs instead of their more verbose TEXT format.

(don't really see much value in that personally but I guess it's something to consider)
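
For a sense of the size difference, here is a small standard-library sketch comparing the two representations of one UUID. The UUID value is an arbitrary example.

```python
import uuid

# Arbitrary example value.
u = uuid.UUID("12345678-1234-5678-1234-567812345678")

as_text = str(u)    # canonical hyphenated TEXT form
as_blob = u.bytes   # compact form, suitable for a BLOB column

text_len = len(as_text)
blob_len = len(as_blob)

# The BLOB form round-trips back to the same UUID, and its hex digits
# are what a hex-based row URL would need to carry.
round_trip = uuid.UUID(bytes=as_blob)
hex_form = as_blob.hex()

print(text_len, blob_len, hex_form)
```

The TEXT form is 36 characters and the BLOB form is 16 bytes, so a table keyed this way would hit the same row URL problem as the FTS index table.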