hikeratlas / website

The https://www.hikeratlas.com website
1 stars 0 forks source link

add search w/autosuggest #3

Closed cldellow closed 7 months ago

cldellow commented 7 months ago

As the map gets more real, I want to jump around:

cldellow commented 7 months ago

D1 may be a bad match for this. I think .scanstats on shows the metrics that Cloudflare bills from.

With a small DB (covering only the Rockies), an autosuggest query for l scans almost 8,000 rows:

sqlite> select * from items where rowid in (select rowid from fts where name match 'l*' and qrank > 10  order by qrank desc limit 3)  limit 5;
+--------+----------+-----------+------+--------------------+---------+------------------+---------------+------------------+-----------------+
| qrank  | wikidata |  osm_id   | kind |        park        | country |      state       |     name      |       lon        |       lat       |
+--------+----------+-----------+------+--------------------+---------+------------------+---------------+------------------+-----------------+
| 168134 | Q1072342 | 356877446 | lake | Banff NP           | Canada  | Alberta          | Moraine Lake  | -116.18703158091 | 51.320164759327 |
| 134300 | Q1367398 | 357009018 | lake | Banff NP           | Canada  | Alberta          | Lake Louise   | -116.22614575044 | 51.413030464112 |
| 72339  | Q525072  | 2903374   | lake | Knox Mountain Park | Canada  | British Columbia | Okanagan Lake | -119.54409659387 | 49.819667496935 |
+--------+----------+-----------+------+--------------------+---------+------------------+---------------+------------------+-----------------+
QUERY PLAN (cycles=37197177 [100%])
|--LIST SUBQUERY 1
|  |--SCAN fts VIRTUAL TABLE INDEX 0:M0                 (cycles=36238477 [97%] loops=1 rows=7447)
|  `--USE TEMP B-TREE FOR ORDER BY                      (cycles=149493 [0%] loops=1 rows=230)
`--SEARCH items USING INTEGER PRIMARY KEY (rowid=?)     (cycles=13482 [0%] loops=1 rows=3)
Run Time: real 0.014 user 0.013753 sys 0.000000

If we scaled it to the globe, we'd be looking at perhaps 800K rows scanned. (Presumably, we'd have a tiered approach for very short queries, so this isn't quite fair.)

Still, it seems like not a great match - I think D1 is best for OLTP-esque queries that need to update a dataset.

Our case has a baked dataset, so I think we'd be well-served just publishing a Lambda function URL.

Let's start on D1, but be open to moving off if the cost structure doesn't work.

cldellow commented 7 months ago

...actually, the dev tooling around D1 seems quite skewed towards a real, live DB manipulated via SQL statements and migrations, whereas I think I want the ability to replace DBs wholesale.

Ugh, drafting a Cloudformation doesn't sound fun...

cldellow commented 7 months ago

Screw Cloudformation, let's just have a goal of having such a dead simple deploy that a human can do it in the UI in 30 seconds if needed. Single file, no deps.