denoland / deno

A modern runtime for JavaScript and TypeScript.
https://deno.com
MIT License
96.56k stars 5.33k forks source link

Allow a depth limit parameter to the KV list interator #23275

Open inverted-capital opened 6 months ago

inverted-capital commented 6 months ago

In specifying options for list there is no way of specifying the maximum depth of keys that I want returned.

If I am modelling a filesystem in Deno KV, if I want to list the root directory, I currently get the whole filesystem. That gets quite heavy on Deno Deploy, and quite slow if my DB is in Virginia, but my edge isolate is in Australia.

Missing this parameter can cause some subtle holes in libraries, particularly when users are allowed to specify their own file names. One such example is here: https://github.com/kitsonk/kv-toolbox/issues/12

Using start and end ranges still returns all the keys in between regardless of the depth of the key array, so I'd propose the depth parameter apply to both prefix and range specifiers.

losfair commented 4 months ago

Deno KV stores all keys in a flat, ordered keyspace. It's possible to add specialized logic for "auto-skipping" in a kv.list() operation - but it's suboptimal: in the worst case it has to visit the same number of keys as a full scan, and actually is slower than a full scan due to the extra round-trips needed.

To model a filesystem, you might want to design your keyspace in a similar way as the FoundationDB directory layer: use a single-level key prefix for each directory, and store "pointers" inside. For example:

/a: ["id0", "a"] -> {"type": "directory", "ref": "id1"}
/a/b: ["id1", "b"] -> {"type": "file", "content": "someContent"}
/a/c: ["id1", "c"] -> {"type": "file", "content": "someContent"}
inverted-capital commented 4 months ago

Thank you - from your insight, I will store the directory path as one part of the key and the filename as the other, allowing focused retrieval scoped to a single directory:

[ "/a" ] -> true // a single key represents a directory
[ "/a", "b" ] -> Uint8Array // raw contents of file at /a/b
[ "/a", "c" ] -> Uint8Array // raw contents of file at /a/c

This way avoids having to store the file contents as a data structure and allows storing it as raw bytes, skipping a transformation.