Fetch everything during a read in one go.

replikativ / konserve-jdbc

A JDBC backend for konserve.

Eclipse Public License 2.0


Fetch everything during a read in one go. #9

Open whilo opened 1 year ago

whilo commented 1 year ago

At the moment, the header, metadata, and value are fetched sequentially, each when it is needed. They could instead be fetched in a single query to reduce latency, by inspecting the :operation in the env, e.g. https://github.com/replikativ/konserve/blob/e2c1cb45708006a62a1df2133261620a2b70c3c8/src/konserve/impl/defaults.cljc#L400.

whilo commented 1 year ago

konserve-s3 already fetches everything at once (https://github.com/replikativ/konserve-s3/blob/1d8a512f93765739557c5412788161a0f323ad58/src/konserve_s3/core.clj#L161). There it would also be reasonable to consider :operation and pick what to fetch, and to use a range request when only metadata is needed, so that very large blobs are not fetched in full but only their first megabyte or so.
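As a rough illustration of the range-request idea, here is a minimal Python sketch that picks an HTTP `Range` header from the operation. The function name `range_header`, the 1 MiB default, and the use of plain strings for operations are assumptions made for this example; none of this is konserve-s3 code.

```python
ONE_MIB = 1024 * 1024  # "the first megabyte or so" (assumed prefix size)

def range_header(operation, prefix_bytes=ONE_MIB):
    """Return extra HTTP headers for a blob GET, depending on the operation.

    For a metadata-only read, ask for just the first `prefix_bytes` bytes.
    HTTP byte ranges are inclusive, hence the `- 1`.
    """
    if operation == "read-meta":
        return {"Range": f"bytes=0-{prefix_bytes - 1}"}
    # Every other operation needs the value too, so fetch the whole object.
    return {}
```

This only works if header and metadata are guaranteed to fit inside the requested prefix; a real implementation would need a fallback for the case where they do not.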

alekcz commented 1 year ago

In JDBC we're just pulling the required column. header, meta, and data are all separate columns, so data isn't fetched unnecessarily. @whilo, are you comfortable with me closing this?

whilo commented 1 year ago

Yes, we don't fetch redundant data, but when we fetch the header we already know from :operation which columns we need, e.g. header and metadata for https://github.com/replikativ/konserve/blob/main/src/konserve/impl/defaults.cljc#L333. Having to do multiple round trips to the SQL server increases latency accordingly (3x at the moment). In konserve-s3 I decided to always fetch everything, because that is fine for Datahike (except for GC, which only reads metadata): https://github.com/replikativ/konserve-s3/blob/main/src/konserve_s3/core.clj#L170. Ideally it should also dispatch on the :operation.

whilo commented 1 year ago

It is enough to distinguish the :read-meta operation, which does not need the value (so fetch header and metadata at once); all other operations need header, metadata, and value.
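The dispatch described above could look roughly like the following Python sketch. It uses the standard-library `sqlite3` module as a stand-in for a JDBC connection. The table name `konserve`, the column names `header`, `meta`, and `val`, the helper names, and the `"read-edn"` placeholder for "any other operation" are all assumptions for illustration, not the actual konserve-jdbc schema or API.

```python
import sqlite3

# Assumed column names; the thread only says header, meta and data are separate columns.
ALL_COLUMNS = ("header", "meta", "val")
COLUMNS_BY_OPERATION = {
    "read-meta": ("header", "meta"),  # metadata-only reads skip the value
}

def columns_for(operation):
    """Return the columns an operation needs; default to all of them."""
    return COLUMNS_BY_OPERATION.get(operation, ALL_COLUMNS)

def read_blob(conn, table, store_key, operation):
    """Fetch every column the operation needs in one query (one round trip)."""
    cols = columns_for(operation)
    # Table and column names come from fixed constants, never from user input.
    sql = f"SELECT {', '.join(cols)} FROM {table} WHERE id = ?"
    row = conn.execute(sql, (store_key,)).fetchone()
    return None if row is None else dict(zip(cols, row))

# Tiny in-memory demo database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE konserve (id TEXT PRIMARY KEY, header BLOB, meta BLOB, val BLOB)"
)
conn.execute("INSERT INTO konserve VALUES (?, ?, ?, ?)", ("k1", b"h", b"m", b"v"))

meta_only = read_blob(conn, "konserve", "k1", "read-meta")  # header + meta
full = read_blob(conn, "konserve", "k1", "read-edn")        # header + meta + val
```

Compared with three sequential single-column queries, this issues one `SELECT` per read and still leaves the value column untouched for metadata-only reads.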