OpenFn / adaptors

The new home for OpenFn adaptors; re-usable connectors for the most common DPGs and DPI building blocks.
GNU General Public License v3.0
5 stars 8 forks source link

`redis` adaptor #686

Closed josephjclark closed 1 month ago

josephjclark commented 1 month ago

A design for a minimal redis adaptor

Overview

get(key)
hget(key, field)
hgetAll(key, field)
set(key, value) // value can be an object and _replaces_ the existing entry at the key
hset(key, field, value) 
scan(pattern, options) // keys returned to state.data

command(commandName, args) // generic call to client.sendCommand, allowing arbitrary command execution. Results are written to state.data.

Auth

Basic and oath?

Implementation notes

We should use the node-redis client

~See @redis/search for the search API - I suggest we just wrap this. search should return the documents array to state.data (don't return the meta wrapper)~

josephjclark commented 1 month ago

I suspect we should use pipelining or transactions to optimise large setters. Ideally this should be under-the-good but we may need users to tell us when to start and finish.

I'm researching those topics at the moment - but they are an optimisation.

Apparently the node redis client does autopiplining and will pipeline all requests in the same tick. See the docs. We should think about leveraging this (although next tick likely won't work because we have a very asynchronously runtime)

josephjclark commented 1 month ago

We should be careful not to over-optimise. Maybe we should just put out simple API above and see how well that performs. If upserts are slow we can can work out how to optimize

josephjclark commented 1 month ago

Because redis's API is quite big, we could maybe have a exec() operation which sends a raw redis command up to the server. The node client has a sendCommand API to do just this.

mtuchi commented 1 month ago

For Auth: It will be similar to other database.

redis[s]://[[username][:password]@][host][:port][/db-number] (see redis and rediss IANA registration for more details)

createClient({
  url: 'redis://alice:foobared@awesome.redis.server:6380'
});
Redis Configuration Schema

``` { "$schema": "http://json-schema.org/draft-07/schema#", "title": "Redis Database Connection Schema", "type": "object", "properties": { "host": { "type": "string", "description": "The hostname or IP address of the Redis server.", "pattern": "^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$|^([a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,}$" }, "port": { "type": "integer", "description": "The port number on which the Redis server is listening.", "minimum": 1, "maximum": 65535, "default": 6379 }, "username": { "type": "string", "description": "The username for authenticating with the Redis server.", "minLength": 1 }, "password": { "type": "string", "description": "The password for authenticating with the Redis server.", "minLength": 1 }, "dbNumber": { "type": "integer", "description": "The database number to connect to on the Redis server.", "minimum": 0 } }, "required": ["host"], "additionalProperties": false } ```

mtuchi commented 1 month ago

For testing locally use, setup a redis server using docker πŸ‘‡πŸ½

docker run -p 6379:6379 -it redis/redis-stack-server:latest
mtuchi commented 1 month ago

set(key,value), Hiya @josephjclark looking at the redis-readme. It looks like set should take 3 params

client.hSet('key', 'field', 'value');
//Using js object
await client.set('key', 'value', {
  EX: 10,
  NX: true
});
mtuchi commented 1 month ago

@josephjclark just a heads up, i don't think we can unit test these new functions 😒. I don't know how i can setup a mock client for redis

josephjclark commented 1 month ago

I don't know about that third argument @mtuchi - that's just a list of options to drive the set. Stuff like timeout and upsert control. Maybe we can expose an options object later but my feeling is that it's a level of detail we don't need to cover. It's too low.

Which reminds me: the value in the set should support a JSON object (in which case the existing object is replaced).

josephjclark commented 1 month ago

I can help you with a mock pattern. We need something a bit like common http where we set the mock client externally

mtuchi commented 1 month ago

@josephjclark yeah that will be nice, i will be working on this tomorrow

mtuchi commented 1 month ago

EOD Updates: Loved my experience with redis so-far, the documentation is easy to understand and easy to test.

Progress

Findings

~Next Steps~ Some-Thoughts

Before i make a mess i think it's best for me and @josephjclark to sync on the design for this adaptor because we need to address my findings ☝🏽.

Based on my quick understanding of the NH 1st workflow

josephjclark commented 1 month ago

My thoughts @mtuchi (happy to take a call)

I want to do the minimal and simplest amount of work here. So:

Let me update the spec.

mtuchi commented 1 month ago

Thank you @josephjclark i will pick this up tomorrow

mtuchi commented 1 month ago

@josephjclark why callback in scan(pattern,callback, options) is the 2nd argument ?

josephjclark commented 1 month ago

New design

because redis doesn't work like I thought, I want to make this an even lower level API

We can't do useful callbacks or auto fetching in scan. And on reflection the callback won't work either.

So we should return the keys to state.data, then you can do each() on them and get or hget or hgetall or whatever you want todo

mtuchi commented 1 month ago

EOD Update: See work in progress #706

I made good progress on this issue, as per our discussion with @josephjclark hGetAll() is the last function to be implemented for this issue. The pending item for this issue are πŸ‘‡πŸ½

mtuchi commented 1 month ago

This is waiting for another round of feedback from @josephjclark See work in progress #706

mtuchi commented 1 month ago

@aleksa-krolls redis v1.0.0 is out, I have created a PR on marketing website https://github.com/OpenFn/marketing/pull/190

mtuchi commented 1 month ago

Based on @aleksa-krolls suggested next steps:

Mtuchi to investigate why we can’t use the JSON data type to minimize the manipulation needed to store the data received via webhook. (FYI see ChatGPT response on how you can use SCAN to search JSON)

Currently in nh-workfow, i am using the hash data type when importing incoming payload to DB, and query the data using hGetAll(). The challenge with hash data type requires the json object to be a string of key value pair. If the object have other data type nested inside like array or json you have to convert these values into string. So i am using JSON.stringifigy to convert an array of records into a string. @aleksa-krolls suggest if we can set the JSON data type directly instead of using HASH data type, which is possible when using client.json.set(key, $, {a,b,c}) and you can read the data using client.json.get(key). @josephjclark I would love you input on the best way to design those helpers

Something to note is that JSON data type is bigger than HASH data type. about 3x bigger with my current test data.

Mtuchi to troubleshoot why he can’t scan the TS ':20240524T012736Z (from key f9eb9cc8c26d4697bd741928719ac87c:20240524T012736Z:20240524T031736Z)

There are couple of improvements on scan() operation

josephjclark commented 1 month ago

Let's add json support like this:

jGet(key)
jSet(key,  json)

jGet and jSet (like hSet and hGet) allow json objects to be read and written.

For now, we always read/write from the root path. Maybe later we'll add a path option.

This always replaces the object at that key. This needs to be clearly documented.

josephjclark commented 1 month ago

Changes to scan():

I'd love a good unit test on the updated scan but that depends a bit on how easy it is to mock out the scan API. Maybe we can mock the iterator to return three "pages" of results?

Again, we'll likely evolve and improve this API in future

mtuchi commented 1 month ago

Thanks @josephjclark for the help, I have created two issues for redis

As for next step i will add estimate on those issue and sync with @aleksa-krolls on which sprint should we put them