mikee47 / ConfigDB

Configuration database for Sming
GNU General Public License v3.0

Implement store sharing / locking #17

Closed mikee47 closed 3 months ago

mikee47 commented 3 months ago

This PR implements store locking as discussed in https://github.com/mikee47/ConfigDB/issues/12. Also started a test application with a Sharing module.

We now generate four types of templated Objects:

Objects are now read-only and have separate Updater types to handle writing. These call commit in their destructor, but applications can call it directly if they need to check for success/failure.

Updates must be consistent and must not be reflected in other readers until committed. A web client reading the database via a streaming Reader must always receive a consistent set of data for each store. For efficiency, multiple readers share the same Store instance, so when an update begins on one of them a copy of the Store is made internally. Rather than reloading from storage, a copy constructor has been added to Store, which makes this operation more efficient.
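
As a rough illustration of that copy-on-write step (a sketch only; the function and member names here are assumed, not the actual implementation):

// The updater receives a private copy while readers keep sharing the
// original; on commit the copy replaces the shared instance.
std::shared_ptr<Store> beginStoreUpdate(std::shared_ptr<Store>& sharedStore)
{
    // Copy constructor avoids reloading from storage
    auto updateStore = std::make_shared<Store>(*sharedStore);
    return updateStore; // commit swaps this back into sharedStore
}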

The Basic_Config sample has been updated with a basic demonstration of streaming updates. This accepts a POST request with Content-Type: application/json.

Note that, as discussed, the update behaviour has been changed to 'simplest' mode:

TODO:

pljakobs commented 3 months ago

Ah, now I understand the *Const idea. While I like the simplicity of a forWrite bool parameter, I think it would be more stringent if the mode were passed similarly to the mode in a file open call - maybe CONF_READ, CONF_WRITE. I like the idea of having no set or commit methods if opened CONF_READ, as it would move error checking into the compiler - but I guess the implementation would be a lot more difficult then (if at all possible?).

Thinking about how we'll ideally use this in very small scopes, why offer an R/W implementation at all and not just separate read and write object versions? I will of course primarily think about my own needs here, but I see three major use modes:

  • streaming the whole db from/to storage/http (which should, for most web-client based systems, be the normal config method)
  • updating / adding single configuration items at run time - that shouldn't be a very frequent thing, or am I wrong?
  • reading values from the ConfigDB to, well, configure the system. That's probably the most common use case.

I'm very happy to have readers only for the last case, and I think the system would be clearer if read and write were very separate paths. For the few cases where I might read a config value to create a new config item, I think there would be a different way.

But then again: when I try to write to a file that I have opened RO, I get a runtime error, so I'd be just as happy with having to catch that as it is part of the deal anyway.

mikee47 commented 3 months ago

Ah, now I understand the *Const idea. While I like the simplicity of a forWrite bool parameter, I think it would be more stringent if the mode were passed similarly to the mode in a file open call - maybe CONF_READ, CONF_WRITE. I like the idea of having no set or commit methods if opened CONF_READ, as it would move error checking into the compiler - but I guess the implementation would be a lot more difficult then (if at all possible?).

Thinking about how we'll ideally use this in very small scopes, why offer an R/W implementation at all and not just separate read and write object versions? I will of course primarily think about my own needs here, but I see three major use modes:

  • streaming the whole db from/to storage/http (which should, for most web-client based systems, be the normal config method)

Not necessarily. I favour websockets for interface control as that keeps things snappy and the traffic can be minimised. I may only want to send one or a few settings over the wire, and not the entire store. With the current arrangement I'd need to process that manually because the writer wipes each store before an update. That could do with some settings to control its behaviour. The tricky thing there is with arrays, since we might want to update values or replace the entire array. Either way, I'd prefer the client sends only the changes across rather than the entire dataset. Will need some thought for that one.

NB. we don't as yet have a proper dirty flag which we'd set only if a set call actually changes anything. That would convert a commit into a no-op.
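
A sketch of that idea - `dirty` here is a hypothetical member which `set` calls would raise only when a value actually changes, and `writeToStorage` a hypothetical helper:

ConfigDB::Status Updater::commit()
{
    if(!dirty) {
        return ConfigDB::Status{}; // nothing changed: commit is a no-op
    }
    // Otherwise serialise the store back to storage and clear the flag
    return writeToStorage();
}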

  • updating / adding single configuration items at run time - that shouldn't be a very frequent thing, or am I wrong?
  • reading values from the ConfigDB to, well, configure the system. That's probably the most common use case.

Agreed!

I'm very happy to have readers only for the last case, and I think the system would be clearer if read and write were very separate paths. For the few cases where I might read a config value to create a new config item, I think there would be a different way.

I'm tending to agree here, although I do like the object-based updating interface.

But then again: when I try to write to a file that I have opened RO, I get a runtime error, so I'd be just as happy with having to catch that as it is part of the deal anyway.

How about this:

BasicConfig::Root::Security sec(database);
if(sec.beginUpdate()) {
    sec.setApiSecured(false);
    auto status = sec.commit();
    if(!status) {
        Serial << "Commit : " << String(status) << endl;
    }
}

The error check is made on the call to commit. Using IFS::ErrorCode means we can return IFS::Error::Denied if it's read-only. However, it has the opposite sense to bool, so wrapping it in a ConfigDB::Status struct would help. The alternative is storing a lastError value somewhere, but that feels a bit clunky.
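
As a rough sketch, such a wrapper might look like this (assuming IFS::ErrorCode is an integer where negative values indicate errors; the String conversion is an assumption to match the print statement above):

struct Status {
    IFS::ErrorCode code{0};
    explicit operator bool() const { return code >= 0; } // success reads as true
    operator String() const { return IFS::Error::toString(code); } // for printing
};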

If sec.setApiSecured is called outside of a transaction (beginUpdate) we get a debug error message and no change is made - no assertion I think. In release builds it would be a 'silent failure'.

In an ideal world we wouldn't see changes to sec until commit is called but that's a complication we probably don't need.

We wouldn't have Const objects but that's actually quite a complicated change. This approach is simpler and perhaps more intuitive?

mikee47 commented 3 months ago

This is doable:


BasicConfig::Root::Security sec(database);
if(auto wr = sec.beginUpdate()) {
    wr.setApiSecured(false);
    auto status = wr.commit();
    if(!status) {
        Serial << "Commit : " << String(status) << endl;
    }
}

So `sec` has no set/commit (it's the Const idiom).
The `beginUpdate` returns a writeable version.
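
For illustration, the generated pair might look roughly like this (a sketch based on the names used in this thread, not the actual generated code):

class Security
{
public:
    Security(ConfigDB::Database& db);
    bool getApiSecured() const; // read-only: no setters or commit here

    class Updater; // writeable view
    Updater beginUpdate();      // invalid if the store is already locked
};

class Security::Updater
{
public:
    explicit operator bool() const; // false if beginUpdate failed
    void setApiSecured(bool value);
    ConfigDB::Status commit();      // also called from the destructor
};
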
pljakobs commented 3 months ago

So wr would become a writeable instance of sec? This seems very clean and mostly intuitive. (I would, in documentation, probably suggest a naming scheme such as sec/sec_w, just because it seems a good idea to keep track - alternatively, always using something like wr would make sure there's a compiler error if it's reused in an overly large scope.)

The downside is that, rather than just a single variable assignment when working with a global config variable, this requires eight lines and an explicit commit.

On the error reporting: Why would you return the IFS::Error::Denied only upon the commit? That should be available on beginUpdate()?

Is there a way to have the class auto-commit in the destructor (so when the scope closes) and still get the error somehow?

Thinking about app error handling - beginUpdate() will fail when there is already an existing writeable instance, right? If the app does not just want to fail but recover, it would back off for a few ms (probably a few 100ms from the testing we did) and retry. Upon commit, an error would probably mean that writing to storage has failed, so there's not a whole lot an application can do but fail. That's where the class could essentially just throw an exception - i.e. call the callback mentioned above.

Could both of those be moved into the class? I know you said you want to never stall the application, but maybe we can make this optional? If we pass parameters to beginUpdate that reflect a backoff time and a retry count, the default behaviour could be to just come back with an error, but when I call sec.beginUpdate(150, 2) I would deliberately be willing to wait up to 300ms. commit() could again be implicit in the destructor, and since I don't see a recoverable error there, it could just call a function that reports the error in some way.

database.setCommitCallback(&CommitCallback); // this would be a global setting

{
  BasicConfig::Root::Security sec(database);
  {
    auto wr = sec.beginUpdate(100,2);
    wr.setApiSecured(false);
  }
  ... // more code, possibly reading from sec; the write is already done and committed by wr's destructor
}

void CommitCallback(String store, ConfigDB::Status status)
{
  Serial << "error: cannot commit to store " << store << " error " << String(status) << endl;
}

What I don't know is: if this one asynchronous thread hangs - does that lock up other parts of Sming?
Also, if this is doable, we would also need a callback for when beginUpdate fails, as the way to report the error might be different between applications (I would send a ws update to the client, others might just print to console, etc.)

I don't like the hang either, but I do like the simplicity of the application code that it creates. If error handling has to happen centrally for every change or bunch of changes, that will potentially be quite a bit of repeated code and a lot of places to change code when the reporting method needs to change.

If we only use the CommitCallback and have beginUpdate() fail immediately (no retries), it would look like this:

database.setCommitCallback(&CommitCallback); // this would be a global setting

{
  BasicConfig::Root::Security sec(database);
  {
    if(auto wr = sec.beginUpdate()) {
      wr.setApiSecured(false);
    } else {
      // error handling for a beginUpdate failure here, if necessary / possible
      // - probably the store already has a writeable instance
    }
  } // wr goes out of scope here, destructor calls commit and, if necessary, CommitCallback 

  ... // more code, possibly reading from sec; the write is already done and committed by wr's destructor
}

void CommitCallback(String store, ConfigDB::Status status)
{
  Serial << "error: cannot commit to store " << store << " error " << String(status) << endl;
}

Hope this isn't too confusing.

mikee47 commented 3 months ago

So wr would become a writeable instance of sec? This seems very clean and mostly intuitive. (I would, in documentation, probably suggest a naming scheme such as sec/sec_w, just because it seems a good idea to keep track - alternatively, always using something like wr would make sure there's a compiler error if it's reused in an overly large scope.)

The downside is that, rather than just a single variable assignment when working with a global config variable, this requires eight lines and an explicit commit.

Yes, wr is a poor choice (it implies Writer, which is something else). Let's use update instead?

So, we use an Updater class definition which can be instantiated directly:

if(auto update = BasicConfig::Root::Security::Updater(database)) {
    update.setApiSecured(false);
    update.commit();
}

On the error reporting: Why would you return the IFS::Error::Denied only upon the commit? That should be available on beginUpdate()?

beginUpdate() returns either a valid or 'invalid' Updater object, so other than OOM it can fail for only one reason. It's possible to call commit on an invalid Updater, so it needs to return an appropriate error code, as well as reporting back any filing system error.

Is there a way to have the class auto-commit in the destructor (so when the scope closes) and still get the error somehow?

Yes, we can store a lastError value in the database; IFS objects do this - see IFS::FsBase class.

Interesting point though, do we even need commit? If so, do we need rollback? I'm thinking we can get rid of it?

Thinking about app error handling - beginUpdate() will fail when there is already an existing writeable instance, right? If the app does not just want to fail but recover, it would back off for a few ms (probably a few 100ms from the testing we did) and retry.

Yes, but it would need to do this with a timer or via the task queue. That raises the question of where to put the data in the meantime! In this scenario, someone else is updating the store so when the application retries it will need to decide whether it's still appropriate to use the data it originally intended to write or, more likely, it needs to re-check the config to make a decision about how to proceed. i.e. who is the winner here?

Upon commit, an error would probably mean that writing to storage has failed, so there's not a whole lot an application can do but fail. That's where the class could essentially just throw an exception - i.e. call the callback mentioned above.

Or it could save the data to a backup database on a different partition whilst it then goes ahead and repairs/rebuilds the filing system/database, etc.

Could both of those be moved into the class? I know you said you want to never stall the application, but maybe we can make this optional? If we pass parameters to beginUpdate that reflect a backoff time and a retry count, the default behaviour could be to just come back with an error, but when I call sec.beginUpdate(150, 2) I would deliberately be willing to wait up to 300ms. commit() could again be implicit in the destructor, and since I don't see a recoverable error there, it could just call a function that reports the error in some way.

A beginUpdate failure is going to happen when a Writer stream object is receiving data from a remote web client. This happens in chunks and can take any number of iterations of the main task loop. So stalling won't accomplish anything. If Sming had a yield capability then this might work, but it doesn't; that route is a whole world of hurt as you either get into recursion or multithreading, multiple stacks, etc, etc.

... What I don't know is: if this one asynchronous thread hangs - does that lock up other parts of Sming?

Sming doesn't have threads. https://sming.readthedocs.io/en/latest/information/multitasking.html#co-operative-multitasking.

How about this:

database.setCommitCallback(CommitCallback);

BasicConfig::Root::Security sec(database);
sec.update(200, [](auto& obj) {
    if(obj) {
        obj.setApiSecured(false);
    } else {
        // Timeout
    }
});

void CommitCallback(String store, ConfigDB::Status status)
{
  Serial << "error: cannot commit to store " << store << " error " << String(status) << endl;
}

So sec.update takes a timeout and a callback (in this case a lambda) which does the actual updating. The store queues the request and invokes the callback when possible. The timeout discards the queued request.

I'm not sure a timeout is even required in this situation. A 'stall' would be where the update request remains in the queue indefinitely, and we'd verify that can't happen via design/testing.

Also, using a capturing lambda allows you to pass in all the values to be updated.
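
For example, values computed beforehand can be captured into the callback (a sketch using the API proposed above; calculateSecurity is a hypothetical helper):

bool secured = calculateSecurity(); // hypothetical application logic
sec.update(200, [secured](auto& obj) {
    if(obj) {
        obj.setApiSecured(secured); // captured value applied inside the update
    }
});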

pljakobs commented 3 months ago

beginUpdate() returns either a valid or 'invalid' Updater object, so other than OOM it can fail for only one reason. It's possible to call commit on an invalid Updater, so it needs to return an appropriate error code, as well as reporting back any filing system error.

Did you not plan to use a unique_ptr for the Updater object? Would it then not fail if that pointer has already been allocated?

Is there a way to have the class auto-commit in the destructor (so when the scope closes) and still get the error somehow?

Yes, we can store a lastError value in the database; IFS objects do this - see IFS::FsBase class.

Interesting point though, do we even need commit? If so, do we need rollback? I'm thinking we can get rid of it?

well, if we write back to storage in the destructor, that would be an implicit commit.

Yes, but it would need to do this with a timer or via the task queue. That raises the question of where to put the data in the meantime! In this scenario, someone else is updating the store so when the application retries it will need to decide whether it's still appropriate to use the data it originally intended to write or, more likely, it needs to re-check the config to make a decision about how to proceed. i.e. who is the winner here?

One thing that I was pondering was a log - maybe a vector belonging to the database itself or to the updater object; every update is written into that vector with the current time, and commit takes those updates and processes them in order of time.

I guess this would be a major redesign of the way the classes work, though, and I don't really understand the implications. Also: I believe we're really into corner cases here - conflicting updates to the same property... yeah, possible, but really rare. I would still say: as long as consistency is kept, I can live with an update maybe failing (if I get the message that it has failed and can handle it in my app).

Or it could save the data to a backup database on a different partition whilst it then goes ahead and repairs/rebuilds the filing system/database, etc.

puh, that gets very sophisticated.

A beginUpdate failure is going to happen when a Writer stream object is receiving data from a remote web client. This happens in chunks and can take any number of iterations of the main task loop. So stalling won't accomplish anything. If Sming had a yield capability then this might work, but it doesn't; that route is a whole world of hurt as you either get into recursion or multithreading, multiple stacks, etc, etc.

... What I don't know is: if this one asynchronous thread hangs - does that lock up other parts of Sming?

Sming doesn't have threads. https://sming.readthedocs.io/en/latest/information/multitasking.html#co-operative-multitasking.

I was aware of that, but honestly I'm not aware of how it handles asynchronous events internally. The last time I worked on a co-operative multitasking system was on NetWare 4 :-) But it does make sense here.

How about this:

database.setCommitCallback(CommitCallback);

BasicConfig::Root::Security sec(database);
sec.update(200, [](auto& obj) {
    if(obj) {
        obj.setApiSecured(false);
    } else {
        // Timeout
    }
});

void CommitCallback(String store, ConfigDB::Status status)
{
  Serial << "error: cannot commit to store " << store << " error " << String(status) << endl;
}

So sec.update takes a timeout and a callback (in this case a lambda) which does the actual updating. The store queues the request and invokes the callback when possible. The timeout discards the queued request.

I'm not sure a timeout is even required in this situation. A 'stall' would be where the update request remains in the queue indefinitely, and we'd verify that can't happen via design/testing.

Also, using a capturing lambda allows you to pass in all the values to be updated.

I did think about lambdas - that's how you'd do this in something like JavaScript - and I think it's a pretty elegant way to get around this. It would also allow one either to be liberal about the failure or to implement a proper retry.

Only thing: I don't think I'm yet very clear on what the failure cases are when creating the Updater object

When it's OOM, there's not a whole lot I can do but fail (or rather, in debug, make sure I have enough memory); when it's another Updater, I can set a timer to retry.

mikee47 commented 3 months ago

beginUpdate() returns either a valid or 'invalid' Updater object, so other than OOM it can fail for only one reason. It's possible to call commit on an invalid Updater, so it needs to return an appropriate error code, as well as reporting back any filing system error.

did you not plan to use a unique_ptr for the Updater object, thus would it not fail if that pointer has already been allocated?

Nope, got rid of those. An Object is just a reference, created on the stack. Only the store is allocated on the heap (tracked via shared_ptr).

Is there a way to have the class auto-commit in the destructor (so when the scope closes) and still get the error somehow?

Yes, we can store a lastError value in the database; IFS objects do this - see IFS::FsBase class. Interesting point though, do we even need commit? If so, do we need rollback? I'm thinking we can get rid of it?

well, if we write back to storage in the destructor, that would be an implicit commit.

I agree. I think we'll keep commit though as the return value may be of interest, but we'll make the call optional.

A beginUpdate failure is going to happen when a Writer stream object is receiving data from a remote web client. This happens in chunks and can take any number of iterations of the main task loop. So stalling won't accomplish anything. If Sming had a yield capability then this might work, but it doesn't; that route is a whole world of hurt as you either get into recursion or multithreading, multiple stacks, etc, etc.

... What I don't know is: if this one asynchronous thread hangs - does that lock up other parts of Sming?

Sming doesn't have threads. https://sming.readthedocs.io/en/latest/information/multitasking.html#co-operative-multitasking.

I was aware of that, but honestly I'm not aware of how it handles asynchronous events internally. The last time I worked on a co-operative multitasking system was on NetWare 4 :-) But it does make sense here.

It's pretty simple. Sming has one queue for timers and another for tasks. The main loop just services those in order. If we can't take action immediately, we create an object or something, stick it on one of the queues and finish. A key feature is that when something (a 'task') gets pulled off the queue it's run with a clean stack: there's no nesting or reentrancy to bite us.
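
For example, a deferred retry could sit on that task queue (a sketch; tryUpdate is a hypothetical application function, System.queueCallback is the standard Sming call):

void tryUpdate()
{
    BasicConfig::Root::Security sec(database);
    if(auto upd = sec.beginUpdate()) {
        upd.setApiSecured(false);
    } else {
        // Store is busy: queue another attempt, which runs later from the
        // main loop with a clean stack (real code would bound the retries)
        System.queueCallback(tryUpdate);
    }
}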

Only thing: I don't think I'm yet very clear on what the failure cases are when creating the Updater object

* out of memory
* another Updater is already being used

When it's OOM, there's not a whole lot I can do but fail (or rather, in debug, make sure I have enough memory); when it's another Updater, I can set a timer to retry.

pljakobs commented 3 months ago
* Update in progress: Store is 'locked' so we can't get write access to it. `set` and `commit` methods check for this.

Understood. My thought is: why not check this upon creating the updater? That would make the error occur earlier and, I feel, be easier to handle, no?

mikee47 commented 3 months ago
* Update in progress: Store is 'locked' so we can't get write access to it. `set` and `commit` methods check for this.

Understood. My thought is: why not check this upon creating the updater? That would make the error occur earlier and, I feel, be easier to handle, no?

Yes of course. But it doesn't stop someone from doing this:

BasicConfig::Root::Security::Updater sec(database);
sec.setApiSecured(false);
sec.commit();

So where the store is locked, sec is invalid: setXX does nothing and commit fails with 'denied'. All I'm saying here is that the code will handle this gracefully rather than crashing.
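
Internally, that graceful handling might look something like this (a sketch; the validity check and member names are assumed):

void Security::Updater::setApiSecured(bool value)
{
    if(!*this) { // store was locked when this Updater was created
        debug_e("ConfigDB: store locked, update ignored");
        return; // debug error message only; 'silent failure' in release builds
    }
    // ... apply the change ...
}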

pljakobs commented 3 months ago

Ah, now I got it! I didn't think anyone would fail to check for success :-)

pljakobs commented 3 months ago

Currently trying to build this for the Esp8266, I get this linker error:

/opt/Sming/Sming/esp-quick-toolchain/xtensa-lx106-elf/bin/../lib/gcc/xtensa-lx106-elf/10.3.0/../../../../xtensa-lx106-elf/bin/ld: /opt/Sming/Sming/out/Esp8266/debug/lib/clib-heap-e92ce383a1d93c576825dc47f463e4fe.a(alloc.o): in function `_ZdaPvj':
alloc.cpp:(.text._ZdaPvj+0x4): multiple definition of `_ZdaPvj'; /opt/Sming/Sming/out/Esp8266/debug/lib/libstdc++.a(del_opvs.o):/workdir/repo/gcc-gnu/libstdc++-v3/libsupc++/del_opvs.cc:33: first defined here
collect2: error: ld returned 1 exit status

Host builds ok

I'm not using the custom heap:

$ make list-config|grep -i heap
- ENABLE_CUSTOM_HEAP = 0

I have done a make dist-clean and make submodules-clean

Seems to be something to do with deleting arrays? (_ZdaPvj demangles to operator delete[](void*, unsigned int).)

[edit] Just tested it for Esp32 (and re-installed esp-idf due to some esp32 python version issues) - works; this seems to be Esp8266-only. I have also set ENABLE_MALLOC_COUNT to 0.

mikee47 commented 3 months ago

Just got this error myself! I've submitted a PR to Sming for this one.

mikee47 commented 3 months ago

I've pushed some changes as discussed above. I wonder if instead of beginUpdate we just use update:

/*
 * Get an updater object
 * Return value should be checked before proceeding as another update may be in progress.
 */
Updater update();

We can then add an asynchronous overload (haven't looked at this yet):

/*
 * Update is performed within provided callback.
 * If possible, that happens immediately, otherwise it's queued.
 */
bool update(Callback myCallback);

We could have an optional timeout:

bool update(Callback myCallback, unsigned timeout = 0);

But my feeling is this is probably unnecessary and complicates things quite a bit. Can always look at adding it later if needed?
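
For comparison, usage of the two forms might look like this (a sketch; the overloads above are still a proposal):

// Synchronous: returns an invalid Updater if another update is in progress
if(auto upd = sec.update()) {
    upd.setApiSecured(false);
}

// Asynchronous: the callback runs immediately if possible, otherwise it's queued
sec.update([](auto& upd) {
    upd.setApiSecured(false);
});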

pljakobs commented 3 months ago

The question is: how to handle an "update in progress" situation. In the application, for every update in an asynchronous path, I'll technically have to check whether I can get an updater object. If yes, great; if no, I'll have to back off, try again and hope for the best. Doing this right might take quite a bit of repetitive code in the application. (Although, as we've said, programmatic updates to the configuration should be rare; in most cases it should be a stream. By the way, how will we handle that for streams? After all, the same situation can arise, and in an ideal world the app would provide the correct response code to the client, telling it to come back later.)

The callback is clumsy with true callbacks; with lambdas it becomes rather elegant, although I guess many recreational programmers (like me) aren't very familiar with lambdas, creating a bit of a barrier to using it - that could easily be overcome with some good example code, I guess. Also, the callback version would make it difficult for the application to understand whether the change was successful or not, again making it difficult to notify a front-end app.

Not sure which is the ideal route, but a synchronous set of functions is certainly less work on the implementation side and it shouldn't be such a frequent pattern to make it a real issue in code.

pljakobs commented 3 months ago

Am I mistaken, or are there currently only Updaters per Store? So there's no way to stream an update for the whole database? (I was just trying to test http stream in/out.)

mikee47 commented 3 months ago

The write operation would use a POST request, I guess, with the body containing the JSON data. What I'm unclear on is the best way to set the 'body' stream for an incoming request. I suspect it involves creating a custom HttpResource class or plugin as shown in the HttpServer_Plugins sample.

We can write something very crude using the code from the Basic_WebSkeletonApp sample:

int onConfiguration(HttpServerConnection& connection, HttpRequest& request, HttpResponse& response)
{
    if(request.method == HTTP_GET) {
        // We have this in the BasicConfig sample
        auto stream = ConfigDB::Json::reader.createStream(database);
        auto mimeType = stream->getMimeType();
        response.sendDataStream(stream.release(), mimeType);
        return 0;
    }

    if(request.method != HTTP_POST) {
        response.code = HTTP_STATUS_BAD_REQUEST;
        return 0;
    }

    debug_i("Update config");

    auto body = request.getBodyStream();
    if(body == nullptr) {
        debug_e("NULL bodyBuf");
        return 0;
    }

//    auto stream = ConfigDB::Json::writer.createStream(database);
//    stream->copyFrom(*body);
    ConfigDB::Json::writer.loadFromStream(database, *body);
    return 0;
}

This is sub-optimal because the function only gets called after the entire JSON content is read into a MemoryStream (body). We do have the option to return some error status in the response though, so if there's a locking error that can be reported to the client and it can retry later.

pljakobs commented 3 months ago

hmmm... so in the current version, I use Json::deserialize(doc, request.getBodyStream()) - for which, as far as I understand, ArduinoJson deserializes directly from the stream. Could we not copy from the http bodyStream directly to a ConfigDB writer stream?

in my simple mind, it looks a bit like this:

auto stream = request.getBodyStream();
auto success = updater.loadFromStream(database, *stream);

mikee47 commented 3 months ago

Found it, we use setBodyParser with a callback which processes each chunk of POST data as it comes in. Have updated the BasicConfig sample to demonstrate.
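
The wiring looks roughly like this (a sketch, assuming Sming's body-parser conventions where the callback receives length == PARSE_DATASTART before the first chunk and PARSE_DATAEND after the last; importStream is a hypothetical holder for state between chunks - see the updated sample for the real code):

std::unique_ptr<ReadWriteStream> importStream; // exact stream type is assumed

size_t configBodyParser(HttpRequest& request, const char* at, int length)
{
    if(length == PARSE_DATASTART) {
        importStream = ConfigDB::Json::writer.createStream(database);
        return 0;
    }
    if(length == PARSE_DATAEND) {
        importStream.reset(); // import completes when the stream is destroyed
        return 0;
    }
    // Feed each chunk of POST data into the import stream
    return importStream ? importStream->write(reinterpret_cast<const uint8_t*>(at), length) : 0;
}

// Registered during web server setup:
// server.setBodyParser(MIME_JSON, configBodyParser);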

pljakobs commented 3 months ago

I feel like saying "I'm amazed" is getting a bit repetitive :-)

mikee47 commented 3 months ago

The sample is still a bit simplistic since in practice you'd probably want to check request URL path, add request parameters to authenticate the update, determine whether it's a full database update or for a specific JSONPath, etc.

mikee47 commented 3 months ago

Could we not copy from http bodyStream directly to ConfigDB writer stream?

Yes, of course! I'd missed that here, no need to create a stream. Hopefully you can see what I was getting at though? I'm thinking the JSON for an entire database could be quite large so keeping it out of RAM is preferable.

pljakobs commented 3 months ago

The sample is still a bit simplistic since in practice you'd probably want to check request URL path, add request parameters to authenticate the update, determine whether it's a full database update or for a specific JSONPath, etc.

So at least my frontend always creates a full json document, albeit sparsely populated. So when I change the ip-address, as an example, it would be {"network":{"connection":{"ip":"aa.bb.cc.dd"}}}

Now, I remember you writing somewhere that currently you drop the whole store before reading in a new one, so this would end up with the store being wiped and only the sparse value being set? If so, can I suggest making that an optional behaviour? Maybe a "DELETE_UNPOPULATED" flag upon ingesting the stream?

Yes, of course! I'd missed that here, no need to create a stream. Hopefully you can see what I was getting at though? I'm thinking the JSON for an entire database could be quite large so keeping it out of RAM is preferable.

but isn't that just what the stream model would do?

mikee47 commented 3 months ago

What about arrays? Overwrite? Append? Insert? Delete?

pljakobs commented 3 months ago

What about arrays? Overwrite? Append? Insert? Delete?

ha! I would think there are use cases for each.

There are pitfalls everywhere, it seems!

mikee47 commented 3 months ago

Not for this PR, or probably the initial release, but I wonder if JSONPath, JSONP or something else offers a JSON structured query language... an SQL INSERT, UPDATE, DELETE sort of thing...

pljakobs commented 3 months ago

Is this maybe something we could annotate in the schema? At least for the indexed list, an annotation on the array, maybe like:

"channels": {
          "type": "array",
          "order": "indexed",
          "items": {
            "type": "object",
            "properties": {
              "pin": {
                "type": "integer",
                "default": 13,
                "minimum": 0,
                "maximum": 100,
                "is_index":true
              },
              "name": {
                "type": "string",
                "default": "red"
              },
              "details": {
                "type": "object",
                "properties": {
                  "purpose": {
                    "type": "string",
                    "default": "general"
                  },
                  "current-limit": {
                    "type": "integer",
                    "default": 15,
                    "format": "ms"
                  },
                  "garbage": {
                    "type": "string",
                    "default": "This is potentially\na problem\nstring /* with stuff inside\b\b it */"
                  }
                }
              },
              "notes": {
                "type": "array",
                "items": {
                  "type": "string"
                }
              },
              "values": {
                "type": "array",
                "order":"append",
                "items": {
                  "type": "integer",
                  "minimum": 0,
                  "maximum": 255
                }
              }
            }
          }
        },

with order being either indexed, append or replace? For indexed, the next-level object either is the index or has to have a boolean "is_index": true. If the index already exists, it is replaced; if it doesn't exist, the object is added.

or maybe, rather than having the property flagged as index, have a field "index":"pin"


"channels": {
          "type": "array",
          "order": "indexed",
          "items": {
            "type": "object",
            "index":"pin",
            "properties": {
              "pin": {
                "type": "integer",
                "default": 13,
                "minimum": 0,
                "maximum": 100,
                },

mikee47 commented 3 months ago

I think this deserves a new issue to discuss as it's heading off-topic here. There's some commented-out code in the sample so I'll tidy that up, review the docs, etc. and get this PR merged, if you're happy with the coding API as-is?

mikee47 commented 3 months ago

Looking at the sample-config.json and inspecting the code I'd like to make some observations:

* From `APPLedCtrl::parsePinConfigString`, _pin_config_ must contain exactly 5 pin numbers, in order, for red, green, blue, warmwhite and coldwhite
* `APPLedCtrl::init` notes that _general.channels_ is preferred over the 'old pin config' _channels.pin_config_

So is pin_config actually obsolete? If so, that just leaves general.channels which could be more accurately represented as a simple object:

"channels": {
    "red": 13,
    "green": 12,
    "blue": 14,
    "warmwhite": 5,
    "coldwhite: 4
}

That avoids the complexity of indexing related data items.

pljakobs commented 3 months ago

Looking at the sample-config.json and inspecting the code I'd like to make some observations:

* From `APPLedCtrl::parsePinConfigString`, _pin_config_ must contain exactly 5 pin numbers, in order, for red, green, blue, warmwhite and coldwhite

* `APPLedCtrl::init` notes that _general.channels_ is preferred over the 'old pin config' _channels.pin_config_

So is pin_config actually obsolete? If so, that just leaves general.channels which could be more accurately represented as a simple object:

"channels": {
    "red": 13,
    "green": 12,
    "blue": 14,
    "warmwhite": 5,
    "coldwhite: 4
}

That avoids the complexity of indexing related data items.

That's correct. The longer-term idea here is to have a more flexible way to deal with different light configurations. Currently, the color compute knows three models: RGB, RGBW and RGBWWCW. On the larger Esp32s, the system has up to 15 pins, so I'm thinking I want to build something like "virtual lights", so you can configure a light to just use three pins and be an RGB, and two other lights to be W only - hence the change in naming and number of configured pins.

Technically, the controller (the chip itself) defines how many available pins there are, so it could be a static object defined by the controller, but I also think that being able to name channels would be helpful.

mikee47 commented 3 months ago

Currently, the color compute knows three models: RGB, RGBW and RGBWWCW. On the larger Esp32s, the system has up to 15 pins, so I'm thinking I want to build something like "virtual lights", so you can configure a light to just use three pins and be an RGB, and two other lights to be W only - hence the change in naming and number of configured pins. Technically, the controller (the chip itself) defines how many available pins there are, so it could be a static object defined by the controller, but I also think that being able to name channels would be helpful.

So perhaps these three model variants have their own object types. JSON Schema defines oneOf and anyOf (https://json-schema.org/understanding-json-schema/reference/combining) which could support this behaviour. How it would look in code I'm not sure, though... if this is of interest, perhaps a new issue to discuss?

pljakobs commented 3 months ago

So perhaps these three model variants have their own object types. JSON Schema defines oneOf and anyOf (https://json-schema.org/understanding-json-schema/reference/combining) which could support this behaviour. How it would look in code I'm not sure, though... if this is of interest, perhaps a new issue to discuss?

Hmm... so other than pin naming, there is no code for this yet, but the idea is to have an API that can create multiple lights on the same controller, so that I could end up with:

{
  "lights": [
    {
      "name": "light1",
      "model": "RGB",
      "pins": [3, 5, 11]
    },
    {
      "name": "light2",
      "model": "white",
      "pins": [2]
    },
    ...
  ]
}

So, thinking about it, pin names might not even be needed for that, but a flexible number of pins (as said, up to 15) would, with the controller only providing a MAX_PINS value. But the model above would still be easier if there was a way to "edit" an existing array (of lights, in this case). I like the suggestion of lights["light1"] on the issue. The simple solution is to always send the full config structure, which is okay but, as you said, RAM intensive. Also, this feature is in the future, so there's no need to implement it right now (for me). I do believe it's useful, though.

On the model variants... hmm... in that case, the schema would have pre-defined objects for the different light models, defining different pin layouts. I guess those objects would be distinguishable by having the correct "model" property, and thus the difference could be handled well in code. It would actually have the advantage of basically being correctly "typed" for the model - if it's a light of model "RGB" it would have a pins object "pins":{"red":<int>,"green":<int>,"blue":<int>}, whereas a model of "WHITE" would only have "pins":{"white":<int>} - I could get used to that. But: this is only my use case right now.

mikee47 commented 3 months ago

hmmm... so in the current version, I use Json::deserialize(doc, request.getBodyStream()) - for which, as far as I understand, ArduinoJson deserializes directly from the stream. Could we not copy from the http bodyStream directly to a ConfigDB writer stream?

in my simple mind, it looks a bit like this:

auto stream = request.getBodyStream();
auto success = updater.loadFromStream(database, *stream);

Exactly so, but an updater has no idea about storage formats so the call is actually on a Writer instance:

auto stream = request.getBodyStream();
auto success = ConfigDB::Json::writer.loadFromStream(database, *stream);

Notes:

Am I mistaken, or are there currently only Updaters per Store? So there's no way to stream an update for the whole database? (I was just trying to test http stream in/out.)

So yes we do support streaming updates via the stream object returned from Writer::createStream, with corresponding methods in Reader instances. These support an asynchronous/chunked mode of operation which conserves memory.

Writing to files or MemoryDataStream instances is done via synchronous Writer methods:

bool loadFromStream(Store& store, Stream& source);
bool loadFromStream(Database& database, Stream& source);

Note that the signatures are different from a Reader:

size_t saveToStream(const Object& object, Print& output) const;
size_t saveToStream(Database& database, Print& output); // NB Should be `const`

Because Store has an Object interface, it's more generic. I need to fix that for Writer (separate PR, though).

pljakobs commented 3 months ago

Small thing: I find loadFromStream and saveToStream a bit confusing. I think it might be clearer if we could use importFromStream and exportToStream - a small difference, but the actual direction of data flow would be more obvious, at least to me.

mikee47 commented 3 months ago

Yes, I agree the naming isn't right, though readFromStream and writeToStream would be more consistent. Unless we also rename Reader to Importer and Writer to Exporter. Or even serialize and deserialize as in ArduinoJson.

mikee47 commented 3 months ago

The choice of naming is indicated by Reader (read from database/store/object) and Writer (write to database/store/object). Yes, technically what they're doing is serialization and de-serialization, but that word irritates me because it's verbose and subject to typos (as an English speaker I would use serialise).

mikee47 commented 3 months ago

Maybe it would be clearer if we could keep application code away from Reader/Writer entirely and instead use methods of Database, Store and Object. For example:

database.importFrom("filename.json", ConfigDB::json);
database.exportTo("filename.json", ConfigDB::json);
database.createImportStream(ConfigDB::json);
database.createExportStream(ConfigDB::json);

Here the user just indicates the desired serialization/deserialization format and the appropriate reader/write is selected.

pljakobs commented 3 months ago

This looks very clean indeed. And since in most cases this will be a database-level operation anyway, it makes total sense. Are there situations where one would only want to stream a single store? I'm not sure. Would that still be possible?

mikee47 commented 3 months ago

Are there situations, where one would only want to stream a single store? I'm not sure. Would that still be possible?

Yes, as with Writer there are a set of methods for Database and another set for Object (which includes Store).
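
For instance, under the naming proposed earlier (a sketch; the method names are still provisional):

BasicConfig::Root::Security sec(database);
sec.exportTo("security.json", ConfigDB::json);        // single object/store
auto stream = sec.createExportStream(ConfigDB::json); // streaming equivalent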