bblanchon / ArduinoJson

📟 JSON library for Arduino and embedded C++. Simple and efficient.
https://arduinojson.org
MIT License
6.71k stars 1.12k forks source link

question: Buffer usage and persistence #530

Closed sticilface closed 7 years ago

sticilface commented 7 years ago

I'm in the process of designing a wrapper class around AruinoJson that allows you to pass jsonobjects between functions, return jsonobjects, with little additional memory. I've done this using copy and move semantics, and its not working too badly. I know the buffers are throwaway, and they are here to, but i wanted a class that and be copied which maintains the root, and underlying buffer.

My question is this: (platform esp8266, arduino IDE)

take this example

  DynamicJsonBuffer buffer; 
  char jsonStr2[] =  "{\"sensor\":\"gps\",\"time\":1351824120,\"data\":[48.756080,2.302038]}";
  JsonObject & root = buffer.parseObject(jsonStr2);

my guess is that arduinjson uses pointers to the string and does not duplicate the memory.

now if i copy the elements from one object to the next

  DynamicJsonBuffer buffer2; 
  JsonObject & root2 = buffer.createObject();
  for (auto obj : root2) {
  root2[obj.key] = obj.value.as<JsonVarient>(); 
 }

Does this new buffer/object point to the original source jsonStr2 or the buffer...

And the big question.. what happens when these Strings / steams that are duplicated into the Json buffer.

using this code

void setup() {
  // put your setup code here, to run once:
  Serial.begin(115200);
  SPIFFS.begin();
  Serial.println();

DynamicJsonBuffer buffer; 
JsonObject & root = buffer.createObject();

  {
  DynamicJsonBuffer buffer2; 
  File f = SPIFFS.open("/espman/settings.json", "r"); 
  JsonObject & root2 = buffer2.parseObject(f); 
  for (auto obj : root2) {
    root[obj.key] = obj.value.as<JsonVariant>(); 
  }

  Serial.printf(" buffer = %u\n", buffer.size()); 
  Serial.printf(" buffer2 = %u\n", buffer2.size()); 
  }

  //  now original buffer has gone out of scope..  
  Serial.println("\n\n RESULT:"); 
  root.prettyPrintTo(Serial);
  Serial.println(); 

}

gives output

 buffer = 56
 buffer2 = 700

I'd expect both json buffer2 which holds the original stream parse copy to be large, and 700 bytes is aprox correct. your assistant says 670 bytes. but what I'm confused about is if i copy those keys into root, which is held in buffer not buffer2, id expect buffer to also be 700 bytes but yet the size of buffer is only 56. this implies that it is still referencing the data in buffer2, which should have gone out of scope?

Does any of this make any sense? Are your json buffers being clever, and not destructing until they are not being referenced by anything?

bblanchon commented 7 years ago

Hi @sticilface,

  1. When you copy one JsonVariant to another, they both point to the same string. (this is not a conscious decision, we could change this if it make sense). As, this is not what you want, you can test the type of the JsonVariant and call JsonBuffer::strdup() if it's a string.
  2. There are no reference counting in the JsonBuffer, they are destructed as per classic C++ rules.

BTW, thank you very much for sharing your thoughts on this subject. I start to believe that users want to be able to copy and move JsonObject and I wonder how I could implement this properly. I wish I could implement such an API:

// keep this when you don't want a copy
JsonObject& obj = jb.parseObject(json);

// but you could get a copy like this
JsonObject obj = jb.parseObject(json);

Now two problems appears:

  1. What is the JsonBuffer for the copied object? Maybe we could have an implicit/hidden DynamicJsonBuffer. (it cannot be a StaticJsonBuffer as the size if not known at compile time)
  2. How many users will make accidental copies?

What do you think? Does this matches your use case?

devyte commented 7 years ago

Do you keep a reference or pointer to the jsonbuffer inside each obj/arr? If so, I suggest cresting the obj/arr first from the jsonbuffer, then assigning. If the obj hasn't been created yet, fail the assignment.

If you want to make it more explicit, you could add a clone() or dup() method.

On Jun 18, 2017 8:32 AM, "Benoît Blanchon" notifications@github.com wrote:

Hi @sticilface https://github.com/sticilface,

  1. When you copy one JsonVariant to another, they both point to the same string. (this is not a conscious decision, we could change this if it make sense). As, this is not what you want, you can test the type of the JsonVariant and call JsonBuffer::strdup() if it's a string.
  2. There are no reference counting in the JsonBuffer, they are destructed as per classic C++ rules.

BTW, thank you very much for sharing your thoughts on this subject. I start to believe that users want to be able to copy and move JsonObject and I wonder how I could implement this properly. I wish I could implement such an API:

// keep this when you don't want a copy JsonObject& obj = jb.parseObject(json); // but you could get a copy like this JsonObject obj = jb.parseObject(json);

Now two problems appears:

  1. What is the JsonBuffer for the copied object? Maybe we could have an implicit/hidden DynamicJsonBuffer. (it cannot be a StaticJsonBuffer as the size if not known at compile time)
  2. How many users will make accidental copies?

What do you think? Does this matches your use case?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/bblanchon/ArduinoJson/issues/530#issuecomment-309274979, or mute the thread https://github.com/notifications/unsubscribe-auth/AQC6BkndLlKKvLWBNSBlPCCIOu6RpO8Iks5sFRjngaJpZM4N9TSH .

sticilface commented 7 years ago

Thanks for the fast reply. The main issue is that there are two things you need, the buffer and the object, so that makes moving things around quite complicated. Your idea would work if there was a mechanism of accessing the underlying buffer from the object so you can call functions like size(), strdup() etc.

I'm mainly experimenting than anything else, but the way I've done it is based on these design principles.

1) objects should be copyable, movable and assignable. 2) objects should share a JSON buffer 3) objects copied and moved should remain identical and editable.
4) objects should be addable and sub tractable (+= and -= operator); += is done, others not. 5) There should be very little memory overhead for creating json objects and performing the above functions. 6) independent copies can be created by explicitly calling a copy() or clone() function .

eg.

char str[] =  "{\"sensor\":\"gps\",\"time\":1351824120,\"data\":[48.756080,2.302038]}";
JSON json; // first instance, default is a JsonObject as root
JSON json2(json); //  instance two same buffer.

json.parse(str);  //  now json2 is exactly the same thing. 

JSON json3(std::move(json)); //  now json has no buffer and json2 and json3 share the same buffer

JSON json4;
File f = SPIFFS.open("/file.json","r");
json4.parse(f);

json2 = json4; (now json2 and json3 are equal to json4); 

The way I've done this is to wrap the dynamicjson buffer in a class separate to the JSON object wrapper, its a bit like a shared_ptr implementation, but it keeps an std::forward_list of all the JSON parent objects. Now when you do anything to the top level JsonVarient it goes through each other JSON object sharing the same buffer and reassigns it. It might not be the cleanest method but it keeps all the JSON instances that share a buffer identical. With very little memory overhead.

The implementation is quite complex, so I'm wondering if you might be better to implement it as an addition to your lib, rather than building it in. although if you could that would be great.

back to the original question.

As a temporary fix duplicating all string would work. however, it would be more efficient if could be detected that the string is in fact stored in the buffer. Although to be safe it might be better to always assume that when copying. because a use can declare an instance of the class within a scope, where the const char * is. then the buffer is copied to a new JSON object, and that string goes out of scope invalidating the original json buffer.

interesting thoughts...

sticilface commented 7 years ago

Wonder if it is worth having a function for the jsonbuffer to always duplicate strings?

sticilface commented 7 years ago

I've actually figured out that it is not just that...

my function for copying only goes 1 level deep, I need to extract it into a function that recurses the whole json length, then applies the strdup to the key.

ESPmanagerinternals::JSON ESPmanagerinternals::JSON::copy()
{
    using namespace ESPmanagerinternals; 

    if (_root.is<JsonObject&>()) { 
        JSON json(false); 

        for (auto obj : _root.as<JsonObject&>())
        {
            if (obj.value.is<const char *>()) {
                json.get<JsonObject&>()[obj.key] = _buffer->get().strdup(obj.value.as<const char *>()); 
            } else {
                json.get<JsonObject&>()[obj.key] = obj.value; 
            }
        }
        return json; 

    } else {
        JSON json(true); 

        for (auto obj : _root.as<JsonArray&>())
        {
            if (obj.is<const char *>()) {
                json.get<JsonArray&>().add( _buffer->get().strdup(obj.as<const char *>())); 
            } else {
                json.get<JsonArray&>().add(obj.as<JsonVariant>()); 
            }

        }
        return json; 
    }

}
sticilface commented 7 years ago

ok... i've thought about this for a bit and I'm a bit stumped.

How can i traverse through a json object or array and copy it to another buffer exactly. My methods only iterate through the first level, which works when the buffer is the same, but when the buffer is different you have to iterate down to each individual key. whilst preserving the strings with strdup?

sticilface commented 7 years ago

thanks for the help. I made a issue so the solution would be easier to find. question... If the input source has to be copied into jsonbuffer, for example a stream, then the json keys will also need to be copied correct?

would you therefore need the following :

newObj[ jb.strdup(kvp.key)] = _clone(jb, kvp.value);

to duplicate the keys into the json buffer? this brings the size of the jsonbuffer up to near the size of the original parsed stream?

bblanchon commented 7 years ago

You're right, I fixed the Wiki.

On my side, I'm experimenting the idea of merging JsonBuffer and JsonObject. That would give something like:

// this would be the new buffer+object
StaticJsonObject<200> obj1;

// I would change the parse() function too:
int err = deserializeJson(obj1, input);

// you would still get reference in that case:
JsonObject& leaf = obj1["leaf"];

// but you could copy like this:
DynamicJsonObject obj2 = obj1;

What do you think? Would that be helpful?

devyte commented 7 years ago

@bblanchon When you do something like this:

DynamicJsonBuffer jsonBuffer;
JsonObject& root = jsonBuffer.createObject();
root["city"] = "Paris";
JsonObject& weather = root.createNestedObject("weather");
weather["temp"] = 14.2;
weather["cond"] = "cloudy";

Upon calling root.createNestedObject("weather"), I assume you have to allocate a new object, and the allocation happens inside the jsonBuffer. Doesn't this mean that you hold a reference to the jsonBuffer inside each jsonObject?

If that is true, then doing a copy/move from one json to another would require checking if the buffers are the same or different. If they are the same, then you can just copy the references over. If different, then you have to do a deep copy (aka clone).

If you don't hold a ref to the jsonBuffer inside the jsonObj, how do you do the allocates? from the heap?

merging JsonBuffer and JsonObject

I don't think this is a good idea. The buffer and the object/array are conceptually different, so shouldn't be merged.

sticilface commented 7 years ago

I'd have to get my head around it! The class i've made is a little special. it allows you to throw around JsonObjects (including the underlying buffer) with basically no cost. This addresses some of the issues people have mentioned. Even the assignment does not copy the whole buffer, it just creates a new reference, and the lifetime of the underlying buffer is extended to the two objects. Anything copied from, or to an object that shares the same buffer ends up being the same thing. I've done this by each buffer containing the pointer of all objects using it, then if one is changed it feeds back to all the other objects.

This is why i needed the clone function. The idea is to save memory but allow seamless use of the json objects. this is quite different to your principles on how arduinojson should be used, but affords me some flexibility in my design. I quite like my implementation as it carries very little overhead, I think each copy of a jsonobject and buffer is about 16 bytes.. with a move being neutral. If you start copying the buffers though that goes up to 100's bytes..

@devyte this is basically what i've done. All copies/moves do not duplicate the buffer but you can call a clone() function that returns a new instance with a new buffer which is totally independent. But as discussed above, the current jsonbuffer does not make a distinction between pointers to char arrays it holds that are static vs const char arrays that are not.. say parsed from a string or stream, or a local variable only intended to exist for the lifetime of that jsonbuffer. Therefore you have to strdup() everything.

sticilface commented 7 years ago

I really like the way arduinojson is implemented now. I wonder if a superclass is the way to go... so an additional layer to provide this functionality if your need it.

devyte commented 7 years ago

@sticilface I posted some code in #528 with a hierarchical iteration, where the key chain is maintained. It's not exactly what you need, but you should be able to tweak it to meet your needs.

devyte commented 7 years ago

@bblanchon if you keep a reference to a jsonBuffer inside a JsonObject (i.e.: the buffer from where the objet was originally allocated), then on clone/assignment/copy you can check whether the source's (right hand) is the same as the destination's (left hand) buffer. I see the following possibilities: If they are the same buffer, it is ok to not clone, but copy the references only. If they are different buffer, a clone should be done. If the destination doesn't have an assigned buffer, either error out, or create a new standalone buffer for it.

There is another design approach possible, which is the Allocator approach used by the STL. In this case, instantiation can take an Allocator as argument. I think it may be possible to port your implementation to comply with that approach without so much effort. Example in STL:

std::allocator<int> myalloc;
std::vector<int> myvec(myalloc); 

So you'd make your own allocator, which would be used internally by the objects, or something along those lines.

bblanchon commented 7 years ago

Hi guys,

Sorry for the late reply; it's been quite a crazy week.

the current jsonbuffer does not make a distinction between pointers to char arrays it holds that are static vs const char arrays that are not.. say parsed from a string or stream, or a local variable only intended to exist for the lifetime of that jsonbuffer

I realized that when implementing the clone() function. If ArduinoJson provides a way to duplicate JsonObject, it must make this distinction.

If they are the same buffer, it is ok to not clone, but copy the references only. If they are different buffer, a clone should be done.

That's exactly what I had in mind.

I thought about the STL way, and I think it's compatible with StaticJsonObject / DynamicJsonObject approach.

If the destination doesn't have an assigned buffer, either error out, or create a new standalone buffer for it.

I think this situation cannot happen if the constructor of JsonObject requires a JsonBuffer.

Regards, Benoit