dropbox / json11

A tiny JSON library for C++11.
MIT License
2.55k stars 616 forks source link

operator== could be faster by checking node identity #104

Closed rressi closed 7 years ago

rressi commented 7 years ago

Story

When comparing two nodes we can speed up use cases where we reuse node instances partially or totally:

Json my_json = Json::object {
    { "key1", "value1" },
    { "key2", false },
    { "key3", Json::array { 1, 2, 3 } },
};
auto my_other_json = my_json;
if (my_json == my_other_json)
{
    ...
}

Two nodes sharing a portion of nodes:

Json common = Json::object {
    { "key1", "value1" },
    { "key2", false },
    { "key3", Json::array { 1, 2, 3 } },
};
Json a = Son::object {
    { "common", common },
    { "extra value", "value1" }
}
Json b = Son::object {
    { "common", common },
    { "extra value", "value1" }
}
if (a == b) {
    ...
}

It may be useful to apply the same concept also for operator<, in such case would always return false:

bool Json::operator< (const Json &other) const {
    if (m_ptr == other.m_ptr)
        return false;
    if (m_ptr->type() != other.m_ptr->type())
        return m_ptr->type() < other.m_ptr->type();
    return m_ptr->less(other.m_ptr.get());
}

These little changes would open memory + CPU optimisations of this sort:

Json reuse_instances(Json new_node)
{
   static std::set<Json> known_values;

   auto it = used_values.find(new_node);
   if (it != used_values.end()) {
       return it->second;
   }

   // may recursively do this for member nodes of object and arrays
   ...

   known_values.insert(new_node);
   return node;
}

...

Json my_object = reuse_instances(Json(Json::object {
    { "key1", "value1" },
    { "key2", false },
    { "key3", Json::array { 1, 2, 3 } },
}));
smarx commented 7 years ago

Automated message from Dropbox CLA bot

@rressi, thanks for the pull request! It looks like you haven't yet signed the Dropbox CLA. Please sign it here.

smarx commented 7 years ago

Automated message from Dropbox CLA bot

@rressi, thanks for signing the CLA!

j4cbo commented 7 years ago

Looks great - thanks!

rressi commented 7 years ago

I think this operator could be improved with the following implementation:

bool Json::operator== (const Json &other) const {
   return m_ptr == other.m_ptr
              && m_ptr->type() == other.m_ptr->type()
              && m_ptr->equals(other.m_ptr.get());
}

In this way the compiler would do the best in any case.

By the way, pointer comparison also speed up when nodes are both NULL.

artwyman commented 7 years ago

m_ptr is always non-null, otherwise the dereferences here would crash.

I like your version for clarity, though I think it's equivalent in performance. The compiler can't reorder short-circuit operators any more than it can reorder ifs. And JsonValue is a polymorphic object, so the virtual calls can't be inlined in order to let the compiler start eliminating cases.