breese / trial.protocol

Network wire protocols
11 stars 4 forks source link

Compare current string-value with a string_view w/o allocating #41

Closed vinipsmaker closed 4 years ago

vinipsmaker commented 4 years ago

I have some loops like the following on my codebase:

for (;;) {
    // Key
    if (reader.symbol() == json::token::symbol::end_object) {
        reader.next();
        break;
    }

    assert(reader.symbol() == json::token::symbol::string);
    auto current_key = reader.literal();
    current_key.remove_prefix(1);
    current_key.remove_suffix(1);

    if (!reader.next())
        throw Error("bad object");

    // Value
    try {
        if (current_key == "TestReqID") {
            msg.header.req_id = reader.value<decltype(msg.header.req_id)>();
            read_req_id = true;
        } else if (current_key == "SendTime") {
            msg.send_time = reader.value<decltype(msg.send_time)>();
            read_send_time = true;
        }
        json::partial::skip(reader);
    } catch (const json::error&) {
        throw Error("bad object");
    }
}

It'd be nice if there was a built-in function to compare current string-value with a user-provided string_view. Doing so would spare a call to reader.value<std::string>() which is an allocating function.

breese commented 4 years ago

Do you want the user-provided string_view to contain the literal or converted value?

vinipsmaker commented 4 years ago

Do you want the user-provided string_view to contain the literal or converted value?

The converted/decoded value. My current literal-using code obviously does something bad and should not be encouraged. The built-in function should do the right thing and decode current value character by character while it compares with the user provided string.

breese commented 4 years ago

Using the converted value requires allocation.

vinipsmaker commented 4 years ago

But it's a specialized use. It's not delivering the value for the user to do as he wishes. It's only checking if it's equal to a specific string.

Decoding can be done char-by-char. As it decodes, it checks against the user-provided string. Buffering a single char (or a few, it doesn't matter) doesn't require allocation. The past decoded chars are forgotten as the comparison goes on.

breese commented 4 years ago

Such a change would be quite intrusive.

vinipsmaker commented 4 years ago

Such a change would be quite intrusive.

Do you mean implementation-wise or interface-wise? I'll take a look at the trial.protocol codebase later.

breese commented 4 years ago

Implementation-wise.

A better solution would be to convert the search string to JSON using json::writer during initialization and then compare this with the literal value. Long term we could even have a constexpr string-to-JSON conversion functions for this purpose.

vinipsmaker commented 4 years ago

A better solution would be to convert the search string to JSON using json::writer during initialization and then compare this with the literal value. Long term we could even have a constexpr string-to-JSON conversion functions for this purpose.

I'm not sure we're on the same page. I'll try to explain the problem again.

Suppose I have the following C++ type:

struct MyType
{
    int x;
    std::string y;
};

And the following JSON should be mapped to the C++ type:

{
    "x": 33,
    "y": "some string"
}

Now, when reader points to the "x" value, I only want to know if reader is pointing to "x" or pointing to "y". Currently, I compare the literals. This approach fails because there are multiple literals that can map to the same decoded value. For instance, the received JSON could be the following:

{
    "\u0078": 33,
    "\u0079": "some string"
}

When reader is pointing to "some string" I can use reader.value<std::string>() and live with that. There isn't much I can do. But when I already know the decoded value, I could use some function that does the comparison directly.

bool is_equals(const json::reader& reader, std::string_view search)
{
    auto value = reader.literal();
    value.remove_prefix(1);
    value.remove_suffix(1);
    for (;;) {
        if (value.size() == 0 && search.size() != 0)
            return false;
        auto c = decode_and_consume_char(value);
        if (c != search[0])
            return false;
    }
}

The pseudo-code is incomplete and it isn't handling the error cases properly, but should already give an idea of what I'm talking about.

A better solution would be to convert the search string to JSON using json::writer during initialization and then compare this with the literal value. Long term we could even have a constexpr string-to-JSON conversion functions for this purpose.

I don't follow. The literal value is harder to compare because there are multiple literal values that can map to a single decoded value.

breese commented 4 years ago

Ah, I understand.

breese commented 4 years ago

We discussed off-line to add a "string collector" function to json::reader which takes a template collector function. This can be used to either store the converted string in an std::string, a fixed sized array, or to compare against another string.

This has been committed in 0f1d2d84daa56658996ec4d7bc0185f7a4114f90. The decoder test suite shows a comparison example.

vinipsmaker commented 4 years ago

I'm busy with some stuff. Will have time to take a deeper look later.

vinipsmaker commented 4 years ago

I managed to use the string collector API successfully. Closing the issue.

Next, I'll take a look at #16. I already have a working prototype, but I have to fill a few holes before presenting it for discussion.