nanosai / rion-ops-java

RION Ops for Java is a toolkit for reading and writing RION encoded data. RION is a compact, fast binary data format.
43 stars 7 forks source link

readElementCount and writeElementCount not working with objects #4

Closed hvbtup closed 4 years ago

hvbtup commented 4 years ago

I tried to write an object like this

writer.writeObjectBeginPush(maxLengthLength);
Map<String, Object> uiProps = ...;
int numProperties = uiProps.size();
writer.writeElementCount(numProperties);
// writer.writeInt64(numProperties);
for (Entry<String, Object> property : uiProps.entrySet()) {
    String propertyName = property.getKey();
    byte[] propIdBytes = propertyName.getBytes("utf-8");
    writer.writeKeyOrKeyShort(propIdBytes, 0, propIdBytes.length);
    Object propVal = property.getValue();
    if (propVal == null) {
        writer.writeBooleanObj(null);
    } else if (...) ...
    } else {
        writer.writeUtf8(propVal.toString());
    }
}
writer.writeObjectEndPop();

And later on, to read it:

if (reader.fieldType != LionFieldTypes.OBJECT) {
    throw new IOException("Expected field type OBJECT but found "
                        + String.valueOf(reader.fieldType));
}
reader.moveInto();
reader.parse();
int numProperties = reader.readElementCount();
// int numProperties = (int) reader.readInt64();
Map<String, Object> propertyMap = new HashMap<>(numProperties + numProperties >> 1);
for (int propertyIdx = 0; propertyIdx < numProperties; propertyIdx++) {
    if (!reader.hasNext()) {
        throw new IOException("Expected more properties, but the data ends here");
    }
    reader.nextParse();
    String propertyName = null;
    switch (reader.fieldType) {
    case LionFieldTypes.KEY_SHORT:
        propertyName = reader.readKeyShortAsUtf8String();
        break;
    case LionFieldTypes.KEY:
        propertyName =  reader.readKeyAsUtf8String();
        break;
    default:
        throw new IOException("Expected KEY or KEY SHORT with propertyName, but found"
                + String.valueOf(reader.fieldType));
    }
    reader.nextParse();
    Object propertyVal = null;
    if (reader.fieldLengthLength != 0) {
        switch (reader.fieldType) {
        ...
        case LionFieldTypes.UTF_8:
        case LionFieldTypes.UTF_8_SHORT:
            propertyVal = reader.readUtf8String();
            break;
        default:
            throw new IOException("Unsupported field type " + reader.fieldType);
        }
    }
    propertyMap.put(propertyName, propertyVal);
}
reader.moveOutOf();

This failed when reading. It seems that the extended field with the type of 16 (meanging: element count) is not handled correctly when reading, this causes an off-by-one-error for the index.

As a workaround, I replaced writeElementCount and readElementCount with writeInt64 and readInt64, then it worked (see the commented lines in the code).

But this is probably not the way it's supposed to be.

It seems that the intention of using an extended type for specifying the element count has the following intention: The element count is optional. By using an extended type the element count can not be misinterpreted as something else, e.g. an integer value, if the keys are left out and the object contains just the values.

jjenkov commented 4 years ago

Only RION Array and RION Table fields have a nested element count as first field. RION Object fields do not have this element count field.

The reasoning is, that for arrays and tables you will often have to allocate a Java array to hold the read values. The element count helps you allocate an array of the correct size before reading any of the elements.

An Object, on the other hand, is normally read into a Java object, or its fields are read one by one into local variables. Therefore a nested element count is not really "interesting" for Object fields.

jjenkov commented 4 years ago

Also, there was a time where the nested element count field of Array and Table was its own field type. This has later been changed to an IntPos field instead. Apparently the readElementCount() method has some issues still in this regard.

jjenkov commented 4 years ago

Remember to do a nextParse() right after moveInto() . In your code examples you are only doing a parse() ... and I cannot remember if that is enough right after a moveInto() call - but a nextParse() call will definitely work.

jjenkov commented 4 years ago

I think that the readElementCount() method should simply be removed. Now that element counts are simply IntPos fields, there is no reason to have an explicit method for reading element counts. That will just confuse people I think.

jjenkov commented 4 years ago

By the way, I have updated the RION Encoding tutorial to better explain how the encoding looks. The RION Encoding tutorial now contains encoding examples in hexadecimal notation for each field type.

http://tutorials.jenkov.com/rion/rion-encoding.html