Closed phylib closed 3 years ago
Yes, that is a good concern.
Previous Information
Python-ndn has different mechanics compared to C++ for TLV. Therefore, I had to use this structure which can be viewed in code here:
StateVector = *StateVectorComponent
StateVectorComponent = VERSION-VECTOR-COMPONENT-TYPE TLV-LENGTH NodeID SeqNo
NodeID = VERSION-VECTOR-KEY-TYPE TLV-LENGTH *OCTET
SeqNo = VERSION-VECTOR-VALUE-TYPE TLV-LENGTH NonNegativeInteger
VERSION-VECTOR-COMPONENT-TYPE = 201
VERSION-VECTOR-KEY-TYPE = 202
VERSION-VECTOR-VALUE-TYPE = 203
Adding to this, I had to also wrap a StateVector TLV into a component and used the generic type for that.
Why is it this way?
I did ask the creator of python-ndn about the TLV as I had the same concerns.
This is what I know since then... python-ndn Model needs you to encapsulate StateVectorComponent with a TLV number, instead of StateVector
When defining a RepeatingField TLV (for the StateVector), it requires a TLV number, and this is added to every component. On top of this, to be a component, I had to wrap the StateVector TLV within some type (I used generic for this).
Moving Forward
I am not sure how to actually fix this as the C++ version combines these two steps into one or this is how I think about it.
Hello. I suggest this pattern because
Phillip told me that the format should be consistent for now. If this must be the case, we can decode StateVector manually:
import enum
import typing
import struct
import asyncio as aio
import ndn.encoding as enc
class StateVectorModelTypes(enum.Enum):
VECTOR = 201
KEY = 202
VALUE = 203
class StateVectorComponentModel(enc.TlvModel):
node_id = enc.BytesField(StateVectorModelTypes.KEY.value)
seq_num = enc.UintField(StateVectorModelTypes.VALUE.value)
class StateVectorModel:
value: typing.List[StateVectorComponentModel]
def encode(self) -> bytearray:
component_wires = [v.encode() for v in self.value]
length = sum(len(w) for w in component_wires)
buf_len = length + enc.get_tl_num_size(length) + enc.get_tl_num_size(StateVectorModelTypes.VECTOR.value)
ret = bytearray(buf_len)
pos = enc.write_tl_num(StateVectorModelTypes.VECTOR.value, ret)
pos += enc.write_tl_num(length, ret, pos)
for w in component_wires:
ret[pos:pos + len(w)] = w
pos += len(w)
return ret
@staticmethod
def parse(buf):
# Verify the Type
typ, pos = enc.parse_tl_num(buf)
if typ != StateVectorModelTypes.VECTOR.value:
return None
# Check the length
length, l = enc.parse_tl_num(buf, pos)
pos += l
if pos + length != len(buf):
return None
# Decode components
ret = StateVectorModel()
ret.value = []
while pos < len(buf):
# Node ID
typ, l = enc.parse_tl_num(buf, pos)
pos += l
if typ != StateVectorModelTypes.KEY.value:
return None
length, l = enc.parse_tl_num(buf, pos)
pos += l
node_id = buf[pos:pos + length]
pos += length
# Value
typ, l = enc.parse_tl_num(buf, pos)
pos += l
if typ != StateVectorModelTypes.VALUE.value:
return None
length, l = enc.parse_tl_num(buf, pos)
pos += l
if length == 1:
value = struct.unpack_from('!B', buf, pos)[0]
elif length == 2:
value = struct.unpack_from('!H', buf, pos)[0]
elif length == 4:
value = struct.unpack_from('!I', buf, pos)[0]
elif length == 8:
value = struct.unpack_from('!Q', buf, pos)[0]
else:
return None
pos += length
# Append the component
comp = StateVectorComponentModel()
comp.node_id = node_id
comp.seq_num = value
ret.value.append(comp)
return ret
async def main():
vec = StateVectorModel()
vec.value = [StateVectorComponentModel(), StateVectorComponentModel()]
vec.value[0].node_id = b'one'
vec.value[0].seq_num = 1
vec.value[1].node_id = b'two'
vec.value[1].seq_num = 2
buf = bytes(vec.encode())
print(buf)
vec = StateVectorModel.parse(buf)
print(vec.value[0].node_id)
print(vec.value[0].seq_num)
print(vec.value[1].node_id)
print(vec.value[1].seq_num)
if __name__ == '__main__':
aio.run(main())
Thank you, zjkmxy for your addition. Yes, avoiding the RepeatingModel and making our own StateVectorModel allowed us to follow the TLV that the specifications use.
Since this is the case, our model used is the one specified.
I just tried to add the TLV test from the C++ implementation and found that the TLV format of the Python implementation is not yet perfectly correct.
The TLV
\xCA\x03\x6F\x6E\x65\xCB\x01\x01\xCA\x03\x74\x77\x6F\xCB\x01\x02
contains the state vector entries['one': 1, 'two': 2]
. When wrapping it into the TLV Type201
, it should be possible to parse. However, parsing is not possible.The hex values C9, CA and CB represent the TLV types 201..VersionVector, 202..VersionVectorKey, and 203...VersionVectorValue
When creating the corresponding state vector in Python, I get the following TLV: \xc9\x08\xca\x03\x6F\x6E\x65\xcb\x01\x01 \xc9\x08\xca\x03\x74\x77\x6F\xcb\x01\x02
As far as I know, there should be only one 201 component for a state vector. The Python implementation, however, has one 201 plus one additional 202 component for every state vector entry.
Additionally, TLV parsing appears to have an additional minor issue. As indicated here, the
StateVector
class assumes to receive a TLV generated byStateVector-->to_component
. Theto_component
method without parameter, however, appears to add the generic TLV typeTYPE_GENERIC = 0x08
. Instead, the TLV Type 201 StateVector should be prepended.