GobySoft / dccl

Dynamic Compact Control Language
Other
17 stars 13 forks source link

Dev dynamic ids #39

Closed Robotics-SLB closed 5 years ago

Robotics-SLB commented 5 years ago
tsaubergine commented 5 years ago

Having a one-to-one fixed mapping between Protobuf names and DCCL IDs is a pretty fundamental part of DCCL. Could you motivate the use case for this proposed change?

Robotics-SLB commented 5 years ago

The usage will be quite simple: instead of calling say encode(bytes, msg, header_only)

user can call encode(bytes, msg, header_only, user_defined_dccl_id_corresponding_to_msg)

and then upon decoding first read custom dccl_id, by using id(received_bytes) and then decode message as usual.

Obviously it will be user responsibility to main a correspondance table between runtime dccl id and message types (or any other message features) if they ever wish to use new functionality

tsaubergine commented 5 years ago

Thanks for the additional info.

@chrismurf Could you weigh on this if you get a moment?

chrismurf commented 5 years ago

Sorry for the length, but if you're asking...

I have mixed feelings about it. As Toby points out, DCCL inherits the simple ID-to-content mapping from CCL, which makes it very easy to differentiate traffic that you understand from traffic that you don't. In the "multi-vendor UUV collaboration" use case that DCCL (and CCL) initially addressed, it is important to clearly delineate which messages can be decoded. In these use cases, DCCL represents somewhat of a loose standard - I can go to the DCCL ID Table and lookup the purpose or at least owner of any message that is encountered. That standardization was one of the goals of DCCL. The more dynamic the concept of the 'ID' gets, the harder it will be to exchange standard messages between different platforms.

Where some creative usage of the ID is required, DCCL tries to accommodate those use cases by allowing a custom ID codec to be provided to the dccl::Codec constructor. A custom ID codec that compactly included a Queue ID and Message ID seems like it could address both of the specific concerns SLB identifies. The Queue ID wouldn't necessarily need to be transmitted through the water - it could be probably be prepended during decoding?

On the other hand, as underwater networking and communications evolves, the simple ID-to-content mapping does become limiting, Custom ID codecs may break the 'standards' use case described above, without really being as flexible as what SLB is requesting. Transport-layer multiplexing (multiple queues) present an interesting such use case - it is expensive to provide a content ID and a queue ID within each message. We are also running into issues (that Toby and I have discussed) when we look at layering session and encryption information onto a packet. There may be contextual information that allows inferring protobuf type without relying on a simple integer mapping. That obviates the need to repeat a simple 'type' identifier in every message. In these use cases, DCCL is more of a data packing/unpacking/compression library.

I am increasingly focused on using DCCL in that second set of use cases, where a simple integer identifier is limiting. I don't want to give up on the idea of an undersea data standard, but I think there are a number of other issues that prevent interoperability right now (modulation, manufacturers, etc.) which probably need to be considered more holistically. I'm not totally convinced that this merge request is the best way through the trade space (it maintains the integer ID concept, while making it almost meaningless) but it does remain nicely backwards compatible, and support understandable use cases.

This feels like a merge that re-shapes the direction of DCCL. As such, I think I'd personally vote for a bit more discussion and consideration before merging, but the decision is yours Toby.

Other edits / comments:

Robotics-SLB commented 5 years ago

Thanks for the feedback, I have updated docs. An update to unload i not a bug fix, but rather a way to make it compatible with both approach (in case of fixed dccl id it will work as it used to be, in case of dynamic id it will unload all messages with same descriptor)

Yes, I agree it is probably not the best way to incorporate the new feature. But it seems to be the one with minimal code change. In fact in SLB we mostly consider dccl as compression library as @chrismurf mentioned, that is why we generally would like it to only apply compression and decompression and ideally to provide (unencoded ?) header ourselves, this would allow to customize network comms in a variety of ways. The problem that dccl id in most cases occupies already 2 bytes, so for short messages of about ~10 B it becomes a bit expensive to add another message-type header. So ideally it would be better to have methods which would provide only encoded message body and which decode message body based on some key (which the hack in pr implements). But judging by the structure of dccl::Codec class it might require a significant amount of refactoring since dccl id usage is sort of hard coded in the design. So we would be really grateful to you, Toby, if you could keep this hack as a part of your code, until you have time to properly redesign it to make it work in the specified use cases.

tsaubergine commented 5 years ago

Thanks @chrismurf and @Robotics-SLB. Let me think about this for a bit and come up with a decision. I think DCCL4 is imminently something I'd like to work on (at a minimum to fix a few issues with DCCL3, such as lack of Oneof support, and the slightly broken String codec), but I'm not sure what direction this will take, and these ideas are helpful.

I'm not convinced that SLB's use case can't be solved using a dummy CustomIdCodec? I can provide an example, if helpful. I've done this a few times when I've external knowledge of the message ID (e.g. a one-to-one mapping of UDP socket to message type, for example).

Robotics-SLB commented 5 years ago

Hi Toby, I would like to see an example of how CustomIdCodec would help, although I do not not see how given your Codec::load(...) implementation one can load same message type more than once (i.e. under different dccl? ids without using more space)

tsaubergine commented 5 years ago

You are correct about the load() not taking into account the custom ID codec. I've put an alternative PR that fixes this and demonstrates what I think is your use case in a new unit test: https://github.com/GobySoft/dccl/pull/42

I would much prefer to use the existing custom ID codec infrastructure. Please let me know if #42 works for your use case, or if not, please let me know why.

Robotics-SLB commented 5 years ago

Thanks Toby. It seems to work, although looks a bit ugly :). but I believe all the details can be hidden by subclassing dccl::Codec or putting it inside another class. The only problem I see is that when you decide to go multi-threaded, it will break (or will need to stay single threaded) so it is unlikely that this hack will stay supported in the (far?) future, at least in this form. I would frankly prefer that you keep my approach (i.e. passing dccl_id explicitely since it makes more sense due to one-to-one correspondence in dccl ids and decoding algorithm, not sure if there is any reason to pass dccl_id on encoding, except for keeping compatibility with existing code), but it is up to you to decide. Ideally it would be perfect if in the future dccl only will only provide compression/decompression and all the headers will be added externally by users.

tsaubergine commented 5 years ago

DCCL3 isn't thread safe anyway (FieldCodecManager and others use some static data types). I will probably fix this with DCCL4.

Another option is to split encode()'s implementation into encode_header() and encode_body(), which are also public methods so that one can just call "encode_body()" and "decode_body()", which would be want I think you'd want to just provide compression/decompression.

I'm leaning towards accepting this PR at the moment, and revisiting this question for DCCL4. As you point out, it's fully backwards compatible.

tsaubergine commented 5 years ago

I'm going to accept this PR and hopefully address this more holistically in DCCL4.