Closed phil-opp closed 5 months ago
With these changes, we can update ros2-client
to provide take_seed
and async_stream_seed
functions on Subscription
. I opened a draft PR for this at https://github.com/jhelovuo/ros2-client/pull/20 to show the required ros2-client
changes.
I'm not sure why the "Code Coverage" CI job fails, but it appears to be broken on the master branch as well: https://github.com/jhelovuo/RustDDS/actions/workflows/coverage.yml . So I don't think that the failure is related to this PR.
Hello,
Thank you for the interesting idea and PR. I was not aware of the DeserializeSeed
trait before. I can see the use case here.
If the deserialization would be changed as you propose, how would the serialization go then?
And no need to worry about the code coverage CI job.
Serialization doesn't need to be changed, since the serde::Serialize
trait gives access to the runtime value that should be serialized. So we can use the type information parsed from ROS2 message files in a custom Serialize
implementation.
For example:
pub struct TypedPythonValue<'a> {
pub value: &'a pyo3::Any, // given from Python
pub type_info: &'a Ros2TypeInfo, // parsed from ROS2 message files
}
impl serde::Serialize for TypedValue<'_> {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
match &self.type_info.data_type {
DataType::UInt8 => {
// the ROS2 topic expects an `u8` value, so try to convert the Python value
let value: u8 = self.value.extract().map_err(serde::ser::Error::custom)?;
serializer.serialize_u8(value)
}
.... // handle other ROS2 data types
_ => todo!(),
}
}
}
This approach doesn't work with the standard serde::Deserialize
trait because it is completely stateless, i.e. there is no self
argument. So for deserialization based on runtime type information, we need to use DeserializeSeed
.
Attention: Patch coverage is 52.17391%
with 66 lines
in your changes are missing coverage. Please review.
Project coverage is 73.26%. Comparing base (
28ed26f
) to head (4e8e37d
). Report is 130 commits behind head on master.
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Thanks for keeping your PR up to date. I assume this means you are using it for yourself.
We are considering accepting this PR, even though it adds extra complication. We may have another use case where it will be useful. This is just to let you know that the PR is not dead.
Thanks for the update! Yes, we're using this PR for the https://github.com/dora-rs/dora project. We need it because we precompile Python bindings on a different machine, which need to be able to deserialize ROS2 messages based on type information parsed at runtime.
Looking at the central part of the new design, namely, new code to adapters.rs
, terms "decode" and "transform" are used. Are these interchangeable? If yes, perhaps only one term should be used for clarity?
Looking at the central part of the new design, namely, new code to
adapters.rs
, terms "decode" and "transform" are used. Are these interchangeable? If yes, perhaps only one term should be used for clarity?
This question makes no sense. Apparently, "decode" and "deserialize" are more like synonyms. So the pipeline is
byte slice → deserialize/decode → value of type Self::Deserialized
→ call
[Self::transform_deserialized
] → value of type D
And it makes sense to use "decode", because the term "deserialize" is already taken.
Am I now on the correct track?
Another question: What is the use case for transform_deserialized()
?
In this PR and the corresponding ros2-client
PR it always seems to be the identity function.
I just created a new branch deserialize-with-seed for developing this feature. Please redirect the PR there.
Is there a specific reason in
fn from_bytes_with<S>(
input_bytes: &[u8],
encoding: RepresentationIdentifier,
decoder: S,
) -> Result<D, S::Error>
where
S: Decode<Self::Decoded>,
{
...
}
for the decoder
to be passed by value rather than reference? Correspondingly, Decode::decode_bytes()
takes self
rather than &self
. (Or maybe &mut self
?)
If using a reference is possible, then we could get rid of the Clone
requirement in DefaultDecoder::Decoder
. Requiring Clone
is a bit awkward, because it would really do the cloning at every sample processed.
So the pipeline is
byte slice → deserialize/decode → value of type
Self::Deserialized
→ call [Self::transform_deserialized
] → value of typeD
Yes, exactly!
Another question: What is the use case for
transform_deserialized()
?
There are some adapters that add a wrapper around the deserialized values. With the normal Deserialize
trait, this could happen behind the scenes because the adapters could just call deserialize
on the desired type. With DeserializeSeed
, the adapters need to explicitly specify the type that the DeserializeSeed
instance should create, which might be different than the output type D
.
In this PR and the corresponding
ros2-client
PR it always seems to be the identity function.
Yes, it's almost always the identity function. The one exception is no_key::DAWrapper
, which wraps a DeserializerAdapter<D>
, but returns a DeserializerAdapter<NoKeyWrapper<D>>
. So it's transform_deserialized
call wraps the result in a new NoKeyWrapper
instance:
NoKeyWrapper::<D> {
d: DA::transform_deserialized(deserialized),
}
(This implementation is at src/dds/no_key/wrappers.rs#L94-L96.)
I just created a new branch deserialize-with-seed for developing this feature. Please redirect the PR there.
I just tried, but it looks like all the commits of this PR are already part of the branch.
Is there a specific reason in
fn from_bytes_with<S>( input_bytes: &[u8], encoding: RepresentationIdentifier, decoder: S, ) -> Result<D, S::Error> where S: Decode<Self::Decoded>, { ... }
for the
decoder
to be passed by value rather than reference? Correspondingly,Decode::decode_bytes()
takesself
rather than&self
. (Or maybe&mut self
?)If using a reference is possible, then we could get rid of the
Clone
requirement inDefaultDecoder::Decoder
. RequiringClone
is a bit awkward, because it would really do the cloning at every sample processed.
I just added that code last week to get things working again with the latest changes from master
. I'm not satisfied with that part of the implementation either, so it would be great if we could find a better solution.
The reason for https://github.com/jhelovuo/RustDDS/pull/313/commits/4e8e37d6405a21f6d54c1d1255e7ae1d65f41720 was the loop
introduced in https://github.com/jhelovuo/RustDDS/commit/6908f40f68ddebbcc3f059b02cb5f7e64cac4986, which calls deserialize_with
multiple times.
I agree that taking a (mutable?) reference to the decoder would be the cleaner solution and would be fine for our use case. Unfortunately, the DeserializeSeed::deserialize
method takes self
by value, so I'm not sure how we could avoid a Clone
bound...
(Side note: If your DeserializeSeed
type T
only needs &self
, you could implement DeserializeSeed
for &T
as well. Then the .clone()
call would be just a pointer copy.)
So the pipeline is byte slice → deserialize/decode → value of type
Self::Deserialized
→ call [Self::transform_deserialized
] → value of typeD
Yes, exactly!
Ok. Your technical implementation is brilliant. This is quite complicated abstract metaprogramming.
In order to make it more accessible to others, I decided to rename some things. The basic renaming idea is
deserialize = decode + transform.
In your original code, there was confusion between deserialize and decode, as these were used as synonyms. Now it should be more clear that the whole two-step process is "deserialize" and the first step is called "decode".
I am migrating this PR to PR #334 , so we can continue the work there.
Is there a specific reason in
...code...
for the
decoder
to be passed by value rather than reference? Correspondingly,Decode::decode_bytes()
takesself
rather than&self
. (Or maybe&mut self
?) If using a reference is possible, then we could get rid of theClone
requirement inDefaultDecoder::Decoder
. RequiringClone
is a bit awkward, because it would really do the cloning at every sample processed.I just added that code last week to get things working again with the latest changes from
master
. I'm not satisfied with that part of the implementation either, so it would be great if we could find a better solution.The reason for 4e8e37d was the
loop
introduced in 6908f40, which callsdeserialize_with
multiple times.
The new loop was logically necessary, because it can happen that in a with_key
DDS topic, e.g. most of the Discovery topics, we could receive a Dispose instead of data update in a DATA submessage. The Dispose is indicated by sending a Key instead of Data, or alternatively, sending an MD5 hash of the Key.
In case we, as a DataReader, join an already long-running Topic, it may have some past data values published that we can no longer receive, because the History buffers are not deep enough. But we can at some point receive Dispose notifications of these samples we have never heard of. In case the Dispose was via a hash only, there is no way we can tell the application about the Dispose, because we do not know the actual key, so it is best to just ignore it, and move on to the next incoming sample. This is why the looping is needed.
This is something we could perhaps work around to avoid the need for cloning.
I agree that taking a (mutable?) reference to the decoder would be the cleaner solution and would be fine for our use case. Unfortunately, the
DeserializeSeed::deserialize
method takesself
by value, so I'm not sure how we could avoid aClone
bound...(Side note: If your
DeserializeSeed
typeT
only needs&self
, you could implementDeserializeSeed
for&T
as well. Then the.clone()
call would be just a pointer copy.)
Ok, the pass-by-value of self in DeserializeSeed::deserialize
is something we cannot work around.
Then we must keep the Clone
bound, and document that there will be quite a lot of cloning, about one .clone()
per deserialized sample, so the application designer knows to invest in making it cheap, like a pointer copy.
Thanks a lot!
Since the CDR format is not self-describing, the target type needs to be known for deserialization. The current RustDDS implementation ensures this by requiring a
serde::Deserialize
implementation for all target types. Unfortunately this makes it impossible to deserialize when the type information is only available at runtime. For example, it is not possible to base the deserialization on type information parsed from ROS2 msg files at runtime.This pull request solves this limitation by adding basic support for deserialization based on serde's
DeserializeSeed
trait, which allows to use additional state for deserialization.The main changes are:
Decode
trait for deserializing a byte slice with access to theRepresentationIdentifier
. The trait is implemented for the following new types:PlCdrDeserializer
is an empty newtype that implementsDecode
for all types that implementPlCdrDeserialize
.CdrDeserializeDecoder
is an empty newtype that implementsDecode
for all types that implementserde::Deserialize
.CdrDeserializeSeedDecoder
wraps aDeserializeSeed
implementation and implementsDecode
using the seed for deserialization.DeserializerAdapter
traitfrom_bytes_with
method that takes an additionaldecoder: impl Decode
argument. The method has a default implementation that calls the decoder.Deserialized
, which describes the type that the decoder will deserialize to.D
.DAWrapper
, need to do an additional transformation after deserializing. This transformation is possible by implementing a newtransform_deserialized
trait method. For example,DAWrapper
wraps the deserialized value into aNoKeyWrapper
.DefaultDecoder
sub-trait, which describes adapters that can be used without supplying a decoder (since they provide a default one).from_bytes
andfrom_vec_bytes
methods on theDefaultDecoder
trait and provide a default implementation using the*_with
variants and the default decoder.DefaultDecoder
trait is implemented:CDRDeserializerAdapter
ifD
implementsserde::Deserialize
.PlCdrDeserializerAdapter
ifD
implementsPlCdrDeserialize
.DefaultDecoder
trait to support both stateful and stateless deserialization in most functions, without duplicating the codetry_take_one_with
andas_async_stream_with
API methods to allowros2-client
and others to plug in custom stateful deserializers.I tried my best to limit the scope of this PR as much as possible and to document all the changes. Unfortunately, I failed to make this a non-breaking change, mainly because of the new
DeserializerAdapter
design. I hope you still consider this PR, as this feature is critical for our use case in the dora project.Please let me know if you have any questions or ideas for improvements!
(edit: The commit list got a bit messy, sorry. I can squash if you like.)