Wait for all registered transformations to become available

The current implementation of transformer does not wait until it can resolve all registered transformations before firing the callbacks. To illustrate this problem (and what I expected would happen), I've added a test case here (which fails, of course):

BOOST_AUTO_TEST_CASE( callback_only_when_tf_available )

So what's happening here? With each new pushDynamicTransformation() call, transformer registers a new stream with the aggregator. When the first data sample arrives, it triggers the callback, regardless of whether the registered transformations are available or not.

Since it fired too soon, the data sample is already gone when the missing transformation finally arrives.

This can go on for a long time, e.g. if a robot arm driver has not yet been started up, so there are no joint states from that arm (and therefore no transforms). Once transformer has seen all transforms at least once, it behaves correctly, and registered transformations are always available within the callback.

What did I expect would happen? I expected transformer would check if all registered transformations were available and only triggered the callback then. See test case.

Why would it be nice if that behavior would change?

The client wouldn't have to check for failed transformations. It could assume that all transformations are available when the callback is called.
Transformer wouldn't throw away perfectly good samples in the beginning because it fired too soon.

This behavior totally puzzled me and took me forever to figure out; also because it's undocumented of course. So I would be happy with either of those solutions (most desirable first):

Change the default behavior of transformer. I don't see a use case where somebody would want the old behavior.
Add a parameter to registerDataStreamWithTransform() to switch it to the new behavior.
Clearly document that the callback will be called regardless of whether the transformations are available or not, so that the client knows they have to double-check this in their own code.

Hi. Thanks for the great analysis ! The real place where a fix would be desirable seems to be in the stream aligner actually. From your description what you are describing seems to be an initialization behaviour problem on the stream aligner. It seems that it behaves as if a stream is not active at initialization time.

The "better" behaviour from the stream aligner p.o.v. would be to behave as if a sample arrived just before initialization, thus starting to fire only when either a new sample arrived or the stream aligner reaches its max latency.

On a related note, it is IMO not desirable for the stream aligner/transformer to wait forever for all transformations to be available. It must honor its max_latency configuration. There's just not enough information to know which callbacks really require what transformation (it's implicit from the p.o.v. of the modelling), so we currently cannot optimize the general behaviour.

tl;dr: IMO, this bug isn't the stream aligner's fault, because transformer doesn't register any streams initially. Transformer also has no way to guess which streams to register initially. But transformer can check if it already has all required transforms for all the registered transformations. So what it should do (at least optionally) is to filter the callbacks it receives from the aligner and only pass them on when all registered transforms are available.

long version:

From your description what you are describing seems to be an initialization behaviour problem on the stream aligner. It seems that it behaves as if a stream is not active at initialization time.

Yes, this is because transformer doesn't know initially what transformations are expected to be there at some point. This is why it only adds a stream to the aligner when the first transformation between those frames arrives.

The "better" behaviour from the stream aligner p.o.v. would be to behave as if a sample arrived just before initialization, thus starting to fire only when either a new sample arrived or the stream aligner reaches its max latency.

This is a workaround I also came up with. Sometimes the client (i.e., the task that uses transformer) knows what transformations to expect (for example from parsing an URDF file). So a workaround is to have the client initially add one dynamic transform for each expected pair of frames to the transformer and set the time stamp to something far in the past. This causes the transformer to add the corresponding streams to the aligner, and everything works as expected. But there are a couple of drawbacks:

It sucks to have to do this manually in each client. Also, it's not documented anywhere in the transformer API that the client is expected to do that.
The orogen transformer plugin doesn't know the full expected transformation chain, so it cannot implement this workaround.
There might be cases where even the client doesn't know the full transformation chain. In general, I also think that the client shouldn't be concerned with that; it should simply say what transforms it needs (by registering the transform with the transformer) without specifying the full chain. Otherwise one has to touch all clients whenever the robot is reconfigured and the transformation chain changes.

On a related note, it is IMO not desirable for the stream aligner/transformer to wait forever for all transformations to be available. It must honor its max_latency configuration.

I agree. When not all transformations have arrived within max_latency, it should discard the data sample, right?

There's just not enough information to know which callbacks really require what transformation (it's implicit from the p.o.v. of the modelling), so we currently cannot optimize the general behaviour.

Yes, I realize that's a limitation of the current design. It's not optimal, because a unrelated missing transform blocks all others, but I don't see an easy fix for that.

The orogen transformer plugin doesn't know the full expected transformation chain, so it cannot implement this workaround.

No, but it can provide a way to be given the expected chains. The tooling (both orocos.rb and syskit) know the chains.

Yes, I realize that's a limitation of the current design. It's not optimal, because a unrelated missing transform blocks all others, but I don't see an easy fix for that.

Not easy, but there is a fix: rewriting the transformer and its integration.

the transformation chains are known at design time (they can be inferred from the transformer configuration file), which would allow us to initialize the underlying transformer properly
change the code generation to (1) list the transformations each callback need and (2) allow to configure at runtime whether this transformation should be interpolated or not. That would also allow to setup the stream aligner in a much MUCH smarter way and reduce latencies tremendously.
it would also allow more advanced tooling cough syskit cough to properly setup the underlying stream aligner periods. Yay for latency again.
we could even remove the multiplexing and have one port per transformation (which would then make the reliance on the notoriously unreliable sourceFrame/targetFrame fields unneeded)

Taker ?

rock-core / drivers-transformer

Wait for all registered transformations to become available #11