Logical Flows: Display of 'via' flows

rovats commented 4 years ago

Scenario:

Let's say we have three applications:

Producer, Consumer and Distributor

Producer sends Trades data to Consumer via Distributor

The requirement is to display this flow on each app's data flows section like this:

Producer:

Distributor:

Consumer:

rovats commented 4 years ago

Currently, we are only creating flows from Producer to Consumer, which means when you view the Distributor app, there are no flows displayed, which is not desirable.

One option is to add a via_entity to the logical_flow table and handle it when displaying flows.
A second option is to use the decorator table to add a new decorator type, DISTRIBUTOR (or something similar), display these flows on the Distributor entity, and potentially also show the decoration on Producer and Consumer.
A third option is to add something like a DISPLAY_ON field to the logical_flow table, with values like SOURCE_ONLY, TARGET_ONLY and BOTH.
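
To make the third option concrete, here is a minimal sketch of how a DISPLAY_ON value could drive what each app's data flows section shows. It is only an illustration: the `DisplayOn` enum, the `LogicalFlow` class, the `flows_shown_on` helper and the choice of which value to put on which flow are all assumptions for this example, not existing Waltz code or schema.

```python
from dataclasses import dataclass
from enum import Enum


class DisplayOn(Enum):
    # Values taken from the proposal; the semantics below are an assumption.
    SOURCE_ONLY = "SOURCE_ONLY"
    TARGET_ONLY = "TARGET_ONLY"
    BOTH = "BOTH"


@dataclass(frozen=True)
class LogicalFlow:
    # Hypothetical stand-in for a row in the logical_flow table.
    source: str
    target: str
    display_on: DisplayOn = DisplayOn.BOTH


def flows_shown_on(app: str, flows: list[LogicalFlow]) -> list[LogicalFlow]:
    """Return the flows to list on the given app's data flows section."""
    shown = []
    for f in flows:
        if f.source == app and f.display_on in (DisplayOn.SOURCE_ONLY, DisplayOn.BOTH):
            shown.append(f)
        elif f.target == app and f.display_on in (DisplayOn.TARGET_ONLY, DisplayOn.BOTH):
            shown.append(f)
    return shown


# One possible way to record the scenario: the hops through the Distributor
# are only displayed on the Distributor, the end-to-end flow on both ends.
flows = [
    LogicalFlow("Producer", "Distributor", DisplayOn.TARGET_ONLY),
    LogicalFlow("Distributor", "Consumer", DisplayOn.SOURCE_ONLY),
    LogicalFlow("Producer", "Consumer", DisplayOn.BOTH),
]

for app in ("Producer", "Distributor", "Consumer"):
    print(app, [(f.source, f.target) for f in flows_shown_on(app, flows)])
```

With this assignment the Distributor page lists both hops, while the Producer and Consumer pages list only the end-to-end flow.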

rovats commented 4 years ago

@davidwatkins73 thoughts? The third option seems to be the easiest to do, but we also need to decide which flows should be displayed on the boingy graphs; potentially only the ones with BOTH as the DISPLAY_ON value?

We could potentially add toggles to display hidden flows on all of these diagrams.

rovats commented 4 years ago

Linking this to #4788, since using a combination of tags and default tag behaviour (via logical_flow_decorator, e.g. hide_on_target) could potentially solve this in a better way.

davidwatkins73 commented 4 years ago

We've been having some discussions on similar topics.

In one camp, the view is that logical flows are close to physicals, in that their sources and targets should be 'accurate'. In the other camp, there is a desire to skip distributors.

There was quite a lot of opposition to a via field when it was proposed here by the first camp, and the second camp didn't think a single level of indirect hop would be enough.

Everything gets even muddier when we have distributors which sometimes do some slight enrichment/combining of data. Would they be excluded?

That said, there are some other options which we could consider:

skipping intermediaries but using transport to encompass distributors (this is rooted in the physical so may be tricky)
keeping intermediaries but adding "ultimate destination/s" decorators on logicals (this allows the existing direct mappings to be progressively enhanced, rather than needing to be re-declared with a 'via' field).

Let's set up a short call later in the week to discuss the options with Mark. I'll do a bit more asking around here first to see what other use-cases people have.

davidwatkins73 commented 4 years ago

Another aspect being actively considered here is at the physical level: allowing physical flows to link to measurables. One of the measurables we have is the tech-product taxonomy, which lists all the hosted middleware options (managed databases, middleware, ETL, scheduling systems etc). This relationship would be a 'depends on' relationship. This also falls under the 'skipping intermediaries' approach.

The first option I outlined above (using transport) and this are both rooted in the physical, which is arguably where this needs to be as the logical with a 'via' may not be accurate for all instances of communication between two parties (i.e. App A and B communicate via several different intermediaries).

rovats commented 4 years ago

Yep, let's discuss in a call.

I lean towards the first camp, that the 'source' and 'targets' for logical flows should be accurate and match physical flows, so I think we should create logical flows for distributors. As you said there could be distributors who store/enrich data, and they shouldn't (can't?) be excluded if we want to draw an accurate lineage diagram I suppose (regulators?).

On the distributor app, these flows should be visible, otherwise it gives an inaccurate picture, unless we also create a physical source & target graph similar to logical, but even then, we would need to link distributors to physical flows somehow.

On the producer and consumer apps, ideally they should be hidden by default, but there should be an option to unhide and view these on the logical flow graph. If we just hide them magically as if they didn't exist, it might confuse users.

The logical flow source and target graph, and the boingy graph work well, and users understand the concept of drilling down to physical flows if needed (most just want to look at the diagrams). So, I think there needs to be a way to enrich these logical diagrams with additional (ideally filterable) information.

I had also raised: https://github.com/finos/waltz/issues/4788 (please let me know your thoughts on that one if you get time), to add tags to logical flows and filter flows based on tags, which is a nice feature in itself, but it will potentially solve this issue.

E.g. tags in this case:

Producer1 -> DistributorA: To DistributorA
DistributorA -> Consumer1: From DistributorA
Producer1 -> Consumer1: via DistributorA

If we allow users to filter the app source/target graph and the group level boingy graph based on these filters, then this potentially resolves the problem (of cluttered flow diagrams), while still maintaining accurate flow information.

A second stage of this could be to mark tags as hidden by default (on logical flow source/target), maybe by using the decorator table? This should then complete the solution to this issue, imo. E.g. mark To DistributorA as hidden on source by default, and From DistributorA as hidden on target by default.
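
A rough sketch of how tag-based hiding could work, assuming tags are simple strings on each flow and the "hidden by default" behaviour is a per-tag setting. `TaggedFlow`, `HIDDEN_BY_DEFAULT` and `visible_flows` are made-up names for illustration only; nothing here reflects how #4788 is or would be implemented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaggedFlow:
    # Hypothetical logical flow carrying free-text tags.
    source: str
    target: str
    tags: frozenset = field(default_factory=frozenset)


# Assumed defaults: tag -> side ("source" or "target") on which flows carrying
# that tag are hidden unless the user chooses to unhide them.
HIDDEN_BY_DEFAULT = {
    "To DistributorA": "source",
    "From DistributorA": "target",
}


def visible_flows(app, flows, show_hidden=False):
    """Return the flows shown on an app's source/target graph."""
    result = []
    for f in flows:
        if app == f.source:
            side = "source"
        elif app == f.target:
            side = "target"
        else:
            continue  # flow does not touch this app
        hidden = any(HIDDEN_BY_DEFAULT.get(tag) == side for tag in f.tags)
        if show_hidden or not hidden:
            result.append(f)
    return result


flows = [
    TaggedFlow("Producer1", "DistributorA", frozenset({"To DistributorA"})),
    TaggedFlow("DistributorA", "Consumer1", frozenset({"From DistributorA"})),
    TaggedFlow("Producer1", "Consumer1", frozenset({"via DistributorA"})),
]

print([(f.source, f.target) for f in visible_flows("Producer1", flows)])
print([(f.source, f.target) for f in visible_flows("Producer1", flows, show_hidden=True)])
```

The `show_hidden` flag plays the role of the unhide option on the producer and consumer views; the distributor view is unaffected because neither tag is hidden on the distributor's side of the flow.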

rovats commented 4 years ago

Another approach could be this:

We create logical flows as normal, so in the example above we'll create the following logical flows:

id	source	target
1	Producer1	DistributorA
2	Producer2	DistributorA
3	DistributorA	Consumer1

Now, if we are able to define Upstream(s) for a logical_flow, we could set logical flow id 1 (Producer1 -> DistributorA) as the Upstream source for logical flow id 3 (DistributorA -> Consumer1).

This can be done using decorators potentially.

Advantages:

Logical flows are created normally, so each app owner needs to define only one level of upstreams/downstreams as normal
Then, logical flows can be linked progressively, one hop at a time, effectively formalising lineage
Chaining flows this way will enable us to know via apps at any hop of the chain
Impact analysis can be done across multiple hops, as opposed to just one hop
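
As a sketch of the chaining idea, assuming the upstream links are stored as a simple mapping from a flow id to the ids of its upstream flows (the `UPSTREAMS` mapping and the `lineage_paths` / `via_apps` helpers are hypothetical, not an existing Waltz structure):

```python
# Logical flows from the table above: id -> (source, target).
FLOWS = {
    1: ("Producer1", "DistributorA"),
    2: ("Producer2", "DistributorA"),
    3: ("DistributorA", "Consumer1"),
}

# Hypothetical link table: flow id -> ids of the flows declared as its upstreams.
UPSTREAMS = {3: [1]}


def lineage_paths(flow_id, flows=FLOWS, upstreams=UPSTREAMS, _seen=()):
    """Return every app path ending at flow_id's target, following upstream links."""
    source, target = flows[flow_id]
    seen = _seen + (flow_id,)  # guards against cyclic upstream links
    parents = [u for u in upstreams.get(flow_id, []) if u not in seen]
    if not parents:
        return [[source, target]]
    paths = []
    for parent in parents:
        if flows[parent][1] != source:
            raise ValueError(f"flow {parent} does not feed the source of flow {flow_id}")
        for path in lineage_paths(parent, flows, upstreams, seen):
            paths.append(path + [target])
    return paths


def via_apps(path):
    """Intermediate apps on a path, i.e. everything between the two ends."""
    return path[1:-1]


for path in lineage_paths(3):
    print(" -> ".join(path), "| via:", via_apps(path))
```

Because only flow 1 has been linked as an upstream of flow 3, Producer2 does not appear in Consumer1's lineage until someone links flow 2 as well, which is the "progressive" part of the idea.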

mharward-gr commented 2 years ago

This is a feature that we would also be interested in.
