ricklupton / ipysankeywidget

IPython / Jupyter Sankey diagram widget
MIT License

Error on align types #34

Open harisbal opened 6 years ago

harisbal commented 6 years ago

Hello,

I'm seeing a weird bug when I try to align links by type. For some reason, including numbers in the type name seems to cause problems. For example, the following doesn't align properly:

d1 = [
 {'source': 'start', 'target': 'E02000314_1', 'value': 32.0, 'type': 't41037'},
 {'source': 'E02000314_1', 'target': 'E09000025_2', 'value': 32.0, 'type': 't41037'},
 {'source': 'E09000025_2', 'target': 'end', 'value': 32.0, 'type': 't41037'},

 {'source': 'start', 'target': 'E02000313_1', 'value': 15.0, 'type': 't39174'},
 {'source': 'E02000313_1', 'target': 'E09000025_2', 'value': 15.0, 'type': 't39174'},
 {'source': 'E09000025_2', 'target': 'E02000321_3', 'value': 15.0, 'type': 't39174'},
 {'source': 'E02000321_3', 'target': 'end', 'value': 15.0, 'type': 't39174'}]

sankey(links=d1, align_link_types=True)

while this does:

d2 = [
 {'source': 'start', 'target': 'E02000314_1', 'value': 32.0, 'type': 'Α'},
 {'source': 'E02000314_1', 'target': 'E09000025_2', 'value': 32.0, 'type': 'Α'},
 {'source': 'E09000025_2', 'target': 'end', 'value': 32.0, 'type': 'Α'},

 {'source': 'start', 'target': 'E02000313_1', 'value': 15.0, 'type': 'Β'},
 {'source': 'E02000313_1', 'target': 'E09000025_2', 'value': 15.0, 'type': 'Β'},
 {'source': 'E09000025_2', 'target': 'E02000321_3', 'value': 15.0, 'type': 'Β'},
 {'source': 'E02000321_3', 'target': 'end', 'value': 15.0, 'type': 'Β'}]

sankey(links=d2, align_link_types=True)

I'm happy to try to find a solution, but I'd need some initial guidance on where to look. I have a feeling this is a JS issue rather than a Python one, right?

Cheers

ricklupton commented 6 years ago

Hi @harisbal, thanks for the issue and sorry for the slow reply!

Yes, you're right, this is probably in JS rather than Python: most likely in the d3-sankey-diagram package, which ipysankeywidget wraps.

What happens is that the align_link_types option sets the "ports" of the links -- link to code.

So I would probably start by testing directly in d3-sankey-diagram whether you see the same issue, using a list of links with sourcePort and targetPort set like this:

d1 = [
 {'source': 'start', 'target': 'E02000314_1', 'value': 32.0, 'sourcePort': 't41037', 'targetPort': 't41037'},
...
]
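As a rough sketch of how the full test data could be produced, the helper below copies each link's 'type' into both port fields and dumps the result as JSON for pasting into a d3-sankey-diagram test page. The function name type_to_ports is made up for illustration, and it assumes that align_link_types does nothing more than mirror the type into sourcePort and targetPort, which may not match the widget's code exactly.

```python
import json


def type_to_ports(links):
    """Return copies of the links with 'type' mirrored into both port fields.

    Hypothetical helper: assumes align_link_types only copies the type
    into 'sourcePort' and 'targetPort'.
    """
    converted = []
    for link in links:
        new_link = dict(link)  # copy, so the input list is left untouched
        link_type = new_link.get('type')
        if link_type is not None:
            # Do not overwrite ports that were already set explicitly.
            new_link.setdefault('sourcePort', link_type)
            new_link.setdefault('targetPort', link_type)
        converted.append(new_link)
    return converted


links = [
    {'source': 'start', 'target': 'E02000314_1', 'value': 32.0, 'type': 't41037'},
    {'source': 'start', 'target': 'E02000313_1', 'value': 15.0, 'type': 't39174'},
]

ported = type_to_ports(links)

# JSON text that can be pasted into a JavaScript test as the links array.
print(json.dumps(ported, indent=1))
```

The 'type' key is left in place here; drop it if the test data should match the example above exactly.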

Hope that makes sense. If you're still up for trying to find a solution, please do have a go and ask if anything's not clear! You might find the example here useful for setting up a test with d3-sankey-diagram.

ricklupton commented 6 years ago

Update: I don't think the numbers in the type name are the issue; it's just the sort order of the type names.

I'm not sure at the moment how this is causing a problem.
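One way to see the difference between the two examples is to compare the order in which the type names first appear with their sorted order. This is only a sketch: Python's default string ordering stands in for whatever comparison the JS side uses, and first_seen is a made-up helper name.

```python
def first_seen(types):
    """Return the distinct type names in order of first appearance."""
    seen = []
    for t in types:
        if t not in seen:
            seen.append(t)
    return seen


# Type column of each example, in the order the links are listed.
d1_types = ['t41037'] * 3 + ['t39174'] * 4
d2_types = ['Α'] * 3 + ['Β'] * 4  # same characters as in d2 above

for name, types in [('d1', d1_types), ('d2', d2_types)]:
    order = first_seen(types)
    matches = order == sorted(order)
    print(name, 'appearance:', order, 'sorted:', sorted(order), 'same:', matches)
```

In d1 the first type listed sorts after the second one, while in d2 the listing order and the sorted order agree, which is consistent with sort order rather than digits being the trigger.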

harisbal commented 5 years ago

Hi @ricklupton, can you please help me understand why dummyNodes are used? I have a feeling that, for some reason, an unnecessary dummyNode is created in this example. When I disable this part (by always returning an empty dummyNodes), the links are properly aligned.

harisbal commented 5 years ago

Moreover, how can I test my changes to the JS in ipysankeywidget? Modifying the JS files in js/lib does not seem to have any effect. Apologies for all these questions.

ricklupton commented 5 years ago

Hi @harisbal, thanks for looking into it more!

"Dummy nodes" are used to deal with links that skip across layers. For example in this graph:

    |a| ---> |b| ---> |c|
       `------------> | |

The long link from a to c would potentially overlap with node b if it was drawn straight from a to c. So the graph is transformed by adding dummy nodes (*) to make each link go between only adjacent layers:

    |a| ---> |b| ---> |c|
       `---> |*|----> | |

I'm not sure exactly how that would affect the problem you're seeing. It does lose the connection between link order at either end of long links.
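The transformation can be sketched in a few lines of Python. This is an illustration of the general idea only, not the d3-sankey-diagram implementation: the function name, the dummy-node naming scheme, and the assumption that every link points from a lower layer to a higher one are all made up for the example.

```python
def add_dummy_nodes(links, layer):
    """Split links that span several layers into single-layer hops.

    links: list of (source, target) pairs
    layer: dict mapping node name -> layer index
    Assumes layer[target] > layer[source] for every link.
    """
    layer = dict(layer)  # copy, so the caller's dict is not modified
    result = []
    for source, target in links:
        span = layer[target] - layer[source]
        prev = source
        # One dummy node for every intermediate layer the link crosses.
        for step in range(1, span):
            dummy = f'*{source}->{target}@{layer[source] + step}'
            layer[dummy] = layer[source] + step
            result.append((prev, dummy))
            prev = dummy
        result.append((prev, target))
    return result, layer


layer = {'a': 0, 'b': 1, 'c': 2}
links = [('a', 'b'), ('b', 'c'), ('a', 'c')]

new_links, new_layer = add_dummy_nodes(links, layer)
for link in new_links:
    print(link)
```

After the split, the a-to-c link is represented by two separate short links, so anything that orders links per node sees its two ends independently.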

I'm hoping to have time to look more at this in the next couple of weeks. Have you made any progress?

"Apologies for all these questions"

No problem!

harisbal commented 5 years ago

This version does the trick, although it is quite intrusive and probably breaks other examples.

In a nutshell, I sort the incoming and outgoing edges inside the prepareNodePorts function (based on the port's name) and disable the orderEdges function.
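The idea can be illustrated in Python, as a sketch rather than the actual JS change: the edge dictionaries and the sort_edges_by_port helper are hypothetical, and the real prepareNodePorts function works on the library's own graph structure.

```python
def sort_edges_by_port(edges, port_key):
    """Return the edges sorted by the name stored under port_key."""
    return sorted(edges, key=lambda edge: edge[port_key])


# Edges around node E09000025_2 from the failing example.
incoming = [
    {'source': 'E02000314_1', 'target': 'E09000025_2', 'targetPort': 't41037'},
    {'source': 'E02000313_1', 'target': 'E09000025_2', 'targetPort': 't39174'},
]
outgoing = [
    {'source': 'E09000025_2', 'target': 'end', 'sourcePort': 't41037'},
    {'source': 'E09000025_2', 'target': 'E02000321_3', 'sourcePort': 't39174'},
]

incoming_sorted = sort_edges_by_port(incoming, 'targetPort')
outgoing_sorted = sort_edges_by_port(outgoing, 'sourcePort')

# Both sides of the node now list their ports in the same order.
print([edge['targetPort'] for edge in incoming_sorted])
print([edge['sourcePort'] for edge in outgoing_sorted])
```

Sorting both sides by the same key makes the port order on the left of a node match the order on the right, which is what keeps links of one type aligned through the node.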

Here you can find the example I'm working on.