plotly / plotly.js

Open-source JavaScript charting library behind Plotly and Dash
https://plotly.com/javascript/
MIT License

Weird ordering of elements in Parallel Categories Plot #6803

Open malteschwerin opened 9 months ago

malteschwerin commented 9 months ago

[image: grafik]

I'm currently using the parallel categories plot in Plotly Express to display and compare different rankings of elements. I know I'm slightly misusing this plot, but I still think my problem is relevant to others as well. In the attached example, for simplicity I'm only showing one ranking ("Ranking 1"), where competitor 0 is in first place and the other competitors share second place. In practice, I would do this with multiple rankings as parallel coordinates, which usually have fewer and fewer distinct ranks going from left to right. As you can see in the image, the competitors on rank 2 are ordered strangely for some reason, which is why I'm writing here. The dataframe I'm using looks like this:

index  Name  Ranking 1
0      0     1
1      1     2
2      2     2
3      3     2
4      4     2
5      5     2
6      6     2
7      7     2
8      8     2
9      9     2
10     10    2
11     11    2
12     12    2
13     13    2
14     14    2
15     15    2
16     16    2
17     17    2
18     18    2
19     19    2
20     20    2
21     21    2

While I usually change some parameters, the plot looks exactly like this when I just call px.parallel_categories(df), so I really don't see why any of the connecting lines cross. It would be great if I could use this plot without the extra confusion caused by crossing lines.

I used version 5.9 before, but just updated to the latest version (5.18.0) and the issue remains.

Coding-with-Adam commented 9 months ago

Hi @malteschwerin, can you please share a minimal reproducible example so we can replicate this issue locally?

malteschwerin commented 9 months ago

Hi @Coding-with-Adam, thanks for the quick reply. Sure, here you have a minimal example:

import pandas as pd
import plotly.express as px
df = pd.DataFrame({"Names": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], "Ranking 1": [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]})
px.parallel_categories(df)

and this is what the result looks like: [image: grafik]

Now that I look at it, it seems like a lexicographical sorting might be used in the second column ("Ranking 1") while the first column ("Names") is sorted numerically. Changing all values to strings, however, doesn't change anything. Ideally, I would want to have the same ordering in all columns, reducing the number of crossing lines. In this simple example, I think it's clear that no lines should cross at all.
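To make the suspected difference concrete, here is a small plain-Python illustration of how sorting the same names numerically and as strings produces different orders. It only illustrates the hypothesis above; it is not a description of how Plotly actually orders the lines.

```python
# Illustration only: numeric vs. lexicographic (string) ordering of the names.
names = list(range(11))  # 0..10, as in the minimal example

# Numeric sort: values are compared as integers.
numeric_order = sorted(names)

# Lexicographic sort: values are compared as strings, so "10" sorts before "2".
lexicographic_order = sorted(names, key=str)

print(numeric_order)        # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(lexicographic_order)  # [0, 1, 10, 2, 3, 4, 5, 6, 7, 8, 9]
```

If one column followed the first order and the other followed the second, the line for name 10 would have to cross the lines for names 2 through 9.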

Coding-with-Adam commented 9 months ago

I agree, @malteschwerin. Ideally, these lines would not cross. I brought your question to the Plotly community, and one of the community members pointed out that by reversing the lists, the graph is created without crossing lines.

import pandas as pd
import plotly.express as px

names = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ranking = [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

# Workaround: reverse both lists so the rows are passed in the opposite order
ranking = ranking[::-1]
names = names[::-1]

df = pd.DataFrame({"Names": names, "Ranking 1": ranking})
fig = px.parallel_categories(df)
fig.update_layout(height=700)
fig.show()

image

However, we should be able to prevent the lines from crossing without reversing the lists.
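One way to make "crossing lines" measurable is to count, for two adjacent axes, the pairs of items whose top-to-bottom order differs between the axes. The sketch below is plain Python and does not reflect how Plotly.js lays out the plot. `count_crossings` is a hypothetical helper, and the string-sorted order is only the hypothesis raised earlier in this thread, not confirmed behaviour.

```python
from itertools import combinations


def count_crossings(left_order, right_order):
    """Count pairs of lines that cross between two adjacent axes.

    Both arguments list the same items from top to bottom.
    (Hypothetical helper for illustration, not part of Plotly.)
    """
    right_pos = {item: i for i, item in enumerate(right_order)}
    # combinations() yields (a, b) with a above b on the left axis;
    # the two lines cross if a ends up below b on the right axis.
    return sum(
        1
        for a, b in combinations(left_order, 2)
        if right_pos[a] > right_pos[b]
    )


names = list(range(11))
ranking = [1] + [2] * 10  # competitor 0 is first, everyone else shares rank 2

left = names  # left axis: names in numeric order

# Right axis: grouped by rank, ties broken numerically or as strings (assumption).
right_numeric = sorted(names, key=lambda n: (ranking[n], n))
right_lexicographic = sorted(names, key=lambda n: (ranking[n], str(n)))

print(count_crossings(left, right_numeric))        # 0
print(count_crossings(left, right_lexicographic))  # 8
```

With ties broken numerically, the order on both axes matches and no lines cross, which is the result expected for this example.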

@LiamConnors do you have any idea why this might be happening?

LiamConnors commented 9 months ago

Looks like this issue exists in Plotly.js - see the following codepen: https://codepen.io/Liam-Connors/pen/ZEwMQRr Transferring it to the Plotly.js repo as a bug. cc @archmoj

malteschwerin commented 9 months ago

Hey @Coding-with-Adam @LiamConnors, thanks for your work so far! Do you have any guess on when someone will look at this in detail?

Coding-with-Adam commented 9 months ago

hi @malteschwerin We should be able to take a deeper look at this bug in December. We'll keep you updated.

malteschwerin commented 8 months ago

Hey @Coding-with-Adam, is there any update on this yet?

Coding-with-Adam commented 8 months ago

No update yet, @malteschwerin. We'll take a deeper look after the holidays. cc @LiamConnors @archmoj

malteschwerin commented 7 months ago

Hey @Coding-with-Adam, is there an update on this? I'm currently writing a paper using these plots, and it would be really helpful to be able to use them without these crossing lines.

Coding-with-Adam commented 7 months ago

Thanks for the reminder, @malteschwerin. I'll follow up with the team today.

Coding-with-Adam commented 7 months ago

@malteschwerin Unfortunately, the resources needed to tackle this issue are limited given higher priority issues.

Would you be able to use the workaround mentioned here? If not, would you be able to submit a Pull Request for this issue?

malteschwerin commented 7 months ago

@Coding-with-Adam I'm afraid the workaround you're mentioning doesn't work for me, since it would look very unintuitive to have the rankings upside-down. Unfortunately, I'm also too busy right now to have a look at this myself. Sorry!