plotly / plotly.py

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
https://plotly.com/python/
MIT License

px animations do not show colors correctly if not all colors are in the first frame #2259

Open emmanuelle opened 4 years ago

emmanuelle commented 4 years ago

For example

import pandas as pd
import plotly.express as px
dataVals = {
    'Lat': [39.783730, 7, 39.783730, 39.783730, 20, -4.03, 39.783730, 36.82, 39.783730],
    'Lon': [-100.445882, 66, -100.445882, -100.445882, 70, 5.33, -100.445882, -1.29, -100.445882],
    'Value': [40, 12, 22, 3, 60, 23, 30, 100, 200],
    'Year': ['1985', '1990', '1990', '1990', '1990', '1990', '2000', '2000', '2000'],
    'Continent Color': ["a", "b", "a", "a", "b", "c", "b", "d", "a"]
    }
data = pd.DataFrame(dataVals)

fig = px.scatter_geo(data, lat="Lat", lon="Lon", size="Value", color="Continent Color",
                     animation_frame="Year", animation_group="Continent Color",
                     projection="natural earth", size_max=200)

fig.show()

Result: all points are displayed with the color of continent "a". Reported at https://community.plot.ly/t/scatter-geo-only-shows-values-with-a-certain-color-if-i-have-multiple-years-as-the-same-year/35976. Other functions such as px.scatter have the same problem.

If all colors are used for the first frame, the problem disappears.
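
To check whether a given dataset is affected, one can list the categories that are absent from the first frame. The sketch below is only a diagnostic, not a fix. The helper name `categories_missing_from_first_frame` is made up for illustration, and it assumes that the first frame is the first value of the frame column in row order.

```python
import pandas as pd


def categories_missing_from_first_frame(df, frame_col, color_col):
    """Return the color categories that do not appear in the first frame."""
    # Assumption: the first frame is the first value seen in frame_col.
    first_frame = df[frame_col].iloc[0]
    in_first_frame = set(df.loc[df[frame_col] == first_frame, color_col])
    all_categories = set(df[color_col])
    return sorted(all_categories - in_first_frame)


# Same 'Year' and 'Continent Color' columns as in the example above.
data = pd.DataFrame({
    "Year": ["1985", "1990", "1990", "1990", "1990", "1990", "2000", "2000", "2000"],
    "Continent Color": ["a", "b", "a", "a", "b", "c", "b", "d", "a"],
})

missing = categories_missing_from_first_frame(data, "Year", "Continent Color")
print(missing)  # ['b', 'c', 'd']
```

A non-empty result means the first frame does not cover every category, which is the situation described in this report.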

guidocioni commented 4 years ago

I think I have a similar problem. I have a dataframe of elements in different states (susceptible, infected or recovered) at different time steps. Of course, at the first time step no element is recovered yet, only susceptible or infected, so plotly computes the labels from the first time step alone, which yields susceptible and infected but not recovered. As a result, recovered elements are never shown.

MishaVeldhoen commented 4 years ago

I am experiencing the same issue when making a choropleth. However, the problem isn't completely solved when all colors are shown in the first frame. In the following minimal example, all colors are included in the first frame, and the second frame should only have a single color. As the attached image shows, two colors are still displayed in the second frame.

import pandas as pd
import plotly.express as px

test_gjson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "id": "N",
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[-1, 0], [0, 1], [1, 0], [-1, 0]]]},
            "properties": {}
        },
        {
            "type": "Feature",
            "id": "S",
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[-1, 0], [1, 0], [0, -1], [-1, 0]]]},
            "properties": {}
        }
    ]
}

test_df = pd.DataFrame.from_dict({
    "year": ["2000", "2000", "2010", "2010"],
    "id": ["N", "S", "N", "S"],
    "val": ["a", "b", "a", "a"],
})

px.choropleth(
    data_frame=test_df,
    geojson=test_gjson,
    color="val",
    locations="id",
    animation_frame="year",
).update_geos(fitbounds="locations", visible=False)

[attached image]

nicolaskruchten commented 4 years ago

I've just updated our documentation to highlight known limitations of our animation features. The specific one that's of note here is that animations are designed to work well when each row of input is present across all animation frames, and when categorical values mapped to symbol, color and facet are constant across frames. Animations may be misleading or inconsistent if these constraints are not met.
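
A rough way to test a dataframe against these two constraints before animating it is sketched below. The helper `check_animation_constraints` is hypothetical (it is not part of Plotly), and it assumes one column identifies each object across frames, in the way `animation_group` would. The sample data mirrors the choropleth example earlier in this thread.

```python
import pandas as pd


def check_animation_constraints(df, frame_col, group_col, category_cols):
    """Report groups that break either constraint described above."""
    n_frames = df[frame_col].nunique()

    # Constraint 1: every group should be present in every frame.
    frames_per_group = df.groupby(group_col)[frame_col].nunique()
    missing = sorted(frames_per_group[frames_per_group < n_frames].index)

    # Constraint 2: categorical values should stay constant for each group.
    changing = {}
    for col in category_cols:
        values_per_group = df.groupby(group_col)[col].nunique()
        changing[col] = sorted(values_per_group[values_per_group > 1].index)

    return {
        "groups_missing_frames": missing,
        "groups_with_changing_category": changing,
    }


test_df = pd.DataFrame({
    "year": ["2000", "2000", "2010", "2010"],
    "id": ["N", "S", "N", "S"],
    "val": ["a", "b", "a", "a"],
})

report = check_animation_constraints(test_df, "year", "id", ["val"])
print(report)
# {'groups_missing_frames': [], 'groups_with_changing_category': {'val': ['S']}}
```

Here polygon "S" changes from "b" to "a" between frames, so it falls outside what the animation feature is designed for.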

hlgirard commented 4 years ago

Are there any plans to improve this feature to make it work as expected? I use animation_frame a lot to group data that belongs together in conjunction with color and would love to be able to leverage the expected behavior.

nicolaskruchten commented 4 years ago

There's definitely a lot we could do to improve this feature but there are no low-hanging fruits here, and we've got nothing planned on our roadmap for the next few months related to animations. We would happily accept pull requests into the Plotly.js library that does the actual animation, or sponsorship to be able to put these features on our roadmap.

Sharamj commented 3 years ago

I have the same problem but I do not know how to manage it. Is there any suggestion from your side?

alphaps commented 3 years ago

same here!

perstattin commented 2 years ago

Same problem here. In addition, I noticed that if a color is not present in a subsequent animation frame of x, the colors and xy values of the previous animation frame of x will be shown.

ez2rok commented 2 years ago

I have the same problem.

I am trying to create an animation of depth-first search where nodes are in three different colors that represent if they are not discovered, discovered but not finished, or finished. In each frame the color of a single node changes as that node goes from being not discovered to being discovered but not finished or goes from being discovered but not finished to being finished.

Initially all nodes are not discovered, so they are a single color. But in the next frame, as the first node changes color from not discovered to discovered but not finished, the node disappears and is not displayed. In fact, as each node is discovered, it disappears from the animation and is not displayed.

gabgilling commented 2 years ago

I've had the same issue when calling choropleth_mapbox. Certain polygon colors only appear after a certain amount of time elapses in the animation frame, which caused polygons whose color did not appear on the first timeframe to disappear entirely. My fix was to concatenate a couple of dummy rows to my dataframe with the colors that are not present on the first timeframe. For instance, all my data for 2021-06 had threshold_color red, which caused polygons with colors yellow and green to disappear, so I added the following lines to my dataframe:

geodataframe = pd.concat([geodataframe, pd.DataFrame({'FIPSID': 'foo', 'date': '2021-06', 'count': 0, 'threshold_color': 'yellow'},index=[462])])
geodataframe = pd.concat([geodataframe, pd.DataFrame({'FIPSID': 'bar', 'date': '2021-06', 'count': 0, 'threshold_color': 'green'}, index=[463])])
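
The same idea can be generalized so that the placeholder rows are computed instead of written by hand. This is a minimal sketch, not a tested fix for every chart type: `pad_first_frame` is a hypothetical helper, the demo data is invented, and the column names are borrowed from the lines above.

```python
import pandas as pd


def pad_first_frame(df, frame_col, color_col, fill_values):
    """Append one placeholder row per category missing from the first frame."""
    # Assumption: the first frame is the first value seen in frame_col.
    first_frame = df[frame_col].iloc[0]
    present = set(df.loc[df[frame_col] == first_frame, color_col])
    missing = [c for c in pd.unique(df[color_col]) if c not in present]
    if not missing:
        return df
    placeholders = pd.DataFrame(
        [{**fill_values, frame_col: first_frame, color_col: c} for c in missing]
    )
    # Appending at the end keeps the original frame and category order.
    return pd.concat([df, placeholders], ignore_index=True)


# Invented demo data: only "red" is present in the first month.
demo = pd.DataFrame({
    "date": ["2021-06", "2021-06", "2021-07", "2021-07"],
    "FIPSID": ["001", "002", "001", "002"],
    "count": [5, 7, 1, 3],
    "threshold_color": ["red", "red", "yellow", "green"],
})

padded = pad_first_frame(
    demo, "date", "threshold_color", {"FIPSID": "placeholder", "count": 0}
)
print(padded[padded["date"] == "2021-06"])
```

The placeholder IDs do not match any polygon, so they should not be drawn on a choropleth. On scatter-style charts the placeholder rows may be visible and would need values that hide them, for example a size of 0.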

fingerpartyy commented 1 year ago

Hello @emmanuelle. I found a workaround for this, because I am also having trouble with animation in px. I didn't modify the existing data in the df, but I changed the color variable from categorical to numeric:

Here is the modified code. Hopefully it will help others:

import pandas as pd
import plotly.express as px

def dummy_scatter():
    dataVals = {
        'Lat': [39.783730, 7, 39.783730, 39.783730, 20, -4.03, 39.783730, 36.82, 39.783730],
        'Lon': [-100.445882, 66, -100.445882, -100.445882, 70, 5.33, -100.445882, -1.29, -100.445882],
        'Value': [40, 12, 22, 3, 60, 23, 30, 100, 200],
        'Year': ['1985', '1990', '1990', '1990', '1990', '1990', '2000', '2000', '2000'],
        'Continent Color': ["a", "b", "a", "a", "b", "c", "b", "d", "a"]
        }
    data = pd.DataFrame(dataVals)
    # Map each category to a number so that px uses a continuous color scale
    # (a single trace) instead of one trace per category.
    colordictContinent = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
    data['ContinentColorNumber'] = data['Continent Color'].map(colordictContinent)

    fig = px.scatter_geo(data,
                         lat="Lat", lon="Lon",
                         size="Value",
                         color="ContinentColorNumber",
                         color_continuous_scale='thermal',
                         # Fix the color range so it is the same in every frame.
                         range_color=(data.ContinentColorNumber.min(), data.ContinentColorNumber.max()),
                         projection="natural earth",
                         animation_frame="Year",
                         size_max=20)

    return fig

dummy_scatter()

[attached image: Capture2]

johnthatfield commented 1 year ago

+1, I encounter the same error when the coloured attribute changes over time in the dataset.

Even when all colours are present in the first animation frame, navigating through subsequent frames will, for some records, stack two records from different animation frames on top of each other.

bernardocecchetto commented 1 year ago

I am facing the same problem. I have a person who got covid at a specific timestamp. My goal is to make a scatter plot showing the evolution of the data. The problem is that at the beginning the person does not have covid and, at some specific time, he gets covid. But only the label "nocovid" and its data are displayed in the animation.

wbeardall commented 1 year ago

+1 this bug also likely contributes to an issue I'm seeing, where animation_group is not respected when the colored attribute changes over time within the same animation_group.

In this case, the issue presents itself as points fading in and out of existence between frames, rather than moving around on the plot as expected.

import plotly.express as px
import numpy as np
import pandas as pd

points = 100
frames = 40
colors=5

df = pd.DataFrame([dict(
    x=np.random.rand(), y=np.random.rand(),color=str(np.random.randint(colors)),
    animation_group=str(i), animation_frame=f
) for i in range(points) for f in range(frames)])
fig = px.scatter(df, x="x", y="y", animation_frame="animation_frame", animation_group="animation_group",
           color="color", range_x=[0,1], range_y=[0,1])
fig.show()
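
One possible way around this, sketched below and not verified against every case in this thread, is to keep all points in a single trace and encode the category as a per-point marker color. It rests on the observation that px creates one trace per categorical color value, so a point whose color changes moves to a different trace. The helper `single_trace_animation_spec` and the color map are made up for illustration; the function only builds a plain figure dictionary.

```python
import pandas as pd


def single_trace_animation_spec(df, x, y, frame_col, group_col, color_col, color_map):
    """Build a Plotly figure dict with exactly one scatter trace per frame."""

    def trace_for(frame_df):
        # Sort by group so that point i is the same object in every frame.
        frame_df = frame_df.sort_values(group_col)
        return {
            "type": "scatter",
            "mode": "markers",
            "x": frame_df[x].tolist(),
            "y": frame_df[y].tolist(),
            "ids": frame_df[group_col].astype(str).tolist(),
            # Per-point colors replace the one-trace-per-category layout.
            "marker": {"color": frame_df[color_col].map(color_map).tolist()},
        }

    frames = [
        {"name": str(name), "data": [trace_for(frame_df)]}
        for name, frame_df in df.groupby(frame_col, sort=False)
    ]
    play_button = {
        "type": "buttons",
        "buttons": [{"label": "Play", "method": "animate", "args": [None]}],
    }
    return {
        "data": frames[0]["data"],
        "frames": frames,
        "layout": {"updatemenus": [play_button]},
    }


# Tiny invented dataset: point "p0" changes color between the two frames.
small = pd.DataFrame({
    "x": [0.1, 0.9, 0.2, 0.8],
    "y": [0.1, 0.9, 0.3, 0.7],
    "color": ["0", "1", "1", "1"],
    "animation_group": ["p0", "p1", "p0", "p1"],
    "animation_frame": [0, 0, 1, 1],
})
spec = single_trace_animation_spec(
    small, "x", "y", "animation_frame", "animation_group", "color",
    color_map={"0": "royalblue", "1": "crimson"},
)
print(len(spec["frames"]), spec["frames"][1]["data"][0]["marker"]["color"])
```

Passing the dictionary to `plotly.graph_objects.Figure(spec)` should give a figure with a Play button. The trade-off is that the categories no longer get their own legend entries, because everything lives in one trace.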

[attached image: plotly_animation_group_bug]

rajaahdjey commented 8 months ago

Facing a similar issue. So far this is what I understand. Everything here is after ensuring there are the same number of data points, in the same order of the hover_name attribute. Both my color and symbol fields are categorical. The color field stays the same across all frames, but the symbol field can change occasionally.

  1. The first frame needs to contain every combination of the legend items (color and symbol columns). If not, right from the first frame there are additional points from the previous tick. I am able to work around this by creating the first tick manually with a bunch of random values but with all the different legend combinations.

  2. Whenever the symbol of the same row item changes from one tick to the next, the old tick remains on screen for the next few frames and then disappears (not sure what triggers that). Interestingly, this doesn't happen on the first symbol change (e.g. alive to dead), but it does happen on the frame where it changes back to alive.

  3. Adding animation_group for the rows by an identifier, say the hover_name value, brings no improvement. The moment the symbol changes the second time, the existing tick remains as an unwanted artifact.

Is there any way to force a redraw?