brunorosilva / plotly-calplot

The easiest and best looking Calendar Heatmap you'll find, made with Plotly.
https://pypi.org/project/plotly-calplot/

Plot Categorical Data #17

Closed soerenetler closed 1 year ago

soerenetler commented 1 year ago

Hello everyone, thank you for this great project. The tool works great and helps a lot in visualizing visitor numbers. Is it also possible to plot categorical data? If I just use a column with string values, this does not work (because no maximum value is defined). This would be really helpful for plotting, e.g., days with low, medium and high visitor counts. I know it is possible to work around this with color scale hacks, but a built-in way would be much easier.

Thank you for all your work, Sören

brunorosilva commented 1 year ago

Hi, thanks for the feedback.

I've been on a personal break, and I'll work on this feature this weekend. It makes sense for this to exist. A simple workaround is to map your continuous variable to ordered numeric categories.

Examples

Setup

import numpy as np
import pandas as pd
from plotly_calplot import calplot

# One row per day between the two dates (inclusive), with a random count in [0, 30).
dummy_start_date = "2022-01-01"
dummy_end_date = "2023-10-03"
dates = pd.date_range(dummy_start_date, dummy_end_date)
dummy_df = pd.DataFrame(
    {
        "ds": dates,
        "value": np.random.randint(0, 30, len(dates)),
    }
)

Categorizing linearly (creates no emphasis)

def continuous_to_categoric(v):
    if v < 10:
        return 0
    elif v < 20:
        return 1
    return 2

dummy_df["cat_value"] = dummy_df["value"].apply(continuous_to_categoric)
fig1 = calplot(dummy_df, x="ds", y="cat_value", dark_theme=False, years_title=True)
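
As a possible alternative to the hand-written thresholds above, the same binning can be sketched with pandas' `pd.cut`. This is only an illustration, not part of the original workaround: the sample values, the bin edges, and the label names `low`, `medium`, `high` are assumptions chosen to mirror the `v < 10` / `v < 20` checks.

```python
import pandas as pd

# Illustrative sample values; the thread's dummy data ranges over 0..29.
values = pd.Series([3, 10, 19, 20, 29])

# Half-open bins [0, 10), [10, 20), [20, 30) mirror the v < 10 / v < 20 checks.
binned = pd.cut(
    values,
    bins=[0, 10, 20, 30],
    right=False,
    labels=["low", "medium", "high"],  # hypothetical label names
)

# Integer codes (0, 1, 2) follow the label order and can be plotted as numbers.
codes = binned.cat.codes
print(codes.tolist())
```

The integer codes could then be stored in a column such as `cat_value` and passed as `y`, the same way as in the examples in this thread.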

Categorizing with lower bound (emphasis on higher values)

def continuous_to_categoric(v):
    if v < 10:
        return 0
    elif v < 20:
        return 1
    return 5

dummy_df["cat_value"] = dummy_df["value"].apply(continuous_to_categoric)
fig1 = calplot(dummy_df, x="ds", y="cat_value", dark_theme=False, years_title=True)

Categorizing with higher bound (emphasis on lower values)

def continuous_to_categoric(v):
    if v < 5:
        return 5
    elif v < 20:
        return 1
    return 0

dummy_df["cat_value"] = dummy_df["value"].apply(continuous_to_categoric)
fig1 = calplot(dummy_df, x="ds", y="cat_value", dark_theme=False, years_title=True, colorscale="RdPu")

soerenetler commented 1 year ago

Thanks a lot for the detailed code examples. This helps to create the plot, but the colorscale is then completely off and has to be built separately. Still, it is a good workaround for now.
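
One way the separate colorscale mentioned above might be built is as a stepped (discrete) scale in Plotly's list-of-`[position, color]` format, where each category gets an equal-width band of a single color. The helper name `discrete_colorscale` and the three hex colors are hypothetical. Whether `calplot` forwards such a list through its `colorscale` argument unchanged is an assumption that has not been confirmed in this thread.

```python
def discrete_colorscale(colors):
    """Build a stepped Plotly-style colorscale with one flat band per color."""
    n = len(colors)
    scale = []
    for i, color in enumerate(colors):
        # Repeating the color at both ends of its band gives a sharp step
        # instead of a gradient between neighbouring categories.
        scale.append([i / n, color])
        scale.append([(i + 1) / n, color])
    return scale


# Hypothetical colors for the categories 0 (low), 1 (medium), 2 (high).
scale = discrete_colorscale(["#fee8c8", "#fdbb84", "#e34a33"])
for position, color in scale:
    print(round(position, 3), color)

# Assumed usage, not verified against plotly-calplot:
# fig = calplot(dummy_df, x="ds", y="cat_value", colorscale=scale)
```

With category codes 0, 1 and 2 spanning the full color range, each code falls inside its own band, so every category is drawn in exactly one color.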