misken / hillmaker

Occupancy analysis by time of day and day of week, with Python
MIT License

Add option to add categories to plotting API #52

Open jwnorm opened 11 months ago

jwnorm commented 11 months ago

I did something like this already in a Jupyter Notebook; I just need to decide the best way to integrate it into the package. Ideally, it would be as simple as by_category=True in the existing functions/methods (like in the other OO functions), but the code may be too verbose and might require a separate make_stacked_hills function or something similar.
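For discussion, here is a minimal sketch of what a separate function could look like. The name make_stacked_hills comes from the comment above, but the signature, the dict-of-arrays input, and the x-axis in days are assumptions for illustration only and are not part of hillmaker's API.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def make_stacked_hills(occ_by_category, bin_size_minutes=60, ax=None):
    """Hypothetical helper (not in hillmaker): stacked bars of mean occupancy.

    occ_by_category maps a category name to a 1-D array of mean occupancy,
    one value per time bin; all arrays are assumed to have the same length.
    """
    if ax is None:
        _, ax = plt.subplots(figsize=(15, 10))

    num_bins = len(next(iter(occ_by_category.values())))
    bar_width = bin_size_minutes / 1440      # one bin, expressed in days
    x = np.arange(num_bins) * bar_width      # left edge of each bin, in days

    bottom = np.zeros(num_bins)              # running total of the stack
    for category, occ in occ_by_category.items():
        occ = np.asarray(occ, dtype=float)
        ax.bar(x, occ, width=bar_width, bottom=bottom, align="edge", label=category)
        bottom += occ                        # next category sits on top

    ax.set_xlabel("Days since start of week")
    ax.set_ylabel("Mean occupancy")
    ax.legend()
    return ax


# Made-up data: two categories, hourly bins for one week (168 bins)
rng = np.random.default_rng(0)
demo = {"A": rng.uniform(1, 5, 168), "B": rng.uniform(0, 3, 168)}
ax = make_stacked_hills(demo, bin_size_minutes=60)
```

A by_category=True flag on the existing plotting functions could delegate to a helper like this, which would keep the stacking logic in one place.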

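jwnorm commented 11 months ago

# Assumes summary_df, sorted_dict, bin_size_minutes and df_dict are already defined in the notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

plt.style.use('seaborn-v0_8-darkgrid')  # this style was named 'seaborn-darkgrid' before matplotlib 3.6
fig1 = plt.figure(figsize=(15, 10))
ax1 = fig1.add_subplot(1, 1, 1)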

# infer number of days being plotted
rows = len(summary_df) // len(sorted_dict)  # rows per category, kept as an int
num_days = rows / (60 / bin_size_minutes * 24)

# Create a list to use as the X-axis values
num_bins = int(round(num_days * 1440 / bin_size_minutes))  # periods= below expects an integer
base_date_for_first_dow = '01/05/2015'  # Pick any date with associated DOW you want to appear first on plot
timestamps = pd.date_range(base_date_for_first_dow, periods=num_bins, freq=f'{bin_size_minutes}Min').tolist()

# Choose appropriate major and minor tick locations
major_tick_locations = pd.date_range(f'{base_date_for_first_dow} 12:00:00', periods=7, freq='24H').tolist()
minor_tick_locations = pd.date_range(f'{base_date_for_first_dow} 06:00:00', periods=42, freq='4H').tolist()

# Set the tick locations for the axes object
ax1.set_xticks(major_tick_locations)
ax1.set_xticks(minor_tick_locations, minor=True)

# manually add in pct line
pct = df_dict['summaries']['nonstationary']['dow_binofday']['occupancy']['p95'].to_numpy()

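# Styling of bars, lines, plot area
# bar_color is left over from the single-series plot; the stacked bars below use the default color cycle
bar_color = 'steelblue'
# Style the percentile line (these two values were not in the pasted snippet; placeholders assumed here)
pctile_line_style = '-'
pctile_color = 'grey'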

# Add data to the plot
# Mean occupancy as bars. GOTCHA: on a date axis the bar width is measured in days,
# so one bin is bin_size_minutes / 1440 of a day wide
bar_width = 1 / (1440 / bin_size_minutes)

cumulative_array = np.zeros(int(rows))
for category, occ in sorted_dict.items():
    ax1.bar(timestamps, occ, label=category, width=bar_width, bottom=cumulative_array)
    cumulative_array += occ

ax1.plot(timestamps, pct, linestyle=pctile_line_style, label='95th %ile Occupancy', color=pctile_color)

jwnorm commented 11 months ago

Alternatively, I wonder whether a category-based plot would be easier to understand if it were broken into small multiples. With stacked bars, it is sometimes difficult to track the size of each segment, since the segments above the first don't share a common baseline.
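A rough sketch of the small-multiples idea, one panel per category with shared axes so bar heights stay comparable across panels. The function name make_faceted_hills, its signature, and the sample data are all hypothetical; nothing here is existing hillmaker code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def make_faceted_hills(occ_by_category, bin_size_minutes=60):
    """Hypothetical helper (not in hillmaker): one bar panel per category."""
    n = len(occ_by_category)
    # squeeze=False keeps axes 2-D even when there is a single category
    fig, axes = plt.subplots(n, 1, sharex=True, sharey=True,
                             figsize=(15, 3 * n), squeeze=False)
    bar_width = bin_size_minutes / 1440  # one bin, expressed in days

    for ax, (category, occ) in zip(axes[:, 0], occ_by_category.items()):
        occ = np.asarray(occ, dtype=float)
        x = np.arange(len(occ)) * bar_width
        # Every panel starts at zero, so each category has its own baseline
        ax.bar(x, occ, width=bar_width, align="edge")
        ax.set_title(category)
        ax.set_ylabel("Mean occupancy")

    axes[-1, 0].set_xlabel("Days since start of week")
    return fig, axes[:, 0]


# Made-up data: three categories, hourly bins for one week (168 bins)
rng = np.random.default_rng(1)
demo = {name: rng.uniform(0, 4, 168) for name in ("A", "B", "C")}
fig, panels = make_faceted_hills(demo)
```

Sharing the y-axis trades some per-panel resolution for comparability; with categories of very different volume, sharey=False might be the better default.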

misken commented 11 months ago

I've wanted a stacked bar plot by category for a while now too. It's probably complex enough that it will end up needing its own function, but maybe not.

I also like the idea of a faceted version. They get at different things: stacks are better for showing how the parts contribute to the whole, while facets are better for the details of each part.