apache / superset

Apache Superset is a Data Visualization and Data Exploration Platform
https://superset.apache.org/
Apache License 2.0
60.85k stars 13.18k forks source link

Calendar Heatmap day offset #28931

Open GillesTo opened 1 month ago

GillesTo commented 1 month ago

Bug description

the problem has already been discussed in https://github.com/apache/superset/pull/24989

but the problem is remaining :

How to reproduce the bug

Screenshots/recordings

No response

Superset version

3.1.3

Python version

3.9

Node version

16

Browser

Firefox

Additional context

No response

Checklist

dosubot[bot] commented 1 month ago

Hey there, @GillesTo! 👋 I'm here to help you with any software bugs, questions, or contributions while we wait for a human maintainer. Let's squash those bugs together!

To resolve the day offset issue in the calendar heatmap visualization in Apache Superset version 3.1.2, you need to ensure that the timestamp normalization is correctly handled. The issue might be due to the way timestamps are processed and normalized.

In the CalHeatmapViz class, the get_data method processes the timestamps. You should ensure that the timestamps are correctly normalized to avoid any offset issues. Here is the relevant part of the code that you might need to adjust:

class CalHeatmapViz(BaseViz):
    """Calendar heatmap."""

    viz_type = "cal_heatmap"
    verbose_name = _("Calendar Heatmap")
    credits = "<a href=https://github.com/wa0x6e/cal-heatmap>cal-heatmap</a>"
    is_timeseries = True

    @deprecated(deprecated_in="3.0")
    def get_data(self, df: pd.DataFrame) -> VizData:  # pylint: disable=too-many-locals
        if df.empty:
            return None

        form_data = self.form_data
        data = {}
        records = df.to_dict("records")
        for metric in self.metric_labels:
            values = {}
            for query_obj in records:
                v = query_obj[DTTM_ALIAS]
                if hasattr(v, "value"):
                    v = v.value
                # Ensure the timestamp is correctly normalized
                v = pd.to_datetime(v, unit='s').normalize()
                values[str(v.timestamp())] = query_obj.get(metric)
            data[metric] = values

        try:
            start, end = get_since_until(
                relative_start=relative_start,
                relative_end=relative_end,
                time_range=form_data.get("time_range"),
                since=form_data.get("since"),
                until=form_data.get("until"),
            )
        except ValueError as ex:
            raise QueryObjectValidationError(str(ex)) from ex
        if not start or not end:
            raise QueryObjectValidationError(
                "Please provide both time bounds (Since and Until)"
            )
        domain = form_data.get("domain_granularity")
        diff_delta = rdelta.relativedelta(end, start)
        diff_secs = (end - start).total_seconds()

        if domain == "year":
            range_ = end.year - start.year + 1
        elif domain == "month":
            range_ = diff_delta.years * 12 + diff_delta.months + 1
        elif domain == "week":
            range_ = diff_delta.years * 53 + diff_delta.weeks + 1
        elif domain == "day":
            range_ = diff_secs // (24 * 60 * 60) + 1  # type: ignore
        else:
            range_ = diff_secs // (60 * 60) + 1  # type: ignore
        return {
            "data": data,
            "start": start,
            "domain": domain,
            "subdomain": form_data.get("subdomain_granularity"),
            "range": range_,
        }

In addition to the above, you should also check the Calendar function in the Calendar.js file to ensure that the timezone offset is correctly handled. Here is the relevant part of the code:

function Calendar(element, props) {
  const {
    data,
    height,
    cellPadding = 3,
    cellRadius = 0,
    cellSize = 10,
    domainGranularity,
    linearColorScheme,
    showLegend,
    showMetricName,
    showValues,
    steps,
    subdomainGranularity,
    timeFormatter,
    valueFormatter,
    verboseMap,
    theme,
  } = props;

  const container = d3Select(element)
    .classed('superset-legacy-chart-calendar', true)
    .style('height', height);
  container.selectAll('*').remove();
  const div = container.append('div');

  const subDomainTextFormat = showValues
    ? (date, value) => valueFormatter(value)
    : null;

  const metricsData = data.data;

  const METRIC_TEXT = t('Metric');

  Object.keys(metricsData).forEach(metric => {
    const calContainer = div.append('div');
    if (showMetricName) {
      calContainer.text(`${METRIC_TEXT}: ${verboseMap[metric] || metric}`);
    }
    const timestamps = metricsData[metric];
    const extents = d3Extent(Object.keys(timestamps), key => timestamps[key]);
    const step = (extents[1] - extents[0]) / (steps - 1);
    const colorScale = getSequentialSchemeRegistry()
      .get(linearColorScheme)
      .createLinearScale(extents);

    const legend = d3Range(steps).map(i => extents[0] + step * i);
    const legendColors = legend.map(x => colorScale(x));

    const cal = new CalHeatMap();
    cal.init({
      start: data.start,
      data: timestamps,
      itemSelector: calContainer.node(),
      legendVerticalPosition: 'top',
      cellSize,
      cellPadding,
      cellRadius,
      legendCellSize: cellSize,
      legendCellPadding: 2,
      legendCellRadius: cellRadius,
      tooltip: true,
      domain: domainGranularity,
      subDomain: subdomainGranularity,
      range: data.range,
      browsing: true,
      legend,
      legendColors: {
        colorScale,
        min: legendColors[0],
        max: legendColors[legendColors.length - 1],
        empty: theme.colors.grayscale.light5,
      },
      displayLegend: showLegend,
      itemName: '',
      valueFormatter,
      timeFormatter,
      subDomainTextFormat,
    });
  });
}

Calendar.displayName = 'Calendar';
Calendar.propTypes = propTypes;

export default Calendar;

This function initializes the calendar heatmap and ensures that the timestamps are correctly processed and displayed. Make sure this function is correctly implemented in your Calendar.js file to avoid any day offset issues [1][2][3].

To continue talking to Dosu, mention @dosu.

michael-s-molina commented 1 month ago

Hi @GillesTo. Given that we don't have access to this data, could you reproduce the bug using our examples dataset? The test environment for 4.0 is available here. You can create and save a chart demonstrating the problem and share its name here and we'll take a look.