Open GillesTo opened 6 months ago
Hey there, @GillesTo! 👋 I'm here to help you with any software bugs, questions, or contributions while we wait for a human maintainer. Let's squash those bugs together!
To resolve the day offset issue in the calendar heatmap visualization in Apache Superset version 3.1.2, you need to ensure that the timestamp normalization is correctly handled. The issue might be due to the way timestamps are processed and normalized.
In the CalHeatmapViz class, the get_data method processes the timestamps. You should ensure that the timestamps are correctly normalized to avoid any offset issues. Here is the relevant part of the code that you might need to adjust:
class CalHeatmapViz(BaseViz):
    """Calendar heatmap."""

    viz_type = "cal_heatmap"
    verbose_name = _("Calendar Heatmap")
    credits = "<a href=https://github.com/wa0x6e/cal-heatmap>cal-heatmap</a>"
    is_timeseries = True

    @deprecated(deprecated_in="3.0")
    def get_data(self, df: pd.DataFrame) -> VizData:  # pylint: disable=too-many-locals
        if df.empty:
            return None

        form_data = self.form_data
        data = {}
        records = df.to_dict("records")
        for metric in self.metric_labels:
            values = {}
            for query_obj in records:
                v = query_obj[DTTM_ALIAS]
                if hasattr(v, "value"):
                    v = v.value
                # Timestamp.value is nanoseconds since the epoch, so parse it
                # with unit="ns" and then normalize to midnight.
                v = pd.to_datetime(v, unit="ns").normalize()
                values[str(v.timestamp())] = query_obj.get(metric)
            data[metric] = values

        try:
            start, end = get_since_until(
                relative_start=relative_start,
                relative_end=relative_end,
                time_range=form_data.get("time_range"),
                since=form_data.get("since"),
                until=form_data.get("until"),
            )
        except ValueError as ex:
            raise QueryObjectValidationError(str(ex)) from ex
        if not start or not end:
            raise QueryObjectValidationError(
                "Please provide both time bounds (Since and Until)"
            )

        domain = form_data.get("domain_granularity")
        diff_delta = rdelta.relativedelta(end, start)
        diff_secs = (end - start).total_seconds()

        if domain == "year":
            range_ = end.year - start.year + 1
        elif domain == "month":
            range_ = diff_delta.years * 12 + diff_delta.months + 1
        elif domain == "week":
            range_ = diff_delta.years * 53 + diff_delta.weeks + 1
        elif domain == "day":
            range_ = diff_secs // (24 * 60 * 60) + 1  # type: ignore
        else:
            range_ = diff_secs // (60 * 60) + 1  # type: ignore

        return {
            "data": data,
            "start": start,
            "domain": domain,
            "subdomain": form_data.get("subdomain_granularity"),
            "range": range_,
        }
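The normalization step can be checked in isolation. The following is a minimal sketch using only the standard library rather than pandas. It assumes the incoming value is an epoch count in nanoseconds (as `pandas.Timestamp.value` is) and that the chart keys are UTC-midnight epochs in seconds. The helper name `utc_midnight_key` is hypothetical and not part of Superset.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 10**9
SECONDS_PER_DAY = 24 * 60 * 60


def utc_midnight_key(epoch_ns: int) -> str:
    """Return the UTC-midnight epoch (in seconds) of a nanosecond timestamp as a string key."""
    epoch_s = epoch_ns // NANOS_PER_SECOND
    # Floor to the start of the UTC day that contains this instant.
    midnight_s = epoch_s - (epoch_s % SECONDS_PER_DAY)
    return str(midnight_s)


# 2024-05-31 22:30:00 UTC, expressed in nanoseconds since the epoch.
sample_ns = int(datetime(2024, 5, 31, 22, 30, tzinfo=timezone.utc).timestamp()) * NANOS_PER_SECOND

key = utc_midnight_key(sample_ns)
# Convert the key back to a calendar day in UTC to confirm it did not move.
day = datetime.fromtimestamp(int(key), tz=timezone.utc).date().isoformat()
print(key, day)
```

If the day printed here matches the day stored in the database while the chart shows the previous day, the shift is more likely introduced on the rendering side than in the key computation.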
In addition to the above, you should also check the Calendar function in the Calendar.js file to ensure that the timezone offset is correctly handled. Here is the relevant part of the code:
function Calendar(element, props) {
  const {
    data,
    height,
    cellPadding = 3,
    cellRadius = 0,
    cellSize = 10,
    domainGranularity,
    linearColorScheme,
    showLegend,
    showMetricName,
    showValues,
    steps,
    subdomainGranularity,
    timeFormatter,
    valueFormatter,
    verboseMap,
    theme,
  } = props;

  const container = d3Select(element)
    .classed('superset-legacy-chart-calendar', true)
    .style('height', height);
  container.selectAll('*').remove();
  const div = container.append('div');

  const subDomainTextFormat = showValues
    ? (date, value) => valueFormatter(value)
    : null;

  const metricsData = data.data;

  const METRIC_TEXT = t('Metric');

  Object.keys(metricsData).forEach(metric => {
    const calContainer = div.append('div');
    if (showMetricName) {
      calContainer.text(`${METRIC_TEXT}: ${verboseMap[metric] || metric}`);
    }
    const timestamps = metricsData[metric];
    const extents = d3Extent(Object.keys(timestamps), key => timestamps[key]);
    const step = (extents[1] - extents[0]) / (steps - 1);
    const colorScale = getSequentialSchemeRegistry()
      .get(linearColorScheme)
      .createLinearScale(extents);

    const legend = d3Range(steps).map(i => extents[0] + step * i);
    const legendColors = legend.map(x => colorScale(x));

    const cal = new CalHeatMap();
    cal.init({
      start: data.start,
      data: timestamps,
      itemSelector: calContainer.node(),
      legendVerticalPosition: 'top',
      cellSize,
      cellPadding,
      cellRadius,
      legendCellSize: cellSize,
      legendCellPadding: 2,
      legendCellRadius: cellRadius,
      tooltip: true,
      domain: domainGranularity,
      subDomain: subdomainGranularity,
      range: data.range,
      browsing: true,
      legend,
      legendColors: {
        colorScale,
        min: legendColors[0],
        max: legendColors[legendColors.length - 1],
        empty: theme.colors.grayscale.light5,
      },
      displayLegend: showLegend,
      itemName: '',
      valueFormatter,
      timeFormatter,
      subDomainTextFormat,
    });
  });
}

Calendar.displayName = 'Calendar';
Calendar.propTypes = propTypes;

export default Calendar;
This function initializes the calendar heatmap and ensures that the timestamps are correctly processed and displayed. Make sure this function is correctly implemented in your Calendar.js file to avoid any day offset issues [1][2][3].
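One way a timezone offset can move a value to the previous day is sketched below: a key that represents midnight UTC falls on the previous calendar day when it is interpreted in a timezone west of UTC. This is only an illustration of the mechanism, written in Python with the standard library. The UTC-5 viewer offset is a hypothetical choice, and the thread does not confirm that this is the cause of the offset reported here.

```python
from datetime import datetime, timedelta, timezone

# Epoch seconds for 2024-05-31 00:00:00 UTC, the kind of key a day bucket might carry.
utc_midnight = datetime(2024, 5, 31, tzinfo=timezone.utc)
epoch_s = int(utc_midnight.timestamp())

# Hypothetical viewer offset (UTC-5); any zone west of UTC behaves the same way.
viewer_tz = timezone(timedelta(hours=-5))

# The same instant, read as a calendar day in two different zones.
day_in_utc = datetime.fromtimestamp(epoch_s, tz=timezone.utc).date()
day_for_viewer = datetime.fromtimestamp(epoch_s, tz=viewer_tz).date()

print(day_in_utc.isoformat())
print(day_for_viewer.isoformat())
```

The two printed dates differ by one day even though the underlying instant is identical, so comparing the server timezone, the database timezone, and the browser timezone is a reasonable first diagnostic.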
Hi @GillesTo. Given that we don't have access to this data, could you reproduce the bug using our examples dataset? The test environment for 4.0 is available here. You can create and save a chart demonstrating the problem and share its name here and we'll take a look.
Bug description
This problem was already discussed in https://github.com/apache/superset/pull/24989
but it still occurs:
I'm using Superset 3.1.2, installed in a container by following https://superset.apache.org/docs/quickstart
In this case I count the number of rows by "created_at" for each day. As the picture shows, I expect a count of 6 for 2024-05-31.
The result does show a count of 6, but it is attributed to 2024-05-30 instead of 2024-05-31.
How to reproduce the bug
The data are:
collected by a Python program and stored in a dataframe with this command: df['created_at'] = pd.to_datetime(df['created_at'], errors='coerce', utc=True)
then the dataframe is written to a PostgreSQL database using this call: df.to_sql(table_name, engine, index=False, if_exists='replace')
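To establish, independently of Superset, what the per-day counts should be, the rows can be bucketed by UTC calendar day before they are loaded. The sketch below uses the standard library instead of pandas. The sample timestamps are made up for illustration and are not the reporter's data, and `count_per_utc_day` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime, timezone

# Made-up sample values, not the reporter's data.
raw_created_at = [
    "2024-05-30T23:15:00+00:00",
    "2024-05-31T00:05:00+00:00",
    "2024-05-31T01:30:00+02:00",  # equals 2024-05-30 23:30 UTC
    "2024-05-31T12:00:00+00:00",
]


def count_per_utc_day(values):
    """Count rows per calendar day after converting each timestamp to UTC."""
    counts = Counter()
    for value in values:
        instant = datetime.fromisoformat(value).astimezone(timezone.utc)
        counts[instant.date().isoformat()] += 1
    return dict(counts)


expected = count_per_utc_day(raw_created_at)
print(expected)
```

Comparing these counts with what the calendar heatmap shows makes it clear whether the values are correct and only the day label has shifted.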
In Superset:
Screenshots/recordings
No response
Superset version
3.1.3
Python version
3.9
Node version
16
Browser
Firefox
Additional context
No response