Open ssathi opened 3 years ago
Aggregating negative decimal values produces incorrect results.

table:

Spark SQL query:

After setting snappydata.sql.hashAggregateSize=-1 and snappydata.sql.useOptimizedHashAggregateForSingleKey=false (which disables the optimized hash aggregation), it produces correct values.
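The table definition and query were not preserved in the report above. A minimal hypothetical reproduction of the same pattern, assuming a SnappyData column table with a DECIMAL column holding negative values (the table name, schema, and values here are illustrative, not the reporter's originals):

```sql
-- Hypothetical reproduction sketch; txns and its schema are illustrative.
CREATE TABLE txns (id INT, amount DECIMAL(18,2)) USING column;

INSERT INTO txns VALUES (1, -10.25), (2, -3.75), (3, 5.00);

-- Expected result: -9.00. Per the report, the optimized hash aggregation
-- returned an incorrect value when negative decimals were aggregated.
SELECT SUM(amount) FROM txns;
```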
Thanks for reporting.

Tested this, and it looks to be a bug in the new BufferHashMap-based implementation that reduces memory overhead for large aggregates and DISTINCT. For now you can switch to the older implementation with "set snappydata.sql.optimizedHashAggregate=false", which is as fast as (and in many cases faster than) the newer one, though it may fail for very large aggregation/DISTINCT results. If this works for your use cases, it is much better than turning hash aggregation off completely.
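The suggested property can be set per session before re-running the aggregation; a sketch, reusing the hypothetical table from the reproduction above:

```sql
-- Fall back to the older hash-aggregate implementation for this session;
-- unlike the hashAggregateSize=-1 route, this keeps hash aggregation enabled.
set snappydata.sql.optimizedHashAggregate=false;

-- Re-run the aggregate; the older implementation should return -9.00.
SELECT SUM(amount) FROM txns;
```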