microsoft / Trill

Trill is a single-node query processor for temporal or streaming data.
MIT License

Fixing Afa operator stale pointers #141

Closed peterfreiling closed 3 years ago

peterfreiling commented 3 years ago

This bug was discovered in a customer job that failed with an Out of Order exception, where the sync time was set to 0. The root cause is that some Afa operators cache pointers to output batch columns in local variables but fail to refresh those cached locals when the output batch is flushed and reallocated.
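
To illustrate the failure mode, here is a minimal sketch in Python rather than Trill's actual C# code. The `Batch` and `Operator` classes, the `refresh_after_flush` flag, and the single `sync_time` column are all hypothetical names made up for this example. It assumes that a freshly allocated batch has zeroed columns, so writes that go through a stale cached reference land in the old batch and the new batch keeps a sync time of 0.

```python
class Batch:
    """Hypothetical output batch with a single sync-time column."""

    def __init__(self, capacity):
        self.sync_time = [0] * capacity  # freshly allocated columns start at 0
        self.count = 0


class Operator:
    """Toy operator that caches the output column in a local variable."""

    def __init__(self, capacity, refresh_after_flush):
        self.capacity = capacity
        self.refresh_after_flush = refresh_after_flush
        self.batch = Batch(capacity)
        self.flushed = []

    def flush(self):
        # Emit the rows written so far, then allocate a new output batch.
        self.flushed.extend(self.batch.sync_time[:self.batch.count])
        self.batch = Batch(self.capacity)

    def process(self, times):
        col = self.batch.sync_time  # cached reference to the output column
        for t in times:
            if self.batch.count == self.capacity:
                self.flush()
                if self.refresh_after_flush:
                    # The fix: re-read the column after the batch is replaced.
                    col = self.batch.sync_time
            col[self.batch.count] = t  # stale `col` writes into the old batch
            self.batch.count += 1
        self.flush()
        return self.flushed


buggy = Operator(2, refresh_after_flush=False).process([5, 6, 7, 8])
fixed = Operator(2, refresh_after_flush=True).process([5, 6, 7, 8])
print(buggy)  # [5, 6, 0, 0]
print(fixed)  # [5, 6, 7, 8]
```

In the buggy run the emitted sync times drop from 6 back to 0 after the first flush, which is the kind of regression an out-of-order check would reject.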

For ASA to hit this bug, many conditions must all be satisfied, which explains why it took hours to reproduce it for the customer job.

Based on code inspection, the Afa operators are the only ones that have this bug.