uwdata / arquero

Query processing and transformation of array-backed data tables.
https://idl.uw.edu/arquero
BSD 3-Clause "New" or "Revised" License

arquero op.first_value function issue #285

Status: Closed (novotny1akub closed this issue 1 year ago)

novotny1akub commented 1 year ago

I am having trouble using op.first_value from Arquero. The example below demonstrates the issue. If I replace .rollup({first_value: op.first_value('value')}) with, for example, .rollup({sum: op.sum('value')}), it works fine, so I suspect a bug in op.first_value, since the other aggregate functions work as expected.

import { aq, op } from '@uwdata/arquero';

aq.table({
    group: ['a', 'a', 'b', 'b', 'b', 'c', 'c'],
    value: [5, 4, 7, 6, 9, 3, 8]
  })
  .groupby('group')
  .rollup({first_value: op.first_value('value')})

jheer commented 1 year ago

Hi, first_value is a window function, not an aggregate function, and so only applies in contexts that support ordered windows of values (e.g., derive, but not rollup). The correct behavior is to throw an error when invoking a window function in a non-window context.

The error message, however, should have been much clearer. Your post helped me to find a bug regarding error checking that I've fixed in #288. Thanks!
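
Editor's note: the aggregate versus window distinction described in the reply above can be illustrated with a small plain-JavaScript sketch. This is not Arquero's implementation, and the helper names (groupRows, rollupSum, deriveFirstValue) are hypothetical, introduced only for illustration. The point is that an aggregate collapses each group to a single row, while a window function yields one value per input row. In Arquero itself, the analogous call would presumably be something like .groupby('group').derive({first_value: d => op.first_value(d.value)}), though that exact line does not appear in the thread and is an assumption.

```javascript
// Plain-JavaScript model of aggregate vs. window semantics.
// NOT Arquero's implementation; all helper names are hypothetical.
const rows = [
  { group: 'a', value: 5 }, { group: 'a', value: 4 },
  { group: 'b', value: 7 }, { group: 'b', value: 6 }, { group: 'b', value: 9 },
  { group: 'c', value: 3 }, { group: 'c', value: 8 },
];

// Partition rows by key, preserving input order within each group.
function groupRows(data, key) {
  const groups = new Map();
  for (const row of data) {
    if (!groups.has(row[key])) groups.set(row[key], []);
    groups.get(row[key]).push(row);
  }
  return groups;
}

// Aggregate (rollup-like): one output row per group.
function rollupSum(data, key, field) {
  return [...groupRows(data, key)].map(([k, members]) => ({
    [key]: k,
    sum: members.reduce((acc, r) => acc + r[field], 0),
  }));
}

// Window (derive-like): one output row per input row, each annotated
// with the first value that appears in its group.
function deriveFirstValue(data, key, field) {
  const groups = groupRows(data, key);
  return data.map(r => ({ ...r, first_value: groups.get(r[key])[0][field] }));
}

const sums = rollupSum(rows, 'group', 'value');          // 3 rows, one per group
const firsts = deriveFirstValue(rows, 'group', 'value'); // 7 rows, one per input row
```

Because the window version needs a notion of "first" within each group, it depends on row order, which is why such functions belong in an order-aware context rather than in a rollup.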

novotny1akub commented 1 year ago

Thanks a lot for the explanation, @jheer.