Open DrNickClarke opened 2 months ago
A simple example would be to get the max value in a column without reading all the data.
Missing data (NaNs) should be ignored.
The current workaround is to create a synthetic column with a fixed value, group by that column, and then apply the aggregator.
This works well but the syntax is not clear enough.
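An example of the workaround is:

    import numpy as np
    import pandas as pd
    import arcticdb as adb

    # `lib` is assumed to be an existing ArcticDB library handle
    np.random.seed(13)
    qb_whole_col_df = pd.DataFrame(data={'val': np.random.uniform(0., 100., 25)})
    qb_whole_col_sym = 'qb_whole_col_sym'
    lib.write(qb_whole_col_sym, qb_whole_col_df)

    # Add a constant 'zero' column so the groupby produces a single group
    q_wc = adb.QueryBuilder()
    q_wc = q_wc.apply('zero', q_wc['val']*0).groupby('zero').agg({'val': 'max'})
    lib.read(qb_whole_col_sym, query_builder=q_wc).data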
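For reference, the result being asked for can be sketched in plain Python, independent of ArcticDB. This is only an illustration of the intended semantics (one group covering the whole column, NaNs skipped), not how the library implements it. The function name `column_max_ignoring_nans` is hypothetical, and returning `None` when no non-NaN values remain is an assumption made for the sketch.

```python
import math


def column_max_ignoring_nans(values):
    """Return the max of `values`, skipping NaNs; None if nothing remains."""
    # Mimic the workaround: every row gets the same synthetic key (0),
    # so grouping yields a single group that covers the whole column.
    groups = {}
    for v in values:
        groups.setdefault(0, []).append(v)

    # Drop missing values before aggregating (assumed desired behaviour).
    present = [v for v in groups.get(0, []) if not math.isnan(v)]
    return max(present) if present else None


print(column_max_ignoring_nans([3.5, float("nan"), 42.0, 7.25]))
```

Filtering NaNs explicitly matters here because Python's built-in `max` gives order-dependent results when NaN is present.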
In future we will make this possible with cleaner syntax.