edanalytics / earthmover

CLI tool for transforming collections of tabular source data into a variety of text-based data formats via YAML configuration and Jinja templates.
Apache License 2.0

Feature: `json_array_agg` #112

johncmerfeld closed 3 months ago

johncmerfeld commented 3 months ago

Pertaining to this ticket, adds a `json_array_agg` function to the `group_by` operator so that users can generate JSON arrays in a single step.
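To illustrate the intended behavior, here is a minimal standalone sketch of collecting one column into a JSON array per group. It is not earthmover's implementation: the function name `json_array_agg_sketch`, the sample rows, and the column names are all made up for illustration, and the real feature is configured through YAML rather than called like this.

```python
import json
from collections import defaultdict


def json_array_agg_sketch(rows, group_key, value_key):
    """Group rows by group_key and serialize each group's value_key
    entries as a JSON array string (hypothetical helper)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_key]].append(row[value_key])
    # One JSON array string per group, preserving input order.
    return {key: json.dumps(values) for key, values in grouped.items()}


# Made-up sample data, for illustration only.
rows = [
    {"family": "felines", "name": "lion"},
    {"family": "felines", "name": "tiger"},
    {"family": "canines", "name": "wolf"},
]
result = json_array_agg_sketch(rows, "family", "name")
```

Each value in `result` is a string that parses back into a list, so it can be dropped into a JSON template without a second pass to build the array.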

Adds a small test case to `earthmover -t` that validates this operation. Also fixes two small bugs: the test code failed if there was no outputs directory present, and some data was missing from the animals output.

Additionally, adds information in the README about `earthmover init` and `earthmover clean`, which I neglected to do earlier.

Discussion

The way `json_array_agg` uses `_get_agg_lambda`'s `separator` argument is questionable. I think it's best to leave that argument named `separator` for the time being instead of thinking of a more generalizable name and/or structure for the `_sep` object. I've tried to document this quirk, but if reviewers think this is an overload gone too far, I can try to come up with something less ambiguous.
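For reviewers weighing the overload, the following sketch shows the general shape of the concern. It is a hypothetical stand-in, not the code in this PR: the function `get_agg_lambda_sketch`, the `"concat"` aggregation name, and the assumption that the `separator` slot carries a `"str"` option for `json_array_agg` are all invented for illustration.

```python
import json


def get_agg_lambda_sketch(agg_type, separator=""):
    """Hypothetical dispatcher that returns an aggregation callable."""
    if agg_type == "concat":
        # Conventional use: `separator` really is a join separator.
        return lambda values: separator.join(str(v) for v in values)
    if agg_type == "json_array_agg":
        # Overloaded use (assumption): the same slot carries an option,
        # here "str", asking for every element to be cast to a string.
        if separator == "str":
            return lambda values: json.dumps([str(v) for v in values])
        return lambda values: json.dumps(list(values))
    raise ValueError(f"unknown aggregation: {agg_type}")


joined = get_agg_lambda_sketch("concat", ",")([1, 2, 3])
as_numbers = get_agg_lambda_sketch("json_array_agg")([1, 2, 3])
as_strings = get_agg_lambda_sketch("json_array_agg", "str")([1, 2, 3])
```

The ambiguity is that one parameter name covers two unrelated meanings; a less ambiguous alternative would be a separate keyword argument for the option, at the cost of a wider signature.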