Closed jfmcquade closed 1 week ago
Looks fine to me, although still have the issue in the back of my mind as to whether it actually makes sense to use the args list this way (given we have the input_source column), particularly given potential long-form args_list syntax in the future, e.g.
merge_source: list_b; join_type: outer
https://chat.idems.international/all/pl/bji1h4pxbfbzdeiso9pzcx46re
Thanks, we discussed this on a call and decided that it would be better not to support the authoring style of providing a list of data sources in the args_list
and no data source in the input_source
, for the reasons you have given. I have reworked this PR, see the updated description.
We noted that the Data Pipelines RFC includes the proposal to support an explicit key/value syntax for the arguments passed to the args_list
, which we approved of.
We also noted that it might make sense for cases like the merge
operator to support passing a list of input sources to the input_source
column (although not to the args_list
column). And also that there may be good reasons to also support a list of outputs to the output_target
column. I'm not sure where the best place to capture this is, potentially as part of that RFC, or I could write a new issue.
https://github.com/IDEMSInternational/open-app-builder/issues/2341 also suggests the existence of another possible issue to do with applying the merge operator to data lists that contain nested JSON data. I think it is sensible to handle this case as a follow-up if it does persist after applying the changes of this PR. In that case, a new debug sheet that exhibits the broken behaviour would be required.
Debug sheet: example_data_pipe_merge_nested which merges a data list with a translatable column (which is nested as .json)
with another data list. Syncing this gives the error:
I think this is good to merge - I'll address the specific issue raised by @esmeetewinkel in a follow-up PR
PR Checklist
Description
The data pipe
merge
fails and throws an unhandled error in the case that noinput_source
is provided. This PR handles that case by throwing a custom error in that case.Also tidies some of the pre-existing data pipe operator tests (thanks, @chrismclarke).
2341 also suggests the existence of another possible issue to do with applying the merge operator to data lists that contain nested JSON data. I think it is sensible to handle this case as a follow-up if it does persist after applying the changes of this PR. In that case, a new debug sheet that exhibits the broken behaviour would be required.
Testing
See example_data_pipe_merge. Run a sync and ensure no errors are thrown, then inspect the outputs of the data pipe to confirm they are as expected.
Git Issues
Closes #2341
Screenshots/Videos
Attempting to run the data pipe as defined by example_data_pipe_merge in the following state:
now logs the following error:
Passing tests