We are already able to compile a sequential script for time density, e.g.:
https://github.com/wildlife-dynamics/ecoscope-workflows/blob/main/examples/dags/scripts-sequential/time_density_script_sequential.py
The next step is to set up end-to-end testing for compiled scripts like this one.
This entails:
- Setting up testing for compiled scripts (i.e., compiling and running them)
- Deciding where the outputs of these scripts should be stored (this probably requires a `return_postvalidator` for the final stage of the script, which specifies a serialization target)
- A test harness for loading results and making assertions against them (see the pytest sketch at the end of this issue)
- Introducing the notion of a "non-io" workflow that loads a dataset from parquet (i.e., substituting a `gdf_from_parquet` distributed task), so we can run these tests quickly and reproducibly (see the sketch just below this list)
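For the "non-io" substitution, something like the following could work. This is a minimal sketch only: the function name `gdf_from_parquet` comes from this issue, but the `path` parameter and the omission of any task-registration/decorator machinery are assumptions.

```python
from pathlib import Path

import geopandas as gpd


def gdf_from_parquet(path: str | Path) -> gpd.GeoDataFrame:
    """Load a GeoDataFrame from a local (Geo)Parquet fixture.

    Substituting this for the IO-backed task at the head of the workflow
    lets the compiled script run against a small, checked-in file instead
    of an external data source, keeping the test fast and reproducible.
    """
    # In the real repo this would be registered as a distributed task;
    # that wiring is intentionally left out of this sketch.
    return gpd.read_parquet(path)
```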
Note: The "in current state" in the title of this workflow refers to the fact that I want to test the compilation spec as it currently exists in the repo (without a visual output, just a dataframe output).
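As for the test harness itself, here is a rough sketch of what an end-to-end test could look like, with two loudly stated caveats: the mechanism for pointing the final stage at a serialization target is faked with a hypothetical environment variable (pending the `return_postvalidator` idea above), and the assertions on the output are illustrative only.

```python
import os
import subprocess
import sys
from pathlib import Path

import geopandas as gpd
import pytest

# Path of the compiled sequential script linked above, relative to the repo root.
SCRIPT = Path("examples/dags/scripts-sequential/time_density_script_sequential.py")


@pytest.mark.skipif(not SCRIPT.exists(), reason="compiled script not present")
def test_time_density_script_sequential_end_to_end(tmp_path: Path) -> None:
    out = tmp_path / "time_density.parquet"

    # Run the compiled script in a subprocess; the env var used to tell the
    # final stage where to serialize its result is a placeholder for whatever
    # mechanism we actually settle on.
    result = subprocess.run(
        [sys.executable, str(SCRIPT)],
        env={**os.environ, "ECOSCOPE_WORKFLOWS_RESULTS": str(out)},  # hypothetical knob
        capture_output=True,
        text=True,
    )
    assert result.returncode == 0, result.stderr

    # Load the serialized dataframe output and make assertions against it.
    gdf = gpd.read_parquet(out)
    assert not gdf.empty
    assert "geometry" in gdf.columns  # illustrative assertion only
```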