Closed Sohambutala closed 3 months ago
Attention: Patch coverage is 23.30275% with 418 lines in your changes missing coverage. Please review.
Please upload report for BASE (main@9979b18). Report is 18 commits behind head on main.
Major Upgrades to Echodataflow Pipeline
This PR introduces significant enhancements to the echodataflow pipeline, providing new capabilities and improving existing functionalities.
Highlights:
New Flows Added
- write_output: Directs the current output to a specific location or store.
- mask_prediction: Uses an ML model to predict masks, which are then applied using the apply_mask function.
- compute_NASC: Calculates the Nautical Area Scattering Coefficient (NASC) on the data.
- resample: Resamples the output to a lower or higher resolution and stores it.
- slice_store: Slices the store into time chunks, using Prefect variables to track the last sliced frame index. Currently supports only Zarr stores.
Issue Resolution:
- range_sample is now included in Zarr files.
- Blosc encoding changed to Zlib, which resolved "Intermittent Blosc Decompression Error During Zarr Operations".
Support for New Input:
- storepath in datastore.yaml, instead of the traditional urlpath.
- window_size: Number of samples to slice from a Zarr store (e.g., 500).
- time_travel_hours: Number of hours to go back from the last timestamp in the Zarr store (e.g., 2 hours).
- time_travel_mins: Number of minutes to go back from the last timestamp in the Zarr store (e.g., 30 minutes).
- rolling_size: Number of samples to go back in a Zarr store to set a starting point for a new window (e.g., 100). For non-overlapping windows, set rolling_size equal to window_size.
External Parameters Support:
Added support for external parameters in all flows, enabling more customizable and dynamic workflow configurations. External parameters must match the underlying process function, and echodataflow handles them only if explicitly required for pipeline processing.
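One way to picture the "must match the underlying process function" rule is a signature check before the call. The sketch below is illustrative only: `call_with_external_params` and `resample_stub` are hypothetical names, not part of echodataflow, and the real flows may validate parameters differently.

```python
import inspect
from typing import Any, Callable, Dict


def call_with_external_params(
    func: Callable[..., Any], data: Any, external_params: Dict[str, Any]
) -> Any:
    """Call func(data, **external_params) after checking the names match.

    Hypothetical helper: rejects parameters the function does not declare,
    unless the function accepts arbitrary keyword arguments.
    """
    params = inspect.signature(func).parameters
    accepts_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if not accepts_kwargs:
        unknown = sorted(set(external_params) - set(params))
        if unknown:
            raise TypeError(f"{func.__name__} does not accept: {unknown}")
    return func(data, **external_params)


def resample_stub(data, factor=1):
    """Stand-in for a process function: keep every `factor`-th element."""
    return data[::factor]
```

With this sketch, `call_with_external_params(resample_stub, [1, 2, 3, 4], {"factor": 2})` forwards `factor` to the function, while an undeclared name such as `{"bogus": 1}` raises `TypeError` before the function runs.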
Revamped Logging:
Logging improvements include better tracking, debugging, and overall transparency. Dask logs now maintain order, tracebacks are logged before failures, and log contamination across pipeline runs in a Dask cluster setup is resolved (resolves #102).
Improved Post-Execution Cleanup:
Enhanced cleanup logic to manage and clean up resources after execution, improving reliability and efficiency.
The save_offline flag in pipeline.yaml takes priority in deciding whether to keep intermediate files, while the retention flag acts as the default if save_offline is absent.
Exception Handling:
Added support for internal and external exceptions, clearly distinguishing between soft and hard failures.
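The soft/hard distinction can be sketched as two exception types with different handling. This is an assumption about the intent, not the PR's actual classes: here a soft failure is recorded and the run continues, while a hard failure is left uncaught and stops the run. `SoftFailure`, `HardFailure`, and `run_steps` are hypothetical names.

```python
from typing import Callable, List, Tuple


class SoftFailure(Exception):
    """Hypothetical: a recoverable problem; the step is skipped."""


class HardFailure(Exception):
    """Hypothetical: an unrecoverable problem; the run stops."""


def run_steps(steps: List[Tuple[str, Callable[[], object]]]):
    """Run named steps in order.

    Soft failures are collected and the loop continues. Hard failures
    (and any other exception) are not caught, so they abort the run.
    """
    completed, skipped = [], []
    for name, step in steps:
        try:
            completed.append((name, step()))
        except SoftFailure as exc:
            skipped.append((name, str(exc)))
    return completed, skipped
```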
Resolves #100
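For reviewers, the relationship between window_size and rolling_size listed under Support for New Input can be sketched as follows. This assumes rolling_size acts as the step between consecutive window starts, which is consistent with the note that equal values give non-overlapping windows; the actual slice_store logic (including the Prefect variable that tracks the last sliced index) may differ. `window_bounds` is a hypothetical helper.

```python
from typing import Iterator, Tuple


def window_bounds(
    n_samples: int, window_size: int, rolling_size: int
) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) index pairs for successive full windows.

    Assumption: each new window starts rolling_size samples after the
    previous one, so rolling_size < window_size produces overlap.
    """
    if window_size <= 0 or rolling_size <= 0:
        raise ValueError("window_size and rolling_size must be positive")
    start = 0
    while start + window_size <= n_samples:
        yield start, start + window_size
        start += rolling_size


# Overlapping windows: 500 samples wide, advancing 100 samples each time.
overlapping = list(window_bounds(1200, window_size=500, rolling_size=100))

# Non-overlapping windows: rolling_size equal to window_size.
non_overlapping = list(window_bounds(1200, window_size=500, rolling_size=500))
```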