OSOceanAcoustics / echodataflow

Orchestrated sonar data processing workflow
https://echodataflow.readthedocs.io/en/latest/
MIT License

Config specifications #3

Closed lsetiawan closed 1 year ago

lsetiawan commented 1 year ago

Overview

We need a mechanism for passing specifications to the underlying processing that runs after stage 0. https://github.com/OSOceanAcoustics/echopype/pull/817 has the proposed subpackages, so below is the proposed specification:

Spec

name: (str) Name of the full pipeline; must be unique
sonar_model: (str) The sonar model, can only be the available ones in echopype
raw_regex: (str) The regular expression of the pattern for raw files
args: # input arguments for raw files
  urlpath: (str) urlpath to the input raw file... can be a jinja template whose values are retrieved from parameters below
  parameters: {} # (dict) Set default parameter values as found in urlpath
  storage_options: {} # (dict) fsspec filesystem storage options for the source
  transect: # **Optional** field, if exists it indicates that converted files should be organized by transect (yyyy/transect)
    file: (str) urlpath to transect files
    storage_options: {} # (dict) fsspec filesystem storage options for the transect files
output:
  urlpath: (str) The urlpath to the output converted raw file
  storage_options: {} # (dict) fsspec filesystem storage options for the output where files will be stored
  overwrite: (bool) Flag to allow for overwriting or not

# Below are echopype specifications; if not provided, defaults will be used
echopype:
  consolidate:
    add_location: {}
    add_splitbeam_angle: {}
  calibrate:
    compute_Sv:
      env_params: {}
      cal_params: {}
      waveform_mode: {}
    compute_TS: {}
  filter:
    median: {}
    conv: {}
    remove_noise: {}
    noise:
      estimate_noise: {}
      mean_bkg: {}
      spike: {}
  unify:
    compute_MVBS: {}
    compute_MVBS_index_binning: {}
    regrid_Sv: {}
  mask:
    from_Sv:
      freq_diff: {}
    from_labels:
      boundary: {}
      region: {}
  metrics:
    summary statistics: {}
    compute_NASC: {}
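
To make the intent of the spec concrete, here is a rough sketch of how a loaded config could be resolved: check the top-level keys, fill the urlpath template from `parameters`, and fall back to defaults for any echopype options that are not provided. Everything named here is an assumption for illustration and not part of echopype or this repo: `REQUIRED_KEYS`, `DEFAULT_ECHOPYPE`, `render_urlpath`, `deep_merge`, and `resolve_config` are hypothetical, the spec does not actually say which keys are required, and the `{{ key }}` substitution is a small regex stand-in rather than real Jinja rendering. The config is written as a plain dict, as it might look after parsing the YAML.

```python
import copy
import re

# Hypothetical defaults for illustration; real defaults would come from echopype.
DEFAULT_ECHOPYPE = {
    "calibrate": {
        "compute_Sv": {"env_params": {}, "cal_params": {}, "waveform_mode": {}},
    },
    "unify": {"compute_MVBS": {}},
}

# Assumption: these top-level keys are treated as required in this sketch.
REQUIRED_KEYS = ("name", "sonar_model", "raw_regex", "args", "output")


def render_urlpath(template, parameters):
    """Fill ``{{ key }}`` placeholders from ``parameters`` (regex stand-in for Jinja)."""
    def substitute(match):
        return str(parameters[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


def deep_merge(defaults, overrides):
    """Return a copy of ``defaults`` with ``overrides`` applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_config(config):
    """Check required keys, render the input urlpath, and fill echopype defaults."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    resolved = copy.deepcopy(config)  # leave the caller's config untouched
    args = resolved["args"]
    args["urlpath"] = render_urlpath(args["urlpath"], args.get("parameters", {}))
    resolved["echopype"] = deep_merge(DEFAULT_ECHOPYPE, config.get("echopype", {}))
    return resolved


# All values below are made up for the example.
example = {
    "name": "demo_pipeline",
    "sonar_model": "EK60",
    "raw_regex": r".*\.raw",
    "args": {
        "urlpath": "s3://example-bucket/{{ year }}/{{ survey }}/*.raw",
        "parameters": {"year": 2017, "survey": "demo"},
        "storage_options": {"anon": True},
    },
    "output": {"urlpath": "./output", "storage_options": {}, "overwrite": True},
    "echopype": {"calibrate": {"compute_Sv": {"waveform_mode": "CW"}}},
}

resolved = resolve_config(example)
```

With this, `resolved["args"]["urlpath"]` has the placeholders filled in from `parameters`, and `resolved["echopype"]` keeps the user-provided `waveform_mode` while picking up the default entries for everything else.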

Sohambutala commented 1 year ago

Closing this, since it is linked with [Separating Dataset and Pipeline configuration yaml files](https://github.com/OSOceanAcoustics/echoflow/issues/11)