mintproject / MINT-Transformation

MIT License

More friendly version of config files #83

Open minhptx opened 4 years ago

minhptx commented 4 years ago

@raunaqtri1 @binh-vu

We need to integrate our transformation pipelines into [DAME](http://mint-project.info/products/model/#desktop-appliation-for-model-execution-dame), so we need a more user-friendly version of our YAML files that DAME can show to users.

Can we add some new fields to the YAML file?

  1. description: a description of the whole pipeline
  2. a description for each input of the pipeline, since we now have the inputs section

I'll update the list after I get more requests from them.
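To make the request concrete, here is a sketch of what the extended YAML might look like. The per-input description key is hypothetical (it is not part of the current schema), and the inputs layout is guessed from the discussion:

```yaml
version: "1"
# Whole-pipeline description (already supported, see below)
description: Transform ISRIC soil files to rts/rti files for Topoflow.
inputs:
  input_dir:
    # hypothetical per-input key
    description: Directory containing the raw ISRIC soil files
  DEM_bounds:
    # hypothetical per-input key
    description: Bounding box to crop to, as "min_lon, min_lat, max_lon, max_lat"
```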

raunaqtri1 commented 4 years ago

@minhptx, there is already a description key for the whole pipeline. It is optional, which is why it has never been used. For example, you can have a YAML config like this:

version: "1"
description: Transform ISRIC soil files to rts/rti files that can be used as input for Topoflow. Perform cropping, regridding and resampling using GDAL.
adapters:
  tf_soil:
    comment: My topoflow soil write adapter
    adapter: funcs.Topoflow4SoilWriteFunc
    inputs:
      layer: 1
      input_dir: "/tmp/demo/input/isric_soil"
      output_dir: "/tmp/demo/output/isric_soil"
      DEM_bounds: "34.221249999999, 7.362083333332, 36.446249999999, 9.503749999999"
      DEM_xres_arcsecs: "30"
      DEM_yres_arcsecs: "30"
minhptx commented 4 years ago

@raunaqtri1 Thanks. I will update the existing pipelines to include this. How about the second bullet? (just updated)

mosoriob commented 4 years ago

Can you add a dataset field? A link where the user can find the input files. For example, the TopoFlow climate transformation:

description: ""Data transformation to generate TopoFlow-ready precipitation files (RTS) from Global Precipitation Measurement (GPM) data sources" 
datasets: https://data-catalog.mint.isi.edu/datasets/ea0e86f3-9470-4e7e-a581-df85b4a7075d
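Since a pipeline can read from more than one dataset (as noted later in this thread), the proposed datasets field would presumably need to hold a list. A sketch, where the list form is an assumption rather than part of any agreed schema:

```yaml
description: "Data transformation to generate TopoFlow-ready precipitation files (RTS) from Global Precipitation Measurement (GPM) data sources"
datasets:
  - https://data-catalog.mint.isi.edu/datasets/ea0e86f3-9470-4e7e-a581-df85b4a7075d
```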
mosoriob commented 4 years ago

Right now, DAME has only one transformation. If you add the description and the datasets, I can add more. PS: We're late; the deadline for this integration was yesterday.

binh-vu commented 4 years ago

@sirspock There can be multiple datasets in a pipeline. If you want to list all the datasets a pipeline uses, you can iterate through the DcatReadFunc adapters and get the dataset_id from each one.
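The iteration described above can be sketched in Python. The config dict mirrors the YAML structure shown earlier in the thread; the DcatReadFunc adapter name comes from this comment, but the dataset_id input key and the second adapter are assumptions for illustration:

```python
# Sketch: collect dataset IDs from a pipeline config by scanning the
# adapters section for DcatReadFunc entries. The dict mirrors the YAML
# config layout; "dataset_id" is an assumed input name.
config = {
    "version": "1",
    "adapters": {
        "gpm_read": {
            "adapter": "funcs.DcatReadFunc",
            "inputs": {"dataset_id": "ea0e86f3-9470-4e7e-a581-df85b4a7075d"},
        },
        "tf_write": {
            "adapter": "funcs.Topoflow4ClimateWriteFunc",
            "inputs": {"output_dir": "/tmp/demo/output"},
        },
    },
}

def list_dataset_ids(config):
    """Return the dataset_id of every DcatReadFunc adapter in the config."""
    ids = []
    for adapter in config.get("adapters", {}).values():
        if adapter.get("adapter", "").endswith("DcatReadFunc"):
            dataset_id = adapter.get("inputs", {}).get("dataset_id")
            if dataset_id is not None:
                ids.append(dataset_id)
    return ids

print(list_dataset_ids(config))  # ['ea0e86f3-9470-4e7e-a581-df85b4a7075d']
```

DAME could then resolve each ID against the MINT data catalog to show users where the input files come from.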