USGS-R / drb-estuary-salinity-ml


Begin to integrate IT analysis data prep into pipeline #35

Closed · galengorski closed this 2 years ago

galengorski commented 2 years ago

This is my first attempt at organizing the IT analysis functions to get them integrated into a pipeline framework, so we might have to iterate to find a good solution here. The way I was thinking of setting it up is with an it_analysis_data_prep.py script that takes its input from the it_analysis_data_prep_config.yaml file, where:

  1. sources and sinks are selected based on column names
  2. the number of lag days is selected
  3. the sinks and lagged sources are written to 03_it_analysis/src/out as a pickle file (see the sketch after this list)
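
Here's a rough, untested sketch of what that data prep step could look like for a single site; the config keys (`sources`, `sinks`, `n_lag_days`, `out_file`) and the `datetime` index column are placeholder assumptions, not the final yaml layout:

```python
import os
import pickle

import pandas as pd
import yaml


def prep_it_analysis_data(input_csv, config_path):
    """Select source/sink columns, build lagged sources, and pickle the result."""
    with open(config_path) as f:
        config = yaml.safe_load(f)

    # placeholder assumption: input is a csv indexed by a 'datetime' column
    df = pd.read_csv(input_csv, parse_dates=['datetime'], index_col='datetime')

    # 1. sources and sinks selected by column name
    sources = df[config['sources']]
    sinks = df[config['sinks']]

    # 2. lag each source column by 0..n_lag_days
    lagged_sources = {
        col: pd.concat(
            {lag: sources[col].shift(lag) for lag in range(config['n_lag_days'] + 1)},
            axis=1,
        )
        for col in sources.columns
    }

    # 3. write the sinks and lagged sources to 03_it_analysis/src/out as a pickle
    out_path = os.path.join('03_it_analysis', 'src', 'out', config['out_file'])
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, 'wb') as f:
        pickle.dump({'sinks': sinks, 'lagged_sources': lagged_sources}, f)

    return out_path
```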

Those lagged sources and sinks are stored in lists (one item for each site/model run), and they will be read in by the next set of functions, which will calculate the correlations and plot the heat maps. I think the way I'll initially set it up is to use the sinks and sources for correlation, mutual information, and transfer entropy, each with its own config.yaml file, and have each function output a matrix of the same shape to be used for heatmap plotting.
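
As a hypothetical illustration of that pattern, a correlation version might look like the sketch below; the mutual information and transfer entropy functions would read the same pickle and return a matrix of the same shape (sources as rows, lags as columns), so a single heatmap function can plot any of them. Function names, config handling, and dictionary keys here are placeholders:

```python
import pickle

import matplotlib.pyplot as plt
import pandas as pd


def calc_correlation_matrix(prepped_pickle_path, sink_col):
    """Correlation between one sink and every lag of every source.

    Returns a DataFrame with sources as rows and lags as columns; the
    mutual information and transfer entropy versions would return the
    same shape so one heatmap function can plot any of them.
    """
    with open(prepped_pickle_path, 'rb') as f:
        prepped = pickle.load(f)

    sink = prepped['sinks'][sink_col]
    rows = {}
    for source, lagged in prepped['lagged_sources'].items():
        # one value per lag column (0..n_lag_days)
        rows[source] = lagged.apply(lambda col: col.corr(sink))
    return pd.DataFrame(rows).T


def plot_metric_heatmap(metric_matrix, title=None):
    """Plot any of the source-by-lag matrices produced above."""
    fig, ax = plt.subplots()
    im = ax.imshow(metric_matrix.values, aspect='auto')
    ax.set_xticks(range(metric_matrix.shape[1]))
    ax.set_xticklabels([str(c) for c in metric_matrix.columns])
    ax.set_yticks(range(metric_matrix.shape[0]))
    ax.set_yticklabels(metric_matrix.index)
    ax.set_xlabel('lag (days)')
    if title:
        ax.set_title(title)
    fig.colorbar(im, ax=ax)
    return ax
```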