ua-snap / cmip6-utils

Pipelines and utilities for working with CMIP6 data

Fix ftc dtype and nodata & add nodata QC #17

Closed Joshdpaul closed 5 months ago

Joshdpaul commented 5 months ago

This PR closes #11 and closes #16.

There are a few different things I went after here:

First was to fix the assignment of nodata values in the computed indicators. After a marathon huddle with @kyleredilla, we solved this by just moving the .compute() in indicators.py. Nothing raised an obvious error here, but the nodata values (-9999 and np.nan) simply were not being assigned as we assumed they were.

We also figured out how to change the ftc indicator datatype from timedelta to integer, just using a basic astype(int).
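For reference, here is a minimal NumPy sketch of what astype(int) does to timedelta values. The array contents and variable names are made up for illustration, and whether the pipeline's ftc data is held at day or nanosecond resolution is an assumption, not something this PR states. The cast returns the raw count in the array's own unit, so the result is only "number of days" if the unit is days.

```python
import numpy as np

# Hypothetical freeze-thaw counts stored as day-resolution timedeltas.
ftc_days = np.array([3, 0, 12], dtype="timedelta64[D]")

# astype(int) returns the underlying integer count in the array's own unit.
ftc_int = ftc_days.astype(int)

# If the same values were held at nanosecond resolution, a plain cast would
# return nanosecond counts, so divide by one day first to stay in days.
ftc_ns = ftc_days.astype("timedelta64[ns]")
ftc_from_ns = (ftc_ns / np.timedelta64(1, "D")).astype(int)
```

If the values do turn out to be nanosecond-resolution, the division step is what keeps the output in days.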

As for the QC workflow, there are some minor changes to the nesting of QC tasks so that the remaining tasks do not run if the file does not exist or does not open. This cuts down on redundant error messages.
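The nesting idea can be sketched roughly like this. It is an illustration only, not the code in this PR: run_qc, open_file, and checks are hypothetical names, and the real QC tasks are structured differently.

```python
from pathlib import Path


def run_qc(path, open_file, checks):
    """Return a list of error strings for one indicator file.

    open_file and checks are hypothetical callables supplied by the caller:
    open_file(path) returns the opened data, and each check(data) returns
    an error message string, or None if the check passes.
    """
    path = Path(path)

    # Stop early: a missing file yields one error, not one per check.
    if not path.exists():
        return [f"{path}: file does not exist"]

    # Stop early again if the file exists but cannot be opened.
    try:
        data = open_file(path)
    except Exception as exc:
        return [f"{path}: could not be opened ({exc})"]

    # Only now run the content checks.
    errors = []
    for check in checks:
        message = check(data)
        if message:
            errors.append(f"{path}: {message}")
    return errors
```

With this shape, a missing or unreadable file produces a single line in the error log rather than a failure from every downstream check.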

I also added a new indicators/qc.check_nodata_against_inputs() function that works backwards from each indicator filename and uses lookup tables to build the filepaths of the input data. Where there are nodata values in the indicators (-9999 or np.nan, depending on the dtype), there should also be nodata values in the input data. Any discrepancies are written as errors to qc_error.txt. To accomplish this, I also had to revise the Prefect QC task to accept the input data directory as an argument. (There is a companion branch qc_edit and PR #7 in the Prefect repo.)
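The comparison itself boils down to something like the sketch below. It uses plain Python lists instead of gridded arrays, and is_nodata and nodata_mismatches are hypothetical helpers, not the functions in indicators/qc. For simplicity it accepts either nodata marker regardless of dtype.

```python
import math

NODATA_INT = -9999  # integer nodata marker mentioned in this PR


def is_nodata(value):
    """True for NaN (float data) or -9999 (integer data)."""
    if isinstance(value, float) and math.isnan(value):
        return True
    return value == NODATA_INT


def nodata_mismatches(indicator, inputs):
    """Indices where the indicator is nodata but the input is valid."""
    return [
        i
        for i, (ind, inp) in enumerate(zip(indicator, inputs))
        if is_nodata(ind) and not is_nodata(inp)
    ]


# Cell 2 is nodata in the indicator although the input holds a valid value.
mismatches = nodata_mismatches([-9999, 4, -9999], [float("nan"), 1.5, 2.0])
```

The check is one-directional on purpose: a nodata cell in the input with a valid indicator value is not flagged here, since the text above only requires indicator nodata to be backed by input nodata.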

TO TEST:

{
  "ssh_username": "jdpaul3",
  "ssh_private_key_path": "/Users/joshpaul/.ssh/id_rsa",
  "branch_name": "fix_dtype_and_nodata",
  "working_directory": "/import/beegfs/CMIP6/jdpaul3/scratch",
  "indicators": "rx1day dw su ftc",
  "models": "CESM2 GFDL-ESM4 TaiESM1",
  "scenarios": "historical ssp126 ssp245 ssp370 ssp585",
  "input_dir": "/import/beegfs/CMIP6/arctic-cmip6/regrid"
}

PS: I think there are going to be conflicts here, because I merged @BobTorgerson's fix_variable_inputs branch into this one while it was in progress... seemed like a good idea at the time :) Hopefully not too difficult to resolve!