Set non-trivial sunlit flag in example data for better test coverage

brhillman commented 4 years ago

Previously the example data that is used for the example cosp2_test driver and for the regression tests contained a trivial sunlit flag that designated all columns as daytime columns. This meant that regression tests were not properly testing fields that should be masked for night columns. This PR modifies the existing input data (cosp_input_um.nc) to set the second half of the columns to night by modifying the sunlit flag via ncap2 -s 'sunlit(0:617) = 0;' cosp_input_um.nc cosp_input_um.nc). In order to include both sunlit and non-sunlit columns, the input namelist is also modified to process all points in the input file (previously it only processed 153 of the 1236 points. Runtime is certainly increased, but the cost is still trivial (1 second total time vs 0.2s total time in my tests on my system). Note that these changes will result in differences from baselines for any field that is affected by sunlit if the number of points processed includes the night points (which is set by default now in the namelist), but bit-for-bit can be confirmed by setting Npoints back to 153. Fixes #35

brhillman commented 4 years ago

@dustinswales changes to add night columns to the input data. Note that I initially chose to set the first half to night, but on second thought I decided to set the second half to night and leave the first half alone. This means that if you run with 153 points (the old default), you can confirm that there are no non-BFB changes in the code, but if you keep the new default of Npoints = 1236 you can see that we are now testing day/night masking because passive simulator outputs have diffs. I.e., if I set Npoints = 1236 in master (and leave it set to 1236 in this branch), and run the regression test, I get the following:

(python2) [bhillma@ghost-login2 driver (add-sunlit-test)]$ ( python ../driver/test_cosp2Imp.py ../driver/data/outputs/UKMO/cosp2_output_um.nc ${reference} )
############################################################################################
Treating relative differences less than 0.0010000000% as insignificant
  cltisccp:            50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  pctisccp:            34.30 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  tauisccp:            34.30 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  albisccp:            34.30 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  boxtauisccp:         29.94 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  boxptopisccp:        29.94 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  clisccp:             50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  clMISR:               1.04 % of values differ, relative range:   1.00e+00 to  1.00e+00
  misr_meanztop:       50.00 % of values differ, relative range:  -1.00e+00 to  1.00e+00
  misr_cldarea:        50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  cltmodis:            50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  clwmodis:            50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  climodis:            50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  clhmodis:            50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  clmmodis:            50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  cllmodis:            50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  tautmodis:           33.66 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  tauwmodis:           26.38 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  tauimodis:           15.53 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  tautlogmodis:        33.66 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  tauwlogmodis:        26.38 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  tauilogmodis:        15.53 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  reffclwmodis:        26.38 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  reffclimodis:        15.53 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  pctmodis:            50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  lwpmodis:            26.38 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  iwpmodis:            15.53 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  clmodis:             50.00 % of values differ, relative range:  -1.00e+00 to -1.00e+00
  modis_Optical_Thickness_vs_ReffICE:      0.74 % of values differ, relative range:   1.00e+00 to  1.00e+00
  modis_Optical_Thickness_vs_ReffLIQ:      1.87 % of values differ, relative range:   1.00e+00 to  1.00e+00
  npdfcld:              0.89 % of values differ, relative range:   1.00e+00 to  1.00e+00
  npdfdrz:              0.57 % of values differ, relative range:   1.00e+00 to  1.00e+00
  npdfrain:             0.65 % of values differ, relative range:   1.00e+00 to  1.00e+00
  ncfodd1:              0.14 % of values differ, relative range:   1.00e+00 to  1.00e+00
  ncfodd2:              0.03 % of values differ, relative range:   1.00e+00 to  1.00e+00
  ncfodd3:              0.00 % of values differ, relative range:   1.00e+00 to  1.00e+00
All other fields match.

but if I set Npoints = 153 in both branches, I get:

(python2) [bhillma@ghost-login2 driver (add-sunlit-test)]$ ( python ../driver/test_cosp2Imp.py ../driver/data/outputs/UKMO/cosp2_output_um.nc ${reference} )
############################################################################################
Treating relative differences less than 0.0010000000% as insignificant
All fields match

I thought this might be somewhat more desirable, just because BFB with previous commits could be confirmed, but if you prefer the alternative it's an easy change.

dustinswales commented 4 years ago

@brhillman I think setting the second half of points to night is a great idea. It illustrates that this change is isolated to the input file. Can you replace the reference output data with your output file when all 1236 points are used? So we have a new input file, a new reference output file, and a modification to the driver to use all of the 1236 points. I like this. @RobertPincus @alejandrobodas Do you have issues with this?

alejandrobodas commented 4 years ago

@brhillman @dustinswales thanks a lot for this, it looks very neat to me. I only have one question: the last 6 outputs (npdfcld to ncfodd3), do they use the sunlit mask or they differ for other reasons?

brhillman commented 4 years ago

@alejandrobodas doing a little digging, it looks like these are joint Cloudsat/MODIS diagnostics, so it would make sense that these are affected by the sunlit mask. Digging a bit further, it looks like the routine that calculates these (cosp_diag_warmrain) takes as input the MODIS liquid and ice water path, cloud optical thickness, effective radii, and cloud fraction. At the point where this routine is called, the MODIS retrievals have been masked for night columns. However, it looks like night columns are mistakenly being set to zero in this routine, because there is no check against R_UNDEF in the routine:

    !! initialize
    cfodd_ntotal(:,:,:,:)  = 0._wp
    wr_occfreq_ntotal(:,:) = 0._wp
    icod(:,:,:) = 0._wp

    do i = 1, Npoints
       !! check by MODIS retrieval
       if ( lwp(i)     .le.  CWP_THRESHOLD  .or.  &
          & liqcot(i)  .le.  COT_THRESHOLD  .or.  &
          & liqreff(i) .lt.  CFODD_BNDRE(1) .or.  &
          & liqreff(i) .gt.  CFODD_BNDRE(4) .or.  &
          & iwp(i)     .gt.  CWP_THRESHOLD  .or.  &       !! exclude ice cloud
          & icecot(i)  .gt.  COT_THRESHOLD  .or.  &       !! exclude ice cloud
          & icereff(i) .gt.  CFODD_BNDRE(1)       ) then  !! exclude ice cloud
          cycle
       else
          !! retrieve the CFODD array based on Reff
          icls = 0
          if (    liqreff(i) .ge. CFODD_BNDRE(1) .and. liqreff(i) .lt. CFODD_BNDRE(2) ) then
             icls = 1
          elseif( liqreff(i) .ge. CFODD_BNDRE(2) .and. liqreff(i) .lt. CFODD_BNDRE(3) ) then
             icls = 2
          elseif( liqreff(i) .ge. CFODD_BNDRE(3) .and. liqreff(i) .le. CFODD_BNDRE(4) ) then
             icls = 3
          endif
       endif

Since the first check is lwp(i) < CWP_THRESHOLD, for a column that is masked we would have lwp(i) = R_UNDEF = -1e30, and since -1e30 < CWP_THRESHOLD, the loop cycles and the diagnostics for index i will be left set to zero. But, it would probably be more appropriate to mask these points when the retrievals are masked. So, I think maybe this is highlighting another bug that deserves another PR to fix?

dustinswales commented 4 years ago

@brhillman Looks like adding .and. .ne. R_UNEF to the conditionals would be a good idea.

@takmichibata Do you agree? Thoughts?

brhillman commented 4 years ago

@dustinswales I think actually initializing to R_UNDEF instead of to 0 might be needed too.

EDIT: actually, that might not work, since wr_occfreq_ntotal is a running sum over subcolumns. I think rather there probably needs to be a check at the top that looks something like

do i = 1, Npoints
   if (lwp(i) == R_UNDEF) then
      wr_occfreq_ntotal(i,:) = R_UNDEF
      cfodd_ntotal(i,:,:,:) = R_UNDEF
   else

alejandrobodas commented 4 years ago

@brhillman thanks for looking into this so quickly. What this pull request is highlighting is that we probably need a more general approach to masking, a strategy that is more robust and consistently used across simulators. A topic for discussion outside this pull request...

dustinswales commented 4 years ago

@alejandrobodas That's a great idea. I will open an issue for this topic.

dustinswales commented 2 years ago

Closing this PR. (We have a global snapshot of the UM available for testing day/night masking).

Will open up issue for masking issue in cosp_diag_warmrain.

CFMIP / COSPv2.0

Set non-trivial sunlit flag in example data for better test coverage #36