ESCOMP / CTSM

Community Terrestrial Systems Model (includes the Community Land Model of CESM)
http://www.cesm.ucar.edu/models/cesm2.0/land/
Other
305 stars 309 forks source link

[CLM-only] Restart files do NOT have _FillValue/missing_value attributes on fields #88

Open ekluzek opened 6 years ago

ekluzek commented 6 years ago

Erik Kluzek < erik > - 2011-08-24 14:02:01 -0600 Bugzilla Id: 1401 Bugzilla CC: dlawren, oleson, rfisher, sacks, thoar,

Tim Hoar noticed that the restart files don't have _FillValue/missing_value attributes on fields as they should. The history files do, but restart files do NOT. Also we noticed that T_SOISNO has some values set to 0.0 while others are set to 1.e+36. Both of which are really missing values, but the code shouldn't set them to two different values.

ekluzek commented 6 years ago

Erik Kluzek < erik > - 2011-08-24 14:12:36 -0600

mkarbinit sets t_soisno to spval and then defines the non-snow layers to something for everything besides lake.

Hydrology2Mod then has the following...

! Set empty snow layers to zero

do j = -nlevsno+1,0
   do fc = 1, num_snowc
      c = filter_snowc(fc)
      if (j <= snl(c) .and. snl(c) > -nlevsno) then
         h2osoi_ice(c,j) = 0._r8
         h2osoi_liq(c,j) = 0._r8
         t_soisno(c,j) = 0._r8
         dz(c,j) = 0._r8
         z(c,j) = 0._r8
         zi(c,j-1) = 0._r8
      end if
   end do
end do

But, for t_soisno(c,j) it should set it to spval rather than zero.

ekluzek commented 6 years ago

Tim Hoar < thoar > - 2011-08-24 14:28:23 -0600

There may be more to it ... I'm looking at values for H2OSOI_LIQ and sometimes the top 5 layers are zero, sometimes spval ... H2OSOI_ICE has the same behavior:

0, // H2OSOI_LIQ(1,1602) 0, // H2OSOI_LIQ(2,1602) 0, // H2OSOI_LIQ(3,1602) 0, // H2OSOI_LIQ(4,1602) 0, // H2OSOI_LIQ(5,1602) 1, // H2OSOI_LIQ(6,1602) 0, // H2OSOI_LIQ(7,1602) 0, // H2OSOI_LIQ(8,1602) 0, // H2OSOI_LIQ(9,1602) 0, // H2OSOI_LIQ(10,1602) 0, // H2OSOI_LIQ(11,1602) 0, // H2OSOI_LIQ(12,1602) 0, // H2OSOI_LIQ(13,1602) 0, // H2OSOI_LIQ(14,1602) 0, // H2OSOI_LIQ(15,1602) 0, // H2OSOI_LIQ(16,1602) 0, // H2OSOI_LIQ(17,1602) 0, // H2OSOI_LIQ(18,1602) 0, // H2OSOI_LIQ(19,1602) 0, // H2OSOI_LIQ(20,1602) 1e+36, // H2OSOI_LIQ(1,1603) 1e+36, // H2OSOI_LIQ(2,1603) 1e+36, // H2OSOI_LIQ(3,1603) 1e+36, // H2OSOI_LIQ(4,1603) 1e+36, // H2OSOI_LIQ(5,1603) 0, // H2OSOI_LIQ(6,1603) 0, // H2OSOI_LIQ(7,1603) 0, // H2OSOI_LIQ(8,1603) 0, // H2OSOI_LIQ(9,1603) 0, // H2OSOI_LIQ(10,1603) 0, // H2OSOI_LIQ(11,1603) 0, // H2OSOI_LIQ(12,1603) 0, // H2OSOI_LIQ(13,1603) 0, // H2OSOI_LIQ(14,1603) 0, // H2OSOI_LIQ(15,1603) 0, // H2OSOI_LIQ(16,1603) 0, // H2OSOI_LIQ(17,1603) 0, // H2OSOI_LIQ(18,1603) 0, // H2OSOI_LIQ(19,1603) 0, // H2OSOI_LIQ(20,1603) 1e+36, // H2OSOI_LIQ(1,1604) 1e+36, // H2OSOI_LIQ(2,1604) 1e+36, // H2OSOI_LIQ(3,1604) 1e+36, // H2OSOI_LIQ(4,1604) 1e+36, // H2OSOI_LIQ(5,1604)

ekluzek commented 6 years ago

Erik Kluzek < erik > - 2011-08-24 14:47:41 -0600

(In reply to comment #2)

There may be more to it ... I'm looking at values for H2OSOI_LIQ and sometimes the top 5 layers are zero, sometimes spval ... H2OSOI_ICE has the same behavior:

Actually, that's a good point. T_SOISNO physically can't be zero, so it should be set to spval, but the others can be zero so I thought they should be left alone. But, really the zero is being used to mark them as missing values NOT to a literal value of zero. The other values are also initialized to spval, and then set to something in mkarbinit. So here (in Hydrology2) they should all be set to spval and _FillValue should equal spval on the file.

I also found several places where 1.e36 was hardcoded in rather than using spval...

biogeochem/DryDepVelocity.F90: crs = 1.e36_r8 biogeochem/DryDepVelocity.F90: rgsx(ispec) = 1.e36_r8 biogeochem/DryDepVelocity.F90: rclx(ispec)=1.e36_r8 biogeochem/DryDepVelocity.F90: rsmx(ispec)=1.e36_r8 biogeochem/DryDepVelocity.F90: rlux(ispec)=1.e36_r8 main/clm_varcon.F90: real(r8), public, parameter :: spval = 1.e36_r8 ! special value for real data main/domainMod.F90: domain%frac = -1.0e36 main/domainMod.F90: domain%ntop = -1.0e36 main/downscaleMod.F90: !??? MV: Note that frac_l is not set yet - so it is -1.0e36 - so first if will never be main/mkarbinitMod.F90: dz(c,-nlevsno+1: 0) = 1.e36_r8 main/mkarbinitMod.F90: z (c,-nlevsno+1: 0) = 1.e36_r8 main/mkarbinitMod.F90: zi(c,-nlevsno :-1) = 1.e36_r8

ekluzek commented 6 years ago

Erik Kluzek < erik > - 2011-08-25 13:34:38 -0600

Tim also notes some values where snow isn't where T_SOISNO is set to a negative value. Presumably, this is because it's being set to zero, and then there is an operation over it that should be checking SNLSNO that isn't...

Keith is going to try to find out where this is happening.

On Aug 25, 2011, at 11:06 AM, Keith Oleson wrote:

I checked SNLSNO in this file and it is zero for columns 6999-7004, so these values aren't being used. But I'm not sure why they would be anything other than zero or spval.

Keith

On 8/25/2011 10:41 AM, Erik Kluzek wrote:

Keith, Dave and Sam

Tim is finding negative values for T_SOISNO on his 1-degree restart file. It doesn't look like it's something masked out. I'm not sure what to think about this as obviously this is unphysical. It also seems to be for a vegetated land-unit so it has no excuse. I'm guessing this might be something going for snow that was previously masked out and maybe it corrects the next time step?

---------------------------- Original Message

Subject: Re: CLM t_soisno values in restart files ... not possible From: "Tim Hoar"<thoar@ucar.edu mailto:thoar@ucar.edu> Date: Wed, August 24, 2011 4:42 pm To: "Erik Kluzek"<erik@ucar.edu mailto:erik@ucar.edu>

/glade/scratch/yfzhang/clmcases/enstest_0818/run/enstest_0818.clm2_0006.r.2000-01-06-00000.nc

1, // cols1d_ityplun(6999) 1, // cols1d_ityplun(7000) 1, // cols1d_ityplun(7001) 1, // cols1d_ityplun(7002) 1, // cols1d_ityplun(7003) 1, // cols1d_ityplun(7004)

I believe this is a 1x1 degree case:

:Initial_conditions_dataset = "clm_I1850.clm2.r.1853-01-01-00000.nc" ;

Fei is starting with "1850"-like conditions and we were going to run a climatological case.

Thanks - sorry to be a bother -- Tim

On Aug 24, 2011, at 4:00 PM, Erik Kluzek wrote:

Tim

Can you check what land unit type columns 7000 and 7001 are? Also what resolution is this for?

On Wed, August 24, 2011 3:45 pm, Tim Hoar wrote:

-0.0218330867655131, // T_SOISNO(3,7000) -0.0220687418285371, // T_SOISNO(4,7000) -0.0457113421978009, // T_SOISNO(5,7000) 264.179362603644, // T_SOISNO(6,7000) 264.909522035041, // T_SOISNO(7,7000) 265.970675455338, // T_SOISNO(8,7000) 267.367414331807, // T_SOISNO(9,7000) 269.072014205821, // T_SOISNO(10,7000) 270.748952573642, // T_SOISNO(11,7000) 272.27801859387, // T_SOISNO(12,7000) 273.521353894311, // T_SOISNO(13,7000) 274.905387882293, // T_SOISNO(14,7000) 276.33954894197, // T_SOISNO(15,7000) 277.15213189024, // T_SOISNO(16,7000) 277.457490832681, // T_SOISNO(17,7000) 276.999992954815, // T_SOISNO(18,7000) 276.661468195319, // T_SOISNO(19,7000) 276.65198613183, // T_SOISNO(20,7000) 0, // T_SOISNO(1,7001) 0, // T_SOISNO(2,7001) 0, // T_SOISNO(3,7001) -0.0694586765806962, // T_SOISNO(4,7001) -0.123480472298186, // T_SOISNO(5,7001)

The units are degrees kelvin ... negative values are not good.

ekluzek commented 6 years ago

Keith Oleson < oleson > - 2011-08-25 16:44:35 -0600

I was able to replicate YongFei's case bfb except that I did not get negative values for T_SOISNO in the restart files. YongFei now reports that her negative values are probably due to the assimilation procedure.

Regardless, I think that the variables mentioned by Erik below could probably be set to spval instead of zero. I did a quick test where I set them to spval in Hydrology2 and got bfb answers. This is very limited testing of course. Zeros can still appear in the restart file however if the initial file used for the simulation has zeros and the snow layers don't change state during the simulation.

ekluzek commented 6 years ago

Erik Kluzek < erik > - 2011-09-21 12:12:20 -0600

I tried setting fill_value=nan for several fields on the restart file that are initialized to nan. This worked except it blew up in DEBUG mode. So I took it back out. So either fields that will be on the restart file need to be initialized to spval, or I need to convert them from nan to spval when writing and from spval to nan when reading. Consistently initializing fields to nan and only changing them to spval for I/O would be a good thing.

ekluzek commented 6 years ago

Erik Kluzek < erik > - 2011-09-27 12:04:32 -0600

Fields that Tim would like us to especially pay attention to...

Erik, Dave;

As soon as these variables have a _FillValue (?and a missing_value?) attribute, I'm happy to check the range of values to see if there are (perhaps) unanticipated uses of other fill values. I'll be working on Matlab-based diagnostic scripts, so I will be doing this in parallel to Erik's effort.

To all on the 'cc' list;

the CLM restart files are easier to ingest into DART if they have a _FillValue specified. If you think you want particular variables to be part of the DART state vector - speak up now. [Andy, Bill - a' priori knowledge of the valid_range might be enough. I don't think we're going to get CLM to add _FillValue and (perhaps) valid_range attributes to the 5kajillion carbon species.]

Here is my list of 11 CLM variables that could use netcdf attributes of '_FillValue'. Yongfei/Bill/Andy/Ave -- please voice your opinions/add to the high-priority list below:

    double frac_sno(column) ;
            frac_sno:long_name = "fraction of ground covered by snow (0 to 1)" ;
            frac_sno:units =  ;
    double DZSNO(column, levsno) ;
            DZSNO:long_name = "snow layer thickness" ;
            DZSNO:units = "m" ;
    double ZSNO(column, levsno) ;
            ZSNO:long_name = "snow layer depth" ;
            ZSNO:units = "m" ;
    double ZISNO(column, levsno) ;
            ZISNO:long_name = "snow interface depth" ;
            ZISNO:units = "m" ;

    double T_SOISNO(column, levtot) ;
            T_SOISNO:long_name = "soil-snow temperature" ;
            T_SOISNO:units = "K" ;
    double T_LAKE(column, levlak) ;
            T_LAKE:long_name = "lake temperature" ;
            T_LAKE:units = "K" ;

    double H2OSNO(column) ;
            H2OSNO:long_name = "snow water" ;
            H2OSNO:units = "kg/m2" ;
    double H2OSOI_LIQ(column, levtot) ;
            H2OSOI_LIQ:long_name = "liquid water" ;
            H2OSOI_LIQ:units = "kg/m2" ;
    double H2OSOI_ICE(column, levtot) ;
            H2OSOI_ICE:long_name = "ice lens" ;
            H2OSOI_ICE:units = "kg/m2" ;
    double T_GRND(column) ;
            T_GRND:long_name = "ground temperature" ;
            T_GRND:units = "K" ;

    double T_REF2M(pft) ;
            T_REF2M:long_name = "2m height surface air temperature" ;
            T_REF2M:units = "K" ;

Tim Hoar, Associate Scientist National Center for Atmospheric Research thoar@ucar.edu 303 497 1708

ekluzek commented 6 years ago

Erik Kluzek < erik > - 2011-09-27 12:05:28 -0600

This was marked as fixed and shouldn't have been...

ekluzek commented 6 years ago

Erik Kluzek < erik > - 2011-09-27 14:59:16 -0600

Of the fields in the list only: T_SOISNO, H2OSOI_LIQ/ICE are set to spval everything else is set to nan.

ekluzek commented 6 years ago

Tim Hoar < thoar > - 2011-09-28 10:55:00 -0600

(In reply to comment #9)

Of the fields in the list only: T_SOISNO, H2OSOI_LIQ/ICE are set to spval everything else is set to nan.

OK then ... I need to be able to replace nan with my DART missing code, ... do what I need to do ... and then replace the DART missing code with nan to keep things coordinated. How do I read/replace nan values?

ekluzek commented 6 years ago

Erik Kluzek < erik > - 2012-01-23 10:01:40 -0700

I'm starting to work on this in clm4_0_38. I checked in code to handle it and it works on bluefire, but fails with intel. So I'm going to back it out until I can get a solution that works on all platforms.

I am leaving in an option to ncdio_pio

cnvrtnan2fill

that will convert any spval read to nan, and convert any nan on write to spval so that FillValue functions correctly.

Here's the code changes to get back to the previous version. Without the changes below, the convrtnan2fill option is never used.

[yong:lnd/clm/src] erik% svn diff
Index: main/accumulMod.F90
===================================================================
--- main/accumulMod.F90 (revision 33981)
+++ main/accumulMod.F90 (working copy)
@@ -27,7 +27,6 @@
   use abortutils  , only: endrun
   use clm_varctl  , only: iulog
   use nanMod      , only: bigint
-  use shr_infnan_mod, only: shr_infnan_isnan
 !
 ! !PUBLIC TYPES:
   implicit none
@@ -425,7 +424,7 @@
 ! !IROUTINE: update_accum_field_sl
 !
 ! !INTERFACE:
-  subroutine update_accum_field_sl (name, field, nstep, filternan)
+  subroutine update_accum_field_sl (name, field, nstep)
 !
 ! !DESCRIPTION:
 ! Accumulate single level field over specified time interval.
@@ -436,7 +435,6 @@
     character(len=*), intent(in) :: name     !field name
     real(r8), pointer, dimension(:) :: field !field values for current time step
     integer , intent(in) :: nstep            !time step index
-    logical , intent(in), optional :: filternan ! filter out any NaN's in the input field
 !
 ! !REVISION HISTORY:
 ! Created by Sam Levis
@@ -449,8 +447,6 @@
     integer :: i,k,nf              !indices
     integer :: accper              !temporary accumulation period
     integer :: beg,end             !subgrid beginning,ending indices
-    integer :: filter(sizeof(field)) !filter of 
-    integer :: nfilter             ! number of points to filter
 !------------------------------------------------------------------------

     ! find field index. return if "name" is not on list
@@ -475,23 +471,6 @@
        call endrun
     endif

-    ! Get filter
-    if ( filternan ) then
-       nfilter = 0
-       do i = beg, end
-          if ( .not. shr_infnan_isnan( field(i) ) )then
-             nfilter         = nfilter + 1
-             filter(nfilter) = i
-          end if
-       end do
-    else
-       nfilter = 0
-       do i = beg, end
-         nfilter         = nfilter + 1
-         filter(nfilter) = i
-       end do
-    end if
-
     ! accumulate field

     if (accum(nf)%acctype /= 'timeavg' .AND. &
@@ -515,11 +494,11 @@
        !normalize at end of accumulation period

        if ((mod(nstep,accum(nf)%period) == 1) .and. (nstep /= 0)) then
-          accum(nf)%val(filter(:nfilter),1) = 0._r8
+          accum(nf)%val(beg:end,1) = 0._r8
        end if
-       accum(nf)%val(filter(:nfilter),1) =  accum(nf)%val(filter(:nfilter),1) + field(filter(:nfilter))
+       accum(nf)%val(beg:end,1) =  accum(nf)%val(beg:end,1) + field(beg:end)
        if (mod(nstep,accum(nf)%period) == 0) then
-          accum(nf)%val(filter(:nfilter),1) = accum(nf)%val(filter(:nfilter),1) / accum(nf)%period
+          accum(nf)%val(beg:end,1) = accum(nf)%val(beg:end,1) / accum(nf)%period
        endif

     else if (accum(nf)%acctype == 'runmean') then
@@ -527,19 +506,18 @@
        !running mean - reset accumulation period until greater than nstep

        accper = min (nstep,accum(nf)%period)
-       accum(nf)%val(filter(:nfilter),1) = ((accper-1)*accum(nf)%val(filter(:nfilter),1) + field(filter(:nfilter))) / accper
+       accum(nf)%val(beg:end,1) = ((accper-1)*accum(nf)%val(beg:end,1) + field(beg:end)) / accper

     else if (accum(nf)%acctype == 'runaccum') then

        !running accumulation field reset at trigger -99999

-       do i = 1, nfilter
-          k = filter(i)
+       do k = beg,end
           if (nint(field(k)) == -99999) then
              accum(nf)%val(k,1) = 0._r8
           end if
        end do
-       accum(nf)%val(filter(:nfilter),1) = min(max(accum(nf)%val(filter(:nfilter),1) + field(filter(:nfilter)), 0._r8), 99999._r8)
+       accum(nf)%val(beg:end,1) = min(max(accum(nf)%val(beg:end,1) + field(beg:end), 0._r8), 99999._r8)

     end if

Index: main/accFldsMod.F90
===================================================================
--- main/accFldsMod.F90 (revision 33981)
+++ main/accFldsMod.F90 (working copy)
@@ -578,9 +578,9 @@
     do p = begp,endp
        rbufslp(p) = fsun(p)
     end do
-    call update_accum_field  ('FSUN24', rbufslp, nstep,  filternan=.true.)
-    call extract_accum_field ('FSUN24', fsun24, nstep )
-    call update_accum_field  ('FSUN240', rbufslp, nstep, filternan=.true.)
+    call update_accum_field  ('FSUN24', rbufslp, nstep)
+    call extract_accum_field ('FSUN24', fsun24, nstep)
+    call update_accum_field  ('FSUN240', rbufslp, nstep)
     call extract_accum_field ('FSUN240', fsun240, nstep)

     ! Accumulate and extract elai_p (heald 04/06)
Index: biogeophys/BiogeophysRestMod.F90
===================================================================
--- biogeophys/BiogeophysRestMod.F90    (revision 33981)
+++ biogeophys/BiogeophysRestMod.F90    (working copy)
@@ -59,6 +59,7 @@
     use initSurfAlbMod  , only : do_initsurfalb
     use clm_time_manager, only : is_first_step
     use SNICARMod       , only : snw_rds_min
+    use shr_infnan_mod  , only : shr_infnan_isnan
     use mkarbinitMod    , only : perturbIC
     use clm_time_manager, only : is_restart
 !
@@ -345,12 +346,12 @@

     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='frac_sno', xtype=ncd_double,  &
-            dim1name='column', fill_value=spval, &
+            dim1name='column',&
             long_name='fraction of ground covered by snow (0 to 1)',units='unitless')
     else if (flag == 'read' .or. flag == 'write') then
        call ncd_io(varname='frac_sno', data=cptr%cps%frac_sno, &
             dim1name=namec, &
-            ncid=ncid, flag=flag, readvar=readvar, cnvrtnan2fill=.true.) 
+            ncid=ncid, flag=flag, readvar=readvar) 
        if (flag == 'read' .and. .not. readvar) then
           if (is_restart()) call endrun()
        end if
@@ -361,14 +362,14 @@
     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='DZSNO', xtype=ncd_double,  &
             dim1name='column', dim2name='levsno', switchdim=.true., &
-            long_name='snow layer thickness', units='m', fill_value=spval)
+            long_name='snow layer thickness', units='m')
     else if (flag == 'read' .or. flag == 'write') then
        allocate(temp2d(begc:endc,-nlevsno+1:0))
        if (flag == 'write') then 
           temp2d(begc:endc,-nlevsno+1:0) = cptr%cps%dz(begc:endc,-nlevsno+1:0)
        end if
        call ncd_io(varname='DZSNO', data=temp2d, &
-            dim1name=namec, switchdim=.true., cnvrtnan2fill=.true., &
+            dim1name=namec, switchdim=.true., &
             lowerb2=-nlevsno+1, upperb2=0, ncid=ncid, flag=flag, readvar=readvar)
        if (flag == 'read' .and. .not. readvar) then
           if (is_restart()) call endrun()
@@ -384,14 +385,14 @@
     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='ZSNO', xtype=ncd_double,  &
             dim1name='column', dim2name='levsno', switchdim=.true., &
-            long_name='snow layer depth', units='m', fill_value=spval)
+            long_name='snow layer depth', units='m')
     else if (flag == 'read' .or. flag == 'write') then
        allocate(temp2d(begc:endc,-nlevsno+1:0))
        if (flag == 'write') then 
           temp2d(begc:endc,-nlevsno+1:0) = cptr%cps%z(begc:endc,-nlevsno+1:0)
        end if
        call ncd_io(varname='ZSNO', data=temp2d, &
-            dim1name=namec, switchdim=.true., cnvrtnan2fill=.true., &
+            dim1name=namec, switchdim=.true., &
             lowerb2=-nlevsno+1, upperb2=0, ncid=ncid, flag=flag, readvar=readvar)
        if (flag == 'read' .and. .not. readvar) then
           if (is_restart()) call endrun()
@@ -407,14 +408,14 @@
     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='ZISNO', xtype=ncd_double,  &
             dim1name='column', dim2name='levsno', switchdim=.true., &
-            long_name='snow interface depth', units='m', fill_value=spval)
+            long_name='snow interface depth', units='m')
     else if (flag == 'read' .or. flag == 'write') then
        allocate(temp2d(begc:endc,-nlevsno:-1))
        if (flag == 'write') then 
           temp2d(begc:endc,-nlevsno:-1) = cptr%cps%zi(begc:endc,-nlevsno:-1)
        end if
        call ncd_io(varname='ZISNO', data=temp2d, &
-            dim1name=namec, switchdim=.true., cnvrtnan2fill=.true., &
+            dim1name=namec, switchdim=.true., &
             lowerb2=-nlevsno, upperb2=-1, ncid=ncid, flag=flag, readvar=readvar)
        if (flag == 'read' .and. .not. readvar) then
           if (is_restart()) call endrun()
@@ -928,13 +929,13 @@
     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='H2OSOI_LIQ', xtype=ncd_double,  &
             dim1name='column', dim2name='levtot', switchdim=.true., &
-            long_name='liquid water', units='kg/m2', fill_value=spval)
+            long_name='liquid water', units='kg/m2')
     else if (flag == 'read' .or. flag == 'write') then
        call ncd_io(varname='H2OSOI_LIQ', data=cptr%cws%h2osoi_liq, &
-           dim1name=namec, switchdim=.true., ncid=ncid, flag=flag, readvar=readvar)
-      if (flag=='read' .and. .not. readvar) then
-         if (is_restart()) call endrun()
-      end if
+            dim1name=namec, switchdim=.true., ncid=ncid, flag=flag, readvar=readvar)
+       if (flag=='read' .and. .not. readvar) then
+          if (is_restart()) call endrun()
+       end if
     end if

     ! column water state variable - h2osoi_ice
@@ -942,8 +943,7 @@
     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='H2OSOI_ICE', xtype=ncd_double,   &
             dim1name='column', dim2name='levtot', switchdim=.true., &
-            long_name='ice lens', units='kg/m2', fill_value=spval)
-
+            long_name='ice lens', units='kg/m2')
     else if (flag == 'read' .or. flag == 'write') then
        call ncd_io(varname='H2OSOI_ICE', data=cptr%cws%h2osoi_ice, &
             dim1name=namec, switchdim=.true., ncid=ncid, flag=flag, readvar=readvar)
@@ -956,7 +956,7 @@

     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='T_GRND', xtype=ncd_double,  &
-            dim1name='column', fill_value=spval, &
+            dim1name='column', &
             long_name='ground temperature', units='K')
     else if (flag == 'read' .or. flag == 'write') then
        call ncd_io(varname='T_GRND', data=cptr%ces%t_grnd, &
@@ -1242,7 +1242,7 @@
     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='T_SOISNO', xtype=ncd_double,   &
             dim1name='column', dim2name='levtot', switchdim=.true., &
-            long_name='soil-snow temperature', units='K', fill_value=spval)
+            long_name='soil-snow temperature', units='K')
     else if (flag == 'read' .or. flag == 'write') then
        call ncd_io(varname='T_SOISNO', data=cptr%ces%t_soisno, &
             dim1name=namec, switchdim=.true., ncid=ncid, flag=flag, readvar=readvar)
@@ -1256,7 +1256,7 @@
     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='T_LAKE', xtype=ncd_double,  &
             dim1name='column', dim2name='levlak', switchdim=.true., &
-            long_name='lake temperature', units='K', fill_value=spval)
+            long_name='lake temperature', units='K')
     else if (flag == 'read' .or. flag == 'write') then
        call ncd_io(varname='T_LAKE', data=cptr%ces%t_lake, &
             dim1name=namec, switchdim=.true., ncid=ncid, flag=flag, readvar=readvar)
@@ -1372,15 +1372,21 @@

     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname='fsun', xtype=ncd_double,  &
-            dim1name='pft', fill_value=spval, &
+            dim1name='pft', &
             long_name='sunlit fraction of canopy', units='')
     else if (flag == 'read' .or. flag == 'write') then
        call ncd_io(varname='fsun', data=pptr%pps%fsun, &
             dim1name=namep, &
-            ncid=ncid, flag=flag, readvar=readvar, cnvrtnan2fill=.true. )
+            ncid=ncid, flag=flag, readvar=readvar)
        if (flag=='read' )then
           if ( .not. readvar) then
              if (is_restart()) call endrun()
+          else
+             do p = begp, endp
+                if ( shr_infnan_isnan( pptr%pps%fsun(p) ) )then
+                   pptr%pps%fsun(p) = spval
+                end if
+             end do
           end if
        end if
     end if
@@ -1505,7 +1511,7 @@
     varname = 'T_REF2M'
     if (flag == 'define') then
        call ncd_defvar(ncid=ncid, varname=varname, xtype=ncd_double,  &
-            dim1name='pft', fill_value=spval, &
+            dim1name='pft', &
             long_name='2m height surface air temperature', units='K')
     else if (flag == 'read' .or. flag == 'write') then
        call ncd_io(varname=varname, data=pptr%pes%t_ref2m, &
ekluzek commented 6 years ago

Erik Kluzek < erik > - 2012-01-23 10:16:03 -0700

Here's the error on mirage...

(seq_mct_drv) : Model initialization complete

forrtl: severe (174): SIGSEGV, segmentation fault occurred Image PC Routine Line Source
clm 00000000009B3391 accumulmod_mp_upd 479 accumulMod.F90 clm 00000000009A5F4A accfldsmod_mp_upd 473 accFldsMod.F90 clm 000000000053FBE0 clm_driver_mp_clm 756 clm_driver.F90 clm 00000000004F1EDF lnd_comp_mct_mp_l 698 lnd_comp_mct.F90 clm 0000000000422B00 ccsm_comp_modmp 2521 ccsm_comp_mod.F90 clm 000000000042F078 MAIN__ 91 ccsm_driver.F90 clm 0000000000403A82 Unknown Unknown Unknown libc.so.6 0000003DB781D994 Unknown Unknown Unknown clm 00000000004039A9 Unknown Unknown Unknown

This is the update of TREVAV, in assessing accum(nf)%acctype...

if (accum(nf)%acctype /= 'timeavg' .AND. &
    accum(nf)%acctype /= 'runmean' .AND. &
    accum(nf)%acctype /= 'runaccum') then
   write(iulog,*) 'UPDATE_ACCUM_FIELD_SL error: incorrect accumulation type'
   write(iulog,*) ' was specified for field ',name
   write(iulog,*)' accumulation type specified is ',accum(nf)%acctype
   write(iulog,*)' only [timeavg, runmean, runaccum] are currently acceptable'
   call endrun()
end if
slevis-lmwg commented 3 years ago

I was about to open a new issue with title T_SOISNO=0 in finidat gets handled by init_interp as valid T_SOI but found this relevant discussion.

Here's a summary of the issue that I was about to post: My WRF-CTSM(CLMSP) simulation shows a few dozen TSOI near absolute zero in the first few model hours. (These appear to follow lake shorelines.) The finidat file in this simulation contains a combination of valid T_SOISNO values and zeros and NaNs, the zeros and NaNs presumably representing missing data. A simple replacement of all T_SOISNO=0K with T_SOISNO=283K or T_SOISNO=NaN or T_SOISNO=1e36 in finidat results in the near-0K TSOI values now appearing with reasonable values. The solution that I'm considering is for init_interp to treat T_SOISNO = 0K the same as NaNs or 1e36, i.e. as missing data.

Suggestions? Ideas? Comments?

billsacks commented 3 years ago

@slevisconsulting Thank you for your continued careful work on this, which is leading to the discovery of potential issues.

I can't tell from your comment: are the 0 values actually causing problems, or are they possibly in unused points (inactive columns or non-existent soil / snow layers)?

Before deciding on a solution to the problem, I think it would help to better understand the cause. Is this happening in snow layers or soil layers? Over what landunit/column type(s)? And what aspect of the interpolation – horizontal or vertical – is creating these 0 values?

It looks like t_soisno is initialized to spval everywhere in this code:

https://github.com/ESCOMP/CTSM/blob/ff1ae2c23e4b4ead376d41ab950eaca28c91ba7c/src/biogeophys/TemperatureType.F90#L685-L688

The only place where I see an explicit zeroing of t_soisno is here:

https://github.com/ESCOMP/CTSM/blob/ff1ae2c23e4b4ead376d41ab950eaca28c91ba7c/src/biogeophys/SnowHydrologyMod.F90#L2892

So is it just that these zeroed snow layers get copied into the output file? If so, have you checked whether the 0 values you're seeing are only in non-existent snow layers (based on snlsno in the given column)? I believe that it is expected that these 0 values would make their way into the interpolated finidat file, but they should only appear in non-existent snow layers. If you're seeing something different, we should talk more.

If this does turn out to be a problem, then putting variable-specific logic in init_interp should be a last resort solution: Mariana and I worked hard to remove this variable-specific logic from init_interp a few years ago because it was hard to maintain and led to bugs. One solution might be to find where t_soisno is set to 0 in the code (maybe just the one above line) and set it to spval there instead, if that would solve the problem.

slevis-lmwg commented 3 years ago

The near-0K TSOI affects the calculated FSH, so I think we have an actual problem:

FSH_day2

The near-0K TSOI values appear down to the deepest soil layer before we get into the bedrock. Line 2892 that you referenced is in the loop do j = -nlevsno+1,0 so it shouldn't affect the soil.

Replacing all T_SOISNO=0 with 283 or NaN or 1e36 directly in the finidat file fixed the problem, so I think that init_interp receives 0K and treats it as a valid value. To avoid variable-specific logic, how about unit-specific logic that applies to degrees K?

slevis-lmwg commented 3 years ago

@billsacks' comment during this morning's mtg suggests to me that the TSOISNO=0 values end up in the WRF-CTSM(CLMSP) finidat file after the manipulation with the .ncl script (here: /glade/work/slevis/git_wrf/ctsminit) by which we introduce WRF initial conditions to finidat. Looking at the .ncl script tells me that we do not explicitly and intentionally initialize T_SOISNO=0 (except for snow layers), so I suspect that T_SOISNO=0 enters via WRF's TSLB variable.

For now it is clear that I may proceed with WRF-CTSM(CLMSP) simulations once I ensure that T_SOISNO=0 is replaced with NaNs in my finidat. Whether init_interp's handling of 0K should be changed has not been resolved.

billsacks commented 3 years ago

I checked 7 of our out-of-the-box initial conditions files with this python code (depends on numpy and Netcdf4, both available on cheyenne if you load ncar_pylib):

def print_stats(filename):
    dat = Dataset(filename)
    tsoi = dat.variables['T_SOISNO'][:]
    levsno = dat.dimensions['levsno'].size
    print('levsno = {}'.format(levsno))
    print('number of points with tsoi == 0: {}'.format(np.sum(tsoi == 0)))
    print('number of snow points: {}'.format(np.sum(tsoi[:,:levsno] == 0)))
    print('number of soil points: {}'.format(np.sum(tsoi[:,levsno:] == 0)))
    print('min T in soil: {}'.format(np.min(tsoi[:,levsno:])))

All files had a lot of 0 values over snow layers (as expected), but none of them had any 0 values in any soil layers. It would be worth double-checking this for the original source file you started with before merging in the WRF values. As long as you see the same thing, then I think the culprit is the WRF variable, and maybe the solution is to fix this in the ncl script or as a post-processing step on this merged file, rather than changing any CTSM code to handle this scenario.

If you do add any CTSM code, what I'd suggest is adding an error check in the restart routine in TemperatureType that ensures that there are no super-cold temperature values after reading the initial conditions file, and aborting if there are. The minimum values on our out-of-the-box files are about 210 K, but probably a check for any values <= 0 would catch any common issues (with 0 values or unit problems). To be clear: I don't feel this is necessary, but I'd be happy with this error checking being added if you feel it would help catch future errors early.

slevis-lmwg commented 3 years ago

Thanks @billsacks . I do not need to modify CTSM code for this. The group can decide whether to add the error check to avoid this pitfall.