CliMA / ClimateMachine.jl

Climate Machine: an Earth System Model that automatically learns from data
https://clima.github.io/ClimateMachine.jl/latest/
Other
450 stars 78 forks source link

Sample diagnostics we will need #214

Open tapios opened 5 years ago

tapios commented 5 years ago

Both for GCM and LES configurations of the model, we need to be able to accumulate various statistics (time aggregates) at runtime. For each of these, we should save averages over time windows (e.g., months). Here's a small subsample of what we will need, to get an idea about what we need to be able to extract:

Two-dimensional fields (dependent on horizontal coordinates)

Three-dimensional fields (dependent on horizontal and vertical coordinates)

Higher-order statistics (horizontal)

simonbyrne commented 4 years ago

From @kpamnany:

For DYCOMS. Some questions embedded in this tentative understanding of the state variables. Please clarify if my understanding has diverged from the truth! ;)

Q is an MPIStateArray that lives on the GPU. A reference to the initialized Q is returned from init_ode_state(). To gather the data we need for diagnostics, we will add a callback as:

  cbstats = GenericCallbacks.EveryXSimulationSteps(1500) do (init=false)    # gather necessary data   end

Here, Q is visible. We can use code as below to get access to the state on the host:

  host_array = Array ∈ typeof(Q).parameters   h_Q = host_array ? Q.Q : Q

I don't think this is quite right. Q.Q is either an Array or a CuArray, but also includes the ghost elements. I would suggest using Q.realQ which is a view of the non-ghost elements, so:

h_Q = Array(Q.realQ)

should work.

h_Q is a 3-dimensional array. The first dimension, called ijk, ranges over the number of nodes in each element (?). When:

  N = polynomialorder(grid)   Nq = N + 1   Nqk = dimensionality(grid) == 2 ? 1 : Nq

The first dimension is Nq*Nq*Nqk (?)

Yes, that's correct. Note that the convention for iteration is

https://github.com/climate-machine/CLIMA/blob/72d6a1b2e5777a7e33134145f196e8ab2d0a0ae4/src/DGmethods/DGmodel_kernels.jl#L81-L84

The second dimension ranges over the actual state variables, of which there are num_state(bl, DFloat). The first one is \rho. What are the others? How do we reference them by name?

This requires information from the AtmosModel object. flattenednames(vars_state(atmos, Float64)) is probably the simplest option to use for now.

The third dimension ranges over the elements of the grid handled by this rank. This includes all elements in the z plane, and possibly more than one such column of elements. How many?

I think you may need access to the topology (which is stored in the DGModel object). See here: https://github.com/climate-machine/CLIMA/blob/72d6a1b2e5777a7e33134145f196e8ab2d0a0ae4/src/DGmethods/DGmodel.jl#L158-L160

which are iterated as https://github.com/climate-machine/CLIMA/blob/72d6a1b2e5777a7e33134145f196e8ab2d0a0ae4/src/DGmethods/DGmodel_kernels.jl#L742-L743

If we want to sum all the \rhos, we can, on each node, iterate over Q[ijk,1,e] and then perform an Allreduce to get a total sum. However, if we only want the \rhos on a particular xy plane? If we want to gather state variables for a particular geographic grid?

I think you will need to do something like the following:

S = zeros(Nqk, nvertelem)
for eh in 1:nhorzelem
  for ev in 1:nvertelem
    e = ev + (eh - 1) * nvertelem

    for k in 1:Nqk
      for j in 1:Nq
        for i in 1:Nq
          ijk = i + Nq * ((j-1) + Nq * (k-1)) 
          S[k,ev] += h_Q[ijk, 1, e]
        end
      end
    end
  end
end
MPI.Allreduce(...)
kpamnany commented 4 years ago

The third dimension ranges over the elements of the grid handled by this rank. This includes all elements in the z plane, and possibly more than one such column of elements. How many?

I think you may need access to the topology (which is stored in the DGModel object). See here: https://github.com/climate-machine/CLIMA/blob/72d6a1b2e5777a7e33134145f196e8ab2d0a0ae4/src/DGmethods/DGmodel.jl#L158-L160

Aren't there ghost elements as well? Because I see: https://github.com/climate-machine/CLIMA/blob/72d6a1b2e5777a7e33134145f196e8ab2d0a0ae4/src/DGmethods/DGmodel.jl#L29

nrealelem is used as the block size for launching GPU kernels and nowhere else; when iterating over elements, nvertelem is fixed by the topology and nhorzelem is computed as nelem/nvertelem. Can you confirm that this is what we should use?

Edit: from inspection in DYCOMS, nelem == nrealelem == 1900.

Also:

I think you will need to do something like the following:

S = zeros(Nqk, nvertelem)
for eh in 1:nhorzelem
  for ev in 1:nvertelem
    e = ev + (eh - 1) * nvertelem

    for k in 1:Nqk
      for j in 1:Nq
        for i in 1:Nq
          ijk = i + Nq * ((j-1) + Nq * (k-1)) 
          S[k,ev] += h_Q[ijk, 1, e]
        end
      end
    end
  end
end
MPI.Allreduce(...)

Can I not just do the following?

tot = sum(h_Q[:,1,:])
MPI.Allreduce(..., MPI_SUM, ...)

If I want to make this general and more intuitive, I can:

localQ = MArray{Tuple{nstate}, DFloat}(undef)
for e = 1:nelem # or nrealelem?
    for i = 1:Nq*Nq*Nqk
        for s = 1:nstate
            l_Q[s] = h_Q[i, s, e]
        end
    end
    compute_diag(Vars{vars_state(bl, DFloat)}(l_Q)
end

Very roughly. And then, compute_diag() can work on the state variables by name. Correct?

simonbyrne commented 4 years ago

nrealelem is used as the block size for launching GPU kernels and nowhere else; when iterating over elements, nvertelem is fixed by the topology and nhorzelem is computed as nelem/nvertelem. Can you confirm that this is what we should use?

You're right it should be

nhorzelem = div(nrealelem, nvertelem)

Can I not just do the following?

tot = sum(h_Q[:,1,:])
MPI.Allreduce(..., MPI_SUM, ...)

This will compute the total sum: I thought you wanted the sum per height?

kpamnany commented 4 years ago

I think for \rho, the total sum is what's needed. But the code snippet you've provided will compute the sum on all the vertical elements? That will be helpful.

simonbyrne commented 4 years ago

yes, exactly.

kpamnany commented 4 years ago

At this point, https://github.com/climate-machine/CLIMA/pull/508 provides an infrastructure that allows for gathering diagnostics in an efficient (for LES) and reasonably succinct fashion. Further evolution will occur when we can run a GCM, and as more diagnostics are implemented. In particular:

Desired diagnostics

(the following list will be updated as we go)

Two-dimensional fields (dependent on horizontal coordinates)

Three-dimensional fields (dependent on horizontal and vertical coordinates) (implemented for each horizontal slice)

Higher-order statistics (horizontal)

tapios commented 4 years ago

Long-Run Goals

As the diagnostics layer develops, it may help to keep in mind some long-run goals for how we want this to operate:

charleskawczynski commented 4 years ago

I'd like to propose that we have an automated way to export the budget (all source/sink/tendency terms) of the equations.

szy21 commented 4 years ago

I'd like to propose some more diagnostics.

Both for GCM and LES configurations of the model, we need to be able to accumulate various statistics (time aggregates) at runtime. For each of these, we should save averages over time windows (e.g., months). Here's a small subsample of what we will need, to get an idea about what we need to be able to extract:

Two-dimensional fields (dependent on horizontal coordinates)

  • Top of atmosphere (TOA) downwelling shortwave flux
  • TOA Upwelling shortwave flux (all sky and clear sky)
  • TOA Upwelling longwave flux (all sky and clear sky)
  • Up- and downwelling shortwave flux at surface (all sky and clear sky)
  • Up- and downwelling longwave flux at surface (all sky and clear sky)
  • Sensible heat flux at surface
  • Latent heat flux at surface
  • Surface air temperature
  • Rain rate at surface
  • Snow rate at surface
  • Cloud fraction (definition required)
  • (Later: sea ice cover, leaf temperature, soil temperature, ...)

Three-dimensional fields (dependent on horizontal and vertical coordinates)

  • Air temperature Ta
  • Total specific humidity q_t
  • Liquid specific humidity
  • Ice specific humidity
  • Relative humidity
  • Cloud fraction
  • Virtual potential temperature
  • Energy E
  • 3 velocity components
  • Various fluxes (covariances), such as vertical buoyancy flux (w b, where b is proportional to the virtual potential temperature), horizontal energy flux (u E and v E) etc.
  • Various variances, such as q_t^2, kinetic energy components u^2, v^2, w^2
  • Some covariances, such as CF Ta_s, where CF is cloud fraction and Ta_s is surface air temperature
  • Budget terms (separated advection and source terms)

Higher-order statistics (horizontal)

  • Frequency with which a given rain rate threshold at the surface is exceeded
  • Third moment (~skewness) of vertical velocity w^3
yairchn commented 4 years ago

Useful LES diagnostics for comparison / calibration of parameterizations:

  1. Profiles (t,z) of horizontal mean of first, second and third powers of all prognostic variables from the model (used to compute first, second and third moments). Horizontal mean of w times thermodynamic variables, i.e. horizontal_mean(wqt). Horizontal mean of the product of thermodynamic variables themselves, i.e. horizontal_mean(Eqt).

  2. Profiles (t,z) of cloud and core conditional statistics: Horizontal mean of the above variables and variable products conditioned on a flag. For cloud variable condition the point-wise ql >0.
    For core variables condition the flag on the point-wise (ql>0 and w>0).

  3. Profiles (t,z) horizontal mean SGS fluxes: SGS fluxes of prognostic variables (i.e. from the numerical diffusion) - low priority I think

  4. Profiles (t,z) TKE and its components.

  5. Profiles (t,z) precipitation sink of qt: simply take the horizontal mean of the qt tendency due to precipitation at each level

  6. Time series (t):

    • Cloud cover: fraction of columns that has ql>0 at some z level from the total number of columns in the domain
    • Liquid Water Path (LWP)
    • Rain Water Path (RWP) - same as LWP but for rain.
    • Rain rates
    • Latent and sensible surface heat fluxes
    • Cloud top: last z level that has ql>0
    • Cloud base: first z level that has ql>0

Links in PyCLES: Cloud top and base

LWP

cloud_cover * note that it is called cloud fraction here by mistake

cloud_fraction * it is called cloud fraction profile here to distinguish from cloud fraction

TKE and components

akshaysridhar commented 4 years ago

@yairchn @tapios @charleskawczynski @szy21 Here is the list of currently available diagnostics from the master branch https://gist.github.com/akshaysridhar/cc7888d27a93237eee4f8373630a0d6a