cf-convention / vocabularies

Issues and source files for CF controlled vocabularies

Standard names: Propose new additions to the standard names table (MODIS output from COSP simulator) #52

Open brandonduran opened 2 months ago

brandonduran commented 2 months ago

CF-Convention Discussion Proposal: Proposer's name: Brandon Duran Date: 25-6-2024

I would like to propose standard names for MODIS output from the COSP satellite simulator package, including new joint histograms of cloud droplet effective radius (CER) and cloud water path (CWP). These diagnostics are similar to those from the ISCCP satellite simulator, but feature distinction by cloud thermodynamic phase. All of the proposed names below are for joint histograms summarizing the co-variability of different cloud properties (cloud top pressure, cloud optical depth, cloud water path, cloud droplet effective radius).

Naming is guided by conventions for the ISCCP satellite simulator; specifically, clisccp, which is a 7x7 (cloud top pressure x optical depth) matrix. As such, I propose that all MODIS cloud top pressure by optical depth histograms follow this naming, with the base of clmodis and any additional modifiers. To distinguish the new CER-CWP histograms, I propose the modifier ‘cwpr’, such that the base 6x7 (CER x CWP) joint histogram is named clmodis_cwpr.

The MODIS optical depth (tau) bounds are as follows: 0-0.3, 0.3-1.3, 1.3-3.6, 3.6-9.4, 9.4-23, 23-60, >60. The MODIS cloud top pressure (CTP) bounds are as follows [hPa]: 800 and higher, 800-680, 680-560, 560-440, 440-310, 310-180, 180-0. CTP and tau bounds match bounds from the ISCCP clisccp output.

The MODIS cloud liquid water path (LWP) bounds are as follows [g/m2]: 0-10, 10-30, 30-60, 60-100, 100-150, 150-250, >250. The MODIS cloud ice water path (IWP) bounds are as follows [g/m2]: 0-20, 20-50, 50-100, 100-200, 200-400, 400-1000, >1000. The MODIS liquid cloud droplet effective radius (CER) bounds are as follows [𝜇m]: 4-8, 8-10, 10-12.5, 12.5-15, 15-20, >20. The MODIS ice cloud ice-crystal effective radius bounds are as follows [𝜇m]: 5-10, 10-20, 20-30, 30-40, 40-50, >50.
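The bin bounds listed above can be written down as edge lists and used to classify a value. This is only a minimal sketch: the helper name `bin_index`, the use of infinity to close the open-ended last bins, and the choice that each bin includes its lower edge are assumptions for illustration, not taken from the COSP code.

```python
import bisect

# Bin edges copied from the proposal text. The open-ended last bin (">60", ">250",
# ">20") is closed with infinity here purely for illustration (an assumption).
TAU_EDGES = [0.0, 0.3, 1.3, 3.6, 9.4, 23.0, 60.0, float("inf")]
LWP_EDGES = [0.0, 10.0, 30.0, 60.0, 100.0, 150.0, 250.0, float("inf")]  # g/m2
CER_LIQ_EDGES = [4.0, 8.0, 10.0, 12.5, 15.0, 20.0, float("inf")]  # micrometres


def bin_index(value, edges):
    """Return the 0-based bin holding value, or None if it lies below the first edge.

    Hypothetical helper: bins are treated as lower-edge inclusive (an assumption).
    """
    if value < edges[0]:
        return None
    # bisect_right counts the edges <= value; subtract 1 to get the bin number,
    # and clamp so that an infinite value still lands in the last bin.
    return min(bisect.bisect_right(edges, value) - 1, len(edges) - 2)


n_tau_bins = len(TAU_EDGES) - 1      # 7
n_lwp_bins = len(LWP_EDGES) - 1      # 7
n_cer_bins = len(CER_LIQ_EDGES) - 1  # 6
```

With these edge lists, the LWP x CER histogram has 7 x 6 bins and the cloud top pressure x tau histogram has 7 tau bins, matching the shapes quoted in the proposal.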

The proposed names for the joint histogram diagnostics are:

  1. Term: clmodis
     Long name: modis_cloud_area_fraction
     Units: 1
     Description: The MODIS cloud area fraction is diagnosed from atmosphere model output by the MODIS simulator software in such a way as to be comparable with the observational diagnostics of MODIS (Moderate Resolution Imaging Spectroradiometer). Cloud area fraction is also called “cloud amount” and “cloud cover.” As seen from above, mean fraction of grid column occupied by cloud of optical depths and heights specified by the tau and pressure intervals given above. Dimensions of the histogram are cloud top pressure and cloud optical depth (7x7).

  2. Term: clmodis_liquid
     Long name: modis_cloud_area_fraction_liquid
     Units: 1
     Description: Liquid means liquid-topped clouds, as seen by the MODIS simulator. Dimensions of the histogram are cloud top pressure and cloud optical depth (7x7).

  3. Term: clmodis_ice
     Long name: modis_cloud_area_fraction_ice
     Units: 1
     Description: Ice means ice-topped clouds, as seen by the MODIS simulator. Dimensions of the histogram are cloud top pressure and cloud optical depth (7x7).

  4. Term: clmodis_cwpr_liquid
     Long name: modis_cloud_area_fraction_cloud_water_path_effective_radius_liquid
     Units: 1
     Description: Liquid means liquid-topped clouds, as seen by the MODIS simulator. Dimensions of the histogram are cloud liquid water path and cloud droplet effective radius (7x6).

  5. Term: clmodis_cwpr_ice
     Long name: modis_cloud_area_fraction_cloud_water_path_effective_radius_ice
     Units: 1
     Description: Ice means ice-topped clouds, as seen by the MODIS simulator. Dimensions of the histogram are cloud ice water path and cloud ice-crystal effective radius (7x6).

Thank you!

github-actions[bot] commented 2 months ago

Thank you for your proposal. These terms will be added to the cfeditor (http://cfeditor.ceda.ac.uk/proposals/1) shortly. Your proposal will then be reviewed and commented on by the community and Standard Names moderator.

efisher008 commented 2 months ago

Dear Brandon,

Thank you for your proposal. I have now added the names to the CF editor. Thanks for your patience as the editor had been experiencing some technical issues, which are now resolved. I have used the long names as the "interim" names (as abbreviations are not generally accepted in names aside from where the usage is well-defined), but there will need to be some discussion on the format of these, in particular the longer names, to ensure they are comprehensible and consistent with existing names.

You can view the entries here:

  1. modis_cloud_area_fraction - https://cfeditor.ceda.ac.uk/proposal/5355/edit
  2. modis_cloud_area_fraction_liquid - https://cfeditor.ceda.ac.uk/proposal/5357/edit
  3. modis_cloud_area_fraction_ice - https://cfeditor.ceda.ac.uk/proposal/5354/edit
  4. modis_cloud_area_fraction_cloud_water_path_effective_radius_liquid - https://cfeditor.ceda.ac.uk/proposal/5356/edit
  5. modis_cloud_area_fraction_cloud_water_path_effective_radius_ice - https://cfeditor.ceda.ac.uk/proposal/5353/edit

Should the matrix dimensions (i.e. numbers in brackets at the end of the description) be included in the editor entries?

Best regards, Ellie

brandonduran commented 2 months ago

Hi Ellie,

No worries about the delay. I think it is fine to omit the dimensions. Although the MODIS output differs from ISCCP output in that not all histograms share the same (7x7) shape, this would simply be reflected in the output of these proposed variables and probably does not need to be included in the editor entries.

Thanks! -Brandon

taylor13 commented 2 months ago

Without a careful analysis, it would be nice to make the names a little easier for humans to parse. For example, can somehow "liquid cloud top" be worked into the name? (I must confess I can't come up with anything that works.)

brandonduran commented 2 months ago

Perhaps the names could be modified as such, although they do become a mouthful:

  1. modis_cloud_area_fraction_liquid --> modis_cloud_area_fraction_liquid_topped
  2. modis_cloud_area_fraction_ice --> modis_cloud_area_fraction_ice_topped
  3. modis_cloud_area_fraction_cloud_water_path_effective_radius_liquid --> modis_cloud_area_fraction_cloud_water_path_effective_radius_liquid_topped
  4. modis_cloud_area_fraction_cloud_water_path_effective_radius_ice --> modis_cloud_area_fraction_cloud_water_path_effective_radius_ice_topped

Another variation of this convention could be:

  1. modis_cloud_area_fraction_liquid --> modis_liquid_topped_cloud_area_fraction etc., which retains the cloud area fraction expression, but rearranges the position of 'ice' and 'liquid' to more directly specify that we are referring to liquid- and ice-topped clouds separately.

I would strongly suggest leaving modis_cloud_area_fraction unchanged, as this diagnostic is not partitioned by cloud-top phase. At most, it could be modified to modis_total_cloud_area_fraction

taylor13 commented 2 months ago

I liked your suggestions and agree with your last remark. Let's see what others think.

efisher008 commented 2 months ago

Hi @brandonduran and @taylor13,

I agree with the variation stated in @brandonduran's post: having the liquid/ice-topped component earlier in the term, e.g. modis_liquid_topped_cloud_area_fraction, seems sensible, as it makes it clearer that the name variations distinguish between liquid- and ice-topped clouds. There are currently no names in the CF standard names table with modis as a component, so this will be an opportunity to establish a new precedent for this sort of variable going forward.

As a compromise which does not break up the modis_cloud_area_fraction string but still has a more intelligible order, how does the format {ice_topped/liquid_topped}_cloud_area_fraction{_etc.} sound?

Best, Ellie

JonathanGregory commented 2 months ago

Dear @brandonduran

Thanks for working on this.

If these quantities are histograms, the standard name should be histogram_of_X, where X is the variable that has been histogrammed. This pattern is in the guidelines, and there are a couple of existing standard names which use it. The reason for this is that a histogram of X isn't the same geophysical quantity as X itself. The histogram is a dimensionless (in the sense of being a pure number) number of counts in the bin, whereas X could have any dimension (i.e. any canonical unit), e.g. a histogram of temperature has units of 1, not K. Area fraction is dimensionless anyway, so the unit is not affected in your case, but a histogram of cloud area as a function of cloud-top pressure and cloud optical depth is not the same quantity as cloud area as a function of the same two variables. The former is a count, which could be zero or any positive integer, while the latter is a floating-point number between 0 and 1.

If these are all histograms, I suppose that (1) is histogram_of_cloud_area_fraction, (2) and (4) are histogram_of_liquid_water_topped_cloud_area_fraction, and (3) and (5) histogram_of_ice_topped_cloud_area_fraction. (Other standard names use liquid_water, not just liquid, to describe clouds.) I'm not sure I've understood this correctly, but it looks like (2) and (4) are distinguished only by the dimensions of the histogram (array dimensions, that is, not the same sense of "dimension" as above), likewise (3) and (5). Those pairs can each have the same standard name, because the physical quantity is the same. That is perfectly fine for CF, since the coordinate variables are also metadata.

Alternatively, the standard names could say what the dimensions are. The guideline allows for that as well with _over_Y; these are two-dimensional histograms, so they'd be _over_Y_and_Z, I suppose. If these were probability density functions, rather than histograms, it would be necessary to identify the dimensions, because they affect the unit. For example, the units of probability density of cloud area fraction as a function of liquid water path and effective radius are (kg m-2)-1 m-1 = kg-1 m.

Best wishes

Jonathan

brandonduran commented 2 months ago

Hi @JonathanGregory , thanks for your very thorough review. I will attempt to clarify in what follows.

These variables can all be thought of as variations of the CMIP6 clisccp variable, or what is often called FISCCP1_COSP. As such, their units are really percentages, such that summing up the histogram over all bins yields the total grid-box cloud fraction (as a percentage). In this sense, these are not true 'histograms' and are more accurately described as cloud area percentages as a function of different cloud properties; I simply have adopted the traditional nomenclature of referring to these as ISCCP or MODIS joint histograms. This artifact is a result of the observational version of these quantities, which indeed are pixel counts of clouds, representing a true histogram. Your clarification is important though, as the method in which these diagnostics have currently been implemented in the COSP code package results in them being reported in percentages, not fractions. Therefore, I will modify all that follows to represent these as cloud area percentages (%), rather than cloud area fractions (1). I'm not sure if this is a necessary change, as my understanding is that isccp_cloud_area_fraction is still reported in units of %, rather than 1, so please advise regarding this. Apologies for the confusion and mistake on my end.
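The difference between a true histogram (pixel counts) and the cloud area percentages described here can be illustrated with a small sketch. The function names, the 2x2 bin layout and the pixel numbers are all made up for illustration; this is not the COSP implementation.

```python
def counts_to_percent(counts, n_pixels):
    """Convert per-bin cloudy-pixel counts into percentages of all pixels in the grid box.

    Hypothetical helper: n_pixels counts every pixel in the grid box, clear or cloudy.
    """
    return [[100.0 * c / n_pixels for c in row] for row in counts]


def total(hist):
    """Sum a 2-D joint histogram over all of its bins."""
    return sum(sum(row) for row in hist)


# Illustrative grid box with 8 pixels: 4 are cloudy and fall into three bins.
pixel_counts = [[2, 1],
                [0, 1]]
percent = counts_to_percent(pixel_counts, n_pixels=8)

n_cloudy = total(pixel_counts)   # an integer count (a true histogram)
cloud_percent = total(percent)   # total grid-box cloud cover in %
```

Here the counts sum to the number of cloudy pixels, while the percentages sum to the total grid-box cloud cover (50 % in this made-up case), which is the property described above.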

In light of this, it seems like the histogram_of_ prefix would not be an accurate description of these variables. Your second point, however, is correct. (2) and (4), and (3) and (5) are only distinguished by the array dimensions of the 'histogram.' They represent the same physical quantity, but differ in how they are partitioned (cloud-top pressure x optical depth, cloud water path x effective radius).

Incorporating your thoughts and @efisher008 suggestions:

  1. modis_cloud_area_percentage or modis_total_cloud_area_percentage
  2. modis_ice_topped_cloud_area_percentage , coordinate variables: (cloud-top pressure x optical depth)
  3. modis_liquid_topped_cloud_area_percentage, coordinate variables: (cloud-top pressure x optical depth)
  4. modis_ice_topped_cloud_area_percentage, coordinate variables: (cloud water path x effective radius)
  5. modis_liquid_topped_cloud_area_percentage, coordinate variables: (cloud water path x effective radius)

Note that now (2) and (4), and (3) and (5), share the same standard name due to their representation of the same physical quantity. For CF, they would be distinguished by their differing coordinate variables in the metadata. I think that we should retain modis at the front of the standard names, following the standard for other COSP output, where the name of the simulator precedes the geophysical quantity (i.e., isccp_cloud_area_fraction). I am not averse to rearranging, though, given that this is the first instance of the modis qualifier.

Thanks for the discussion and happy to clarify the above further.

JonathanGregory commented 1 month ago

Dear @brandonduran

Thanks for your careful explanation. This is interesting! I'm not sure I understand yet exactly what these quantities are. "Area fraction of X" (where X is something that is either present or absent at a given location, e.g. cloud, land or sea-ice) means the area (canonical unit m2) where X is found divided by the area (also m2) considered. For example, land_area_fraction in a cell of a latitude--longitude grid means the area occupied by land in the gridbox divided by the area of the gridbox.

Your coordinates are cloud-top pressure and optical depth, not longitude and latitude, but a cloud area fraction should have the same meaning. You divide up the area of the globe into a grid of 7x7 in these quantities. You could express any quantity in these coordinates e.g.

dimensions:
  ctp=7;
  tau=7;
  two=2;
variables:
  float ctp(ctp);
    ctp:standard_name="air_pressure_at_cloud_top";
    ctp:units="Pa";
    ctp:bounds="ctp_bounds";
  float ctp_bounds(ctp,two);
  float tau(tau);
    tau:standard_name="atmosphere_optical_thickness_due_to_cloud";
    tau:units="1";
    tau:bounds="tau_bounds";
  float tau_bounds(tau,two);
  float rsut(ctp,tau);
    rsut:standard_name="toa_outgoing_shortwave_flux";
    rsut:units="W m-2";
    rsut:cell_methods="area: mean";
    rsut:cell_measures="area: ctptauarea";
  float ctptauarea(ctp,tau);
    ctptauarea:standard_name="cell_area";
    ctptauarea:units = "m2";
  float caf(ctp,tau);
    caf:standard_name="cloud_area_fraction";
    caf:units="1";
    caf:cell_methods="area: mean";
    caf:cell_measures="area: ctptauarea";

rsut is calculated by the GCM as a latitude--longitude field in the first instance. We assign each latitude--longitude gridcell to one of the (ctp,tau) cells. The area of each (ctp,tau) cell is the sum of the areas of the latitude--longitude cells that are assigned to it, and I've shown that the area is stored in ctptauarea in case it's useful. We multiply rsut in each latitude--longitude gridcell by the area of the cell, giving a quantity in W, we assign these products to the (ctp,tau) cells and add them up within those cells. Now for each (ctp,tau) cell we have a quantity in W and its area in m2. We divide the former by the latter to get the rsut in (ctp,tau) cells in W m-2. This is the area-mean TOA outgoing shortwave flux as a function of cloud-top pressure and cloud optical depth.

The GCM also produces a cloud area fraction on its latitude--longitude grid, which specifies the fraction of the area of each cell which is occupied by cloud. We assign this field to the same set of (ctp,tau) cells. Each (ctp,tau) cell has an area (the number in ctptauarea, the same as before) and a cloud area, which is the sum of the area of cloud contained in all of the latitude--longitude cells assigned to the (ctp,tau). Both of these are areas. By dividing the cloud area by the gridcell area, we obtain caf, the cloud area fraction as a function of cloud-top pressure and cloud optical depth.
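The two aggregation steps described above (area-mean rsut and cloud area fraction per (ctp,tau) cell) can be sketched as follows. The function name, the tuple layout of the input and the numbers in the example are hypothetical, and each latitude--longitude cell is assumed to have already been assigned to one (ctp,tau) bin.

```python
def aggregate_to_ctp_tau(cells, n_ctp=7, n_tau=7):
    """Aggregate latitude--longitude cells into (ctp, tau) cells.

    cells: iterable of (area_m2, rsut_w_m2, cloud_fraction, ctp_bin, tau_bin).
    Returns (area, rsut, caf) as nested lists; empty bins hold None for rsut and caf.
    """
    area = [[0.0] * n_tau for _ in range(n_ctp)]
    power = [[0.0] * n_tau for _ in range(n_ctp)]       # W
    cloud_area = [[0.0] * n_tau for _ in range(n_ctp)]  # m2
    for a, rsut, clt, i, j in cells:
        area[i][j] += a
        power[i][j] += rsut * a      # W m-2 times m2 gives W
        cloud_area[i][j] += clt * a  # fraction times m2 gives m2 of cloud
    rsut_mean = [[power[i][j] / area[i][j] if area[i][j] > 0 else None
                  for j in range(n_tau)] for i in range(n_ctp)]
    caf = [[cloud_area[i][j] / area[i][j] if area[i][j] > 0 else None
            for j in range(n_tau)] for i in range(n_ctp)]
    return area, rsut_mean, caf


# Two made-up latitude--longitude cells that both fall into bin (0, 0).
example_cells = [
    (100.0, 200.0, 0.5, 0, 0),
    (300.0, 100.0, 1.0, 0, 0),
]
ctptauarea, rsut, caf = aggregate_to_ctp_tau(example_cells)
```

In this example bin (0, 0) has an area of 400 m2, an area-mean rsut of 125 W m-2 and a cloud area fraction of 0.875; every other bin is empty.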

Is caf the same sort of quantity as the ones you are using? If so, I agree they are cloud area fractions! Actually the canonical unit is 1 and they should be called fraction, not percentage, as you first had it. % means 0.01, which is dimensionally the same as 1, so it's fine if you provide the data in units="%". Sorry about that confusion.

We probably had a discussion over ISCCP about why it's necessary to put isccp_ at the front of the name, and it's the same with MODIS. Is there some reason why a modis_cloud_area_fraction isn't the same geophysical quantity as a cloud_area_fraction in a GCM or as generally understood? If they are supposed to be comparable geophysical quantities, they ought to have the same standard name. That is the main purpose of standard names.

Best wishes and thanks for your work on this

Jonathan

brandonduran commented 1 month ago

Hi @JonathanGregory,

Here are some answers to the questions you've posed:

Why a modis_cloud_area_fraction isn't the same geophysical quantity as a cloud_area_fraction in a GCM:

They aren't the same quantity because the latter is the model native cloud fraction, whereas the former is the cloud area fraction as detected by the MODIS satellite simulator. Because the simulator is meant to reproduce what the MODIS satellite observes in reality, it is not inherently meant to capture the same quantity. In principle, the simulator should faithfully reproduce the satellite (including all of its biases), and is simulating what the MODIS satellite would retrieve when looking at the model atmosphere. It is looking at the same population of clouds captured by cloud_area_fraction, but is then reporting its own cloud_area_fraction given its method of sampling / detection. The same is true for the ISCCP simulator. Both MODIS and ISCCP cloud_area_fraction are different from GCM cloud_area_fraction and from each other because of differences in retrieval, detection, etc. Hopefully that clears up the difference.

Is caf the same sort of quantity as the ones you are using?

From your description, I believe so. In a given grid cell, the MODIS simulator captures a certain cloud_area_fraction, which is then partitioned into the bins of the histogram (i.e., ctp and cot) such that the sum over all histogram bins gives the grid cell cloud_area_fraction. From your clarification, we should keep the names as cloud_area_fraction rather than cloud_area_percentage (still units of %). Each histogram bin thus contains the area of clouds in a grid cell that fall within the respective bounds of said bin divided by the area of the grid cell.

I'll include below a truncated example of this output for one of the four histograms proposed above, as it is currently implemented in a few GCMs, in case that is useful:

dimensions:
   lat = 180 ;
   lon = 360 ;
   cosp_lwp_modis = 7 ;
   cosp_reffliq = 6 ;
   time = 12 ;
variables:
   float CLMODIS_LWPR(time, cosp_reffliq, cosp_lwp_modis, lat, lon) ;
      CLMODIS_LWPR:units = "%" ;
      CLMODIS_LWPR:cell_measures = "area: area" ;
      CLMODIS_LWPR:cell_methods = "time: mean" ;

cosp_lwp_modis (length n) and cosp_reffliq (length m) represent the midpoints of the histogram bins. Accordingly, there are complementary coordinates cosp_lwp_modis_bnds (length n+1) and cosp_reffliq_bnds (length m+1), which give the bin edges for the respective histograms. These midpoints and bounds are intended to match the corresponding quantities of the observational data.
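The relation between n+1 bin edges, n midpoints, and the (n, 2) bounds layout used in the earlier CDL example (e.g. ctp_bounds(ctp,two)) can be sketched as below. The helper names are hypothetical. Only the finite LWP edges from the proposal are used, because the upper edge chosen for the open-ended ">250" bin is not given in this thread.

```python
def midpoints(edges):
    """Return the n bin midpoints for a list of n+1 increasing bin edges."""
    return [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]


def edges_to_bounds(edges):
    """Return n (lower, upper) pairs, i.e. an (n, 2) layout, from n+1 bin edges."""
    return list(zip(edges[:-1], edges[1:]))


# Finite LWP edges in g/m2 taken from the proposal text (the ">250" bin is left out).
lwp_edges = [0.0, 10.0, 30.0, 60.0, 100.0, 150.0, 250.0]

lwp_mid = midpoints(lwp_edges)
lwp_bounds = edges_to_bounds(lwp_edges)
```

Seven edges give six midpoints and six (lower, upper) pairs here; adding an upper edge for the last bin would give the seventh.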

I'm tagging in @caseywall7926 , who helped with the development of these histograms. Casey, any additions?

Hopefully we are converging to closer agreement, but also happy to continue clarifying! Thanks!

caseywall7926 commented 1 month ago

Hi all,

Thanks for this discussion. The MODIS diagnostics represent the fraction of gridbox area occupied by clouds, so they are indeed cloud area fractions. The MODIS diagnostics each represent the cloud area fraction as a function of two different cloud properties (cloud-top pressure vs. cloud visible optical thickness or cloud particle effective radius vs. cloud water path). The MODIS diagnostics also have different variables for liquid-topped clouds and ice-topped clouds, as reported by the MODIS simulator. In other words, the MODIS diagnostics are similar to caf but are expressed as a function of different cloud properties rather than just the overall cloud area fraction.

As Brandon said, the MODIS diagnostics are very similar to the ISCCP variable clisccp, which represents the cloud area fraction as a function of cloud-top pressure and cloud visible optical thickness. (One point of clarification is that these are passive satellite instruments, so they report the cloud-top pressure of the highest cloud in the column and the total column cloud visible optical thickness for each pixel.) The MODIS diagnostics are different from clisccp because they separate the data into liquid-topped clouds and ice-topped clouds and some of the variables express the cloud area fraction as a function of cloud-particle-size vs. cloud-water-path rather than cloud-top-pressure vs. cloud-optical thickness.

Best, Casey

JonathanGregory commented 1 month ago

Dear @brandonduran and @caseywall7926

Thanks for your helpful clarifications. I think that answers my questions. These quantities are indeed cloud_area_fractions of various kinds, canonical unit 1 (but % could be used for units), and I understand why it's necessary to prefix them with modis_. I believe that we've agreed you need three modis_ standard names, then - right?

I hadn't appreciated that the fields are functions of latitude and longitude as well as the two cloud-related variables. That makes sense. I think they should be mentioned in the cell_methods as well. Imagine that the data had much higher resolution for effective radius and liquid water path. You would derive the field you have at the coarser resolution in these dimensions by aggregating many 2D cells into one 2D cell by averaging them, wouldn't you? Since the mean is a linear operation, if there is no missing data it wouldn't matter whether you did the axes separately and in which order, or both at once. If there could be missing data, it does matter. Actually the data doesn't come from a coarse-graining operation like that, but from binning to make a two-dimensional histogram. Since both dimensions are processed at once in making the histogram, I think area: mean cosp_lwp_modis: cosp_reffliq: mean would be the most accurate description of the process. If the histogramming is done timestep by timestep, and the monthly means calculated later, time: mean should appear last.
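The point about the order of averaging can be checked with a toy example. This is only a sketch with made-up numbers and hypothetical function names, where None stands in for missing data.

```python
def mean(values):
    """Mean of the non-missing values, or None if every value is missing."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def mean_both_axes_at_once(grid):
    """Average over both axes of a 2-D grid in a single step."""
    return mean([v for row in grid for v in row])


def mean_axis_by_axis(grid):
    """Average along each row first, then average the row means."""
    return mean([mean(row) for row in grid])


complete = [[1.0, 2.0],
            [3.0, 6.0]]
with_gap = [[1.0, None],
            [3.0, 6.0]]

same_when_complete = mean_both_axes_at_once(complete) == mean_axis_by_axis(complete)
same_with_gap = mean_both_axes_at_once(with_gap) == mean_axis_by_axis(with_gap)
```

For the complete grid both routes give the same mean, but with one value missing they differ (10/3 for both axes at once versus 2.75 axis by axis), which is why the cell_methods entry treats the two dimensions together.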

Best wishes

Jonathan

caseywall7926 commented 1 month ago

Hi @JonathanGregory,

The histogramming is done timestep by timestep with simulated "satellite pixel data" that are meant to be smaller than the gridbox, and monthly means of the histograms are calculated later. In case it helps to clarify, the calculation of simulated "satellite pixel data" and the averaging process are described in section 4b and 4c of Pincus et al. (2012), respectively.

JonathanGregory commented 1 month ago

Thanks, @caseywall7926. That confirms my suggestion of cell_methods="area: mean cosp_lwp_modis: cosp_reffliq: mean time: mean" for these quantities. (You didn't ask about that, I know, but I was prompted by Brandon's CDL example.)

caseywall7926 commented 1 month ago

Great, thanks @JonathanGregory for your suggestions and discussion to clarify the description of these diagnostics!

Best, Casey

efisher008 commented 1 month ago

Dear @brandonduran @caseywall7926 @JonathanGregory,

Thank you all for the detailed discussion and it looks as though you have come to a common conclusion about the description of the MODIS diagnostics represented by three standard names. Would you please summarise what the agreed format of these standard names will be, and which original names they are "descended from" in this sense? Is the following correct, and is there anything further to add to the descriptions?

modis_cloud_area_fraction: The MODIS cloud area fraction is diagnosed from atmosphere model output by the MODIS simulator software in such a way as to be comparable with the observational diagnostics of MODIS (Moderate Resolution Imaging Spectroradiometer). Cloud area fraction is also called “cloud amount” and “cloud cover.” As seen from above, mean fraction of grid column occupied by cloud of optical depths and heights specified by the tau and pressure intervals given above. Dimensions of the histogram are cloud top pressure and cloud optical depth.

modis_ice_topped_cloud_area_fraction: Ice means ice-topped clouds, as seen by the MODIS simulator.

modis_liquid_topped_cloud_area_fraction: Liquid means liquid-topped clouds, as seen by the MODIS simulator.

Best wishes, Ellie

brandonduran commented 1 month ago

Hi @JonathanGregory @caseywall7926 @efisher008 ,

I believe the 3 standard names posted above are what we agreed upon. As established, both modis_ice_topped_cloud_area_fraction and modis_liquid_topped_cloud_area_fraction have two versions that differ in their coordinate variables, but as Jonathan pointed out, they can share a common name.

The modis_ prefix descends from the isccp_cloud_area_fraction standard. To distinguish that these are cloud area fractions as seen by a specific satellite instrument simulator, not the same as cloud area fractions diagnosed by the native model, a prefix of the form <satellite name>_ is employed. Three standard names, rather than one, are proposed because the MODIS simulator is able to partition clouds by their thermodynamic phase. The ISCCP simulator does not have this capability.

Hope this covers it all! Cheers, -Brandon

efisher008 commented 1 month ago

Hi @brandonduran,

As it looks like the names and descriptions have been agreed, if no further comments or feedback have been received after 7 days, these will be accepted and published in the next version of the standard names table, which is expected to be released in August.

Thanks, Ellie

efisher008 commented 1 month ago

Dear @brandonduran,

As the 7-day period has passed, these names have now been accepted and will be released in v86 of the standard names table, scheduled for the second half of August. Thank you again for your proposal.

Best wishes, Ellie

brandonduran commented 1 month ago

Great! Thank you so much @efisher008

efisher008 commented 1 month ago

Dear @brandonduran,

This is just to make you aware that the following text has been added to the description of your three accepted MODIS names to clarify the definition of the phrase area_fraction (as decided in #24): "'Area fraction' is the fraction of a grid cell's horizontal area that has some characteristic of interest. It is evaluated as the area of interest divided by the grid cell area, or if the cell_methods restricts the evaluation to some portion of that grid cell (e.g. 'where sea_ice'), then it is the area of interest divided by the area of the identified portion." This does not require any action from you.

Best, Ellie