NCAR / Topo

NCAR Global Model Topography Generation Software for Unstructured Grids
26 stars 17 forks source link

Interpolation to grids with <= 3 km grid spacing produces too smooth of height field #8

Open MiCurry opened 3 years ago

MiCurry commented 3 years ago

When interpolating to grids that with 3 km grid spacing, the topo tool produces a height field which is significantly lower resolution than what is expected. I belive this is caused by the double interpolation from the ~1 km GMTED2010 data to the 3 km cube sphere then to the target grid. Then, with ridge analysis the height field is smoothed even further.

Below are two plots of the height field over conus. The first, is the x5.6488066.grid.nc (15-3 km variable resolution mesh) with the 3 km refinement centered over CONUS with the height field that was created with this tool. (I believe the height field outside of the continent lines is due to a lack of landmask).

The second is GMTED2010 dataset interpolated to the x1.65536002.grid.nc (3 km quasi-uniform grid) gird using the MPAS-Model`s init_atmosphere core.

These files can be found on glade at: /glade/scratch/mcurry/cam-mpas-topo-compare

terrain mpasa15-3conus-topo-tool

terrain 3km-init_atm

Is there a way interpolate GMTED2010 to a higher resolution cube sphere?

adamrher commented 3 years ago

Hi Miles,

Peter is out this week but I can try to help. I agree the height field is too smooth, and I think it's straightforward to use a higher resolution intermediate cubed-sphere grid if that is indeed the issue. But first, can you confirm some settings in experiment_settings.make? In particular,

ncube_sph_smooth_coarse nwindow_halfwidth lregional_refinement

I think I've described to you that you need a refinement factor in the SCRIP file if lregional_refinement=T. And that the other two settings describe the smoothing radius of the base resolution (15km in your case). I'm wondering if it's using the smoothing radius at the base resolution everywhere on the grid?

MiCurry commented 3 years ago

Hey @adamrher thanks for the quick reply. That would be great if you could double check my settings. Here are my experiement settings:

ifeq ($(case),mpas_15-3-conus_Co009_ridge)
  export ncube_sph_smooth_coarse=009
  export output_grid=mpas_15-3
  export grid_descriptor_fname=$(PWD)/cube_to_target/inputdata/grid-descriptor-file/mpasa15-3.conus.desc_SCRIP.210504.nc
  export nwindow_halfwidth=006
  export rdgwin=_Nsw$(nwindow_halfwidth)
  export stitch=-stitch
  export ncube=3000
  export lregional_refinement=.true.
  export rrfac_max=007
  case_found=
endif

You can find my SCRIP file, with rrfac in /glade/work/mcurry/meshes/cam-mpas/mpasa15-3.conus.desc_SCRIP.210504.nc Calculating rrfac for an MPAS grid wasn't exactly as you specified. MPAS does not have a specific dt field for its cells. For instance, it has dcEdge which is the spherical distance between two cells, but not necessarily the cells size. Because MPAS cells are mostly 5, 6, or 7 sided shapes, its hard to calculate accurately determine the grid distance as it might be difference between different vertices between a cell.

However, what I did to calculate rrfac was the calculate the sphere distance between a random grid_corner and the grid_center of the 15-3 km scrip file and then double it, which gives a rough approximation and then divide this calculation by the max distance. An extension to this would be to do this calculation for several corners and take the average.

When looking at the result, the rrfac value was between 1.0 and 7.525. You email said to use 1 for 15 km and 15 / 3 for 3 km.

The code I used to calculate rrfac is here: https://github.com/MiCurry/calc_rrfac/blob/master/calc_rrfac.py

Let me know what your thoughts are on that.

adamrher commented 3 years ago

Those parameters look right. Which is odd b/c your smoothed file (top plot) looks like something you would expect from something closer to 28km grid. As a sanity check, would you be able to plot either the CAM-SE CONUS or the global 14km topo files for comparison?

/glade/p/cesmdata/cseg/inputdata/atm/cam/topo/se/ne30x8_CONUS_nc3000_Co060_Fi001_MulG_PF_RR_Nsw042_c200428.nc /glade/p/cesmdata/cseg/inputdata/atm/cam/topo/se/ne240pg3_nc3000_Co008_Fi001_PF_nullRR_Nsw005_20171015.nc

As an aside, I also see that for SE, we are setting ncube_sph_smooth_coarse=008 and nwindow_halfwidth=5 for the global 14km grid (ne240pg3), so I'd probably use those parameters for ur mpas-conus grid instead.

I also don't understand how this is a problem with the 1km->3km interpolation. I say that b/c you have the 3km intermediate cubed sphere topofile mapped to the mpas grid, right? And it look more-or-less what we want? If so the 3km source gives the right answer on the mpas grid, and so I suspect it's something in the 3km->target grid smoothing, i.e, cube_to_target.

Lastly, I had thought that the mpas var-res grids were derived from an analytical density function rho(x,y). If so I would think another way to compute the rrfac is to evaluate the density function at cell centers and normalize by the lowest density point on the grid. Maybe this is an outdated workflow ... I recall if from an old mpas paper.

adamrher commented 3 years ago

one more suggestion. As u r trying to just debug this, when running cube_to_target, I would turn off the ridge finder by settting: export rdgwin=_NoAniso ... That should cut down cube_to_target to closer to a half-hour execution time (instead of ~3hrs).

MiCurry commented 3 years ago

Thanks for all the help and suggestions @adamrher.

I just tried changing ncube to be 9000, which would result in a 1 km cube sphere but I'm not sure if this will have any effect as its grabbing the 3km topo-data from: /project/amp/juliob/topo-data/gmted2010_modis-ncube3000-stitch.nc. When calling make, does it only run cube_to_target and not bin_to_cube?

The tool I used to generate the plots above specifically use the MPAS grid field. Do you have scrip files for the files below for plot.nc thats in cube_to_target? I'm not able to locate one for either.

ne30x8_CONUS_nc3000_Co060_Fi001_MulG_PF_RR_Nsw042_c200428.nc
ne240pg3_nc3000_Co008_Fi001_PF_nullRR_Nsw005_20171015.nc

As an aside, I also see that for SE, we are setting ncube_sph_smooth_coarse=008 and nwindow_halfwidth=5 for the global 14km grid (ne240pg3), so I'd probably use those parameters for ur mpas-conus grid instead.

I'll give those a try out.

I also don't understand how this is a problem with the 1km->3km interpolation.

The 1 km -> 3 km interpolation might be correct, but our thinking was that if we are interpolating from a 3 km cube sphere to a 3 km target grid then perhaps we are not 100 % preserving the 3 km details of the cube sphere. i.e. the interpolation of 3 km to any grid spaces would have a resolution of lower than the 3 km cube sphere.

That's at least my interpretation of what might be happening. Perhaps because this tool computes overlap weights that it might be better at preserving detail than I suspect?

I say that b/c you have the 3km intermediate cubed sphere topofile mapped to the mpas grid, right? And it look more-or-less what we want?

My understanding was that the 3km source grid was upon a cube sphere. And cube_to_target takes this cube sphere file and interpolates it to the target grid, regardless of the target_grid. Is that correct?

Lastly, I had thought that the mpas var-res grids were derived from an analytical density function rho(x,y). If so I would think another way to compute the rrfac is to evaluate the density function at cell centers and normalize by the lowest density point on the grid. Maybe this is an outdated workflow ... I recall if from an old mpas paper.

The density function is only used when the variable resolution meshes are created and are not stored within the mesh. I believe they are somewhere on some filesystem, but using them in any means would be a lot of unnecessary work. I don't think we want to be using those.

My suggestion for calculating rrfac is to have this tool calculate it in the way it sees is best. Since the target grid is always a SCRIP file, this tool could use the fields present to calculate this field for every mesh (when lregional_refinement=True) I'd suggest this for two reasons:

  1. Reduce user error when calculating or producing it. Calculating the field on the fly will ensure the tool knows exactly what rrfac means
  2. By not adding rrfac to the SCRIP file we are leaving the SCRIP file as it is defined and not adding any non-standard fields.
adamrher commented 3 years ago

Hi Miles, I'll respond to you in detail later. But I wanted to first point you to the requested scrip files: /glade/p/cesmdata/cseg/inputdata/atm/cam/coords/ne0CONUSne30x8_scrip_c200107.nc

the ne240pg3 is not in the repo (@cacraigucar can we do this? You're going to ask me to add metadata aren't ya), but it's here: /glade/work/aherring/grids/SCRIP_files/ne240pg3_scrip_170628.nc

adamrher commented 3 years ago

I just tried changing ncube to be 9000, which would result in a 1 km cube sphere but I'm not sure if this will have any effect as its grabbing the 3km topo-data from: /project/amp/juliob/topo-data/gmted2010_modis-ncube3000-stitch.nc. When calling make, does it only run cube_to_target and not bin_to_cube?

It does not call bin_to_cube. You need to go into the bin_to_cube folder and generate the executable, and then run the run.sh shell script (this doesn't take long at all). You could experiment with ncube=9000 for debugging purposes, but it's not really a solution b/c 3km was chosen to provide SGH30 for the Beljaars pbl form drag scheme. That is, Beljaars is expecting a subgrid variance for length scales below 3km.

But also note that the "stitch" modifier is important in /project/amp/juliob/topo-data/gmted2010_modis-ncube3000-stitch.nc.. The 1km GMTED dataset has an error over the Antarctic peninsula, and so Julio stitched in a corrected topo for that region into the 3km dataset. We don't have the scripts in the TopoCESM2 to reproduce the corrected 3km dataset. I am in the process of fixing this by stitching in the correction to the 1km dataset ... and eventually I'll include the stitching scripts so that the entire package is reproducible. This also means the current SGH30 over the Antarctic peninsula is wrong, which I'm not sure anyone thought about, but even more motivation to get the fix into the 1km dataset.

The 1 km -> 3 km interpolation might be correct, but our thinking was that if we are interpolating from a 3 km cube sphere to a 3 km target grid then perhaps we are not 100 % preserving the 3 km details of the cube sphere. i.e. the interpolation of 3 km to any grid spaces would have a resolution of lower than the 3 km cube sphere.

That's at least my interpretation of what might be happening. Perhaps because this tool computes overlap weights that it might be better at preserving detail than I suspect?

It might be instructive to layout the workflow in cube_to_target. (1) remap the rrfac to the 3km grid, (2) smooth the topo fields on the 3km grid using the smoothing radius determined by ncube_sph_smooth_coarse, which refers to a smoothing radius in terms of the number 3km grid points. (3) map the smoothed field to the target grid.

So you should expect the fields to be smoother than the 3km grid (although I suppose we would need to ensure that ncube_sph_smooth_coarse >= rrfac_max) [edit for clarity - the smoothing radius is scaled by 1/rrfac_max so that there is less smoothing over the refined region ... basically we are trying to ensure that the smoothing radius is about 1.8dx everywhere, where dx is the grid spacing of the target grid, and obviously dx is smaller in the refined region]. What I think might be worth checking is whether rrfac is taking on values of on the order of rrfac_max over CONUS. If this is not defined properly than the smoothing radius will be too larger in the CONUS region, which what first thought of when I looked at your plots. If you want, you can parse through a version of cube_to_target.F90 that I have (/glade/work/aherring/grids/TopoCESM2/cube_to_target/cube_to_target.F90) where I added in a subroutine to write of the rrfac after its mapped to the 3km grid:

!+++ARH
!
! write netCDF file
!
subroutine wrt_cube(ncube,terr_cube,rrfac_cube,output_file)
  use shr_kind_mod, only: r8 => shr_kind_r8
  use shared_vars
  implicit none

You can search initials ARH to find relevant mods, and how I'm calling it in the script.

adamrher commented 3 years ago

If you want a head start for plotting the 3km grid, you could use my ncl script: /glade/work/aherring/grids/TopoCESM2/cube_to_target/output/plot_rrfac.ncl

Although its kind of messy and you'll probably have to uncomment some stuff.

adamrher commented 3 years ago

I forgot to respond to your suggestion to compute rrfac as a part of the TopoCESM2 package. Both of your points are good ones. One problem I can think of is that the spectral element grid is a special case. The SCRIP grid only provides the computational grid, which is highly anisotropic. This is because each element contains unequally spaced GLL nodes, and we invent control volumes around those nodal points that results in large differences in control volumes in a single element. Here's an example of what the control volumes look like in a particular coarse uniform spectral element grid:

se_gll_cv_grid-eps-converted-to

Because of this, I compute the rrfac on the element grid, which is not available from the SCRIP grid file. The tool that does this is in our toolkit (https://github.com/ESMCI/Community_Mesh_Generation_Toolkit), which is our feeble attempt to standardize this for cam-se grids. But I agree with you that it's not ideal for us to be messing with the well established definition of a scrip grid file.

MiCurry commented 3 years ago

@PeterHjortLauritzen @adamrher thanks again for the meeting today.

Here is the plot of rrfac for the 15-3 grid with its refinement over CONUS:

temp_topo

It uses the code in calc_rrfac and calculates rrfac using the sphere distance from the cell center to a randomly selected cell corner (the first grid corner in the cells list) and then takes the ratio of the max computed distance to each distance.

A few things to note about the plot above that might be noteworthy. First is that the SCRIP grid that we use for topography is on a unit sphere. All fields are scaled to be on a unit sphere. In standalone MPAS, we blow the fields up by an earth radius when we apply static, terrestrial fields (terrain, snow-albedo, vegetation fraction etc.).

The other thing is that all latitude and longitude values are in radians and not degrees. I had to convert the latitude and longitude into degrees. Is it possible that that might have an effect?

I think the strange banding at the top of the refined region (and other places) might be matte pattern caused by the plot, but I'm not sure. However, the several darker spots outside the refine region are consistent with the MPAS grid areas. I'm generating a plot of the MPAS grid areas field for a comparison.

Is it possible that the radian latitude and longitude values have an effect or the fact that the SCRIP grid is on the unit sphere?

adamrher commented 3 years ago

@MiCurry can you point me to the SCRIP grid file and the plot_rrfac.ncl script you used to plot this? I want to try a few things to see if the ugly parts are a plotting artifact.

Is it possible that the radian latitude and longitude values have an effect or the fact that the SCRIP grid is on the unit sphere?

The area in the SCRIP files should be in radians^2, and assumed a sphere of unit radius 1 (so the global integral should be 4 pi 1^2). The lat lon values should be in degrees. I'm looking at scrip file that we know works w/ the topo software and I can see that longitudes use the convention [0,360] instead of [-180,180]: /glade/p/cesmdata/cseg/inputdata/atm/cam/coords/ne0CONUSne30x8_scrip_c200107.nc

Which convention do your converted values correspond to?

PeterHjortLauritzen commented 3 years ago

Thanks Miles and Adam! rrfac looks reasonable. It seems likely that the lat-lon's are 180 degrees off => you might be using rough topography 180 degrees displaced (so not over the US but in the low resolution area and smooth topography in the refinement area).

MiCurry commented 3 years ago

@adamrher I'm using the scrip file found in:/glade/scratch/mcurry/rrfac. The SCRIP I used to make the plot is there as well.

The lat and long values in the scrip file are from [0, 2pi], and translated they are from [0, 360]. Perhaps we can update this tool to translate radians to degrees? We could check the units of each field and translate accordingly. SCRIP files can either have radians or degrees as units for its coordinates:

"The unit of the coordinates can be either "radians" or "degrees"."

The areas are in radians^2.

PeterHjortLauritzen commented 3 years ago

@adamrher: can we overwrite height with refinement factor on the 3km intermediate cubed-sphere so we can map height (refinement factor) to the target grid and see if it is displaced or not?

adamrher commented 3 years ago

@MiCurry the ugliness was a plotting issue. I've found that CellFill doesn't like cylindrical equidistant plots for some reason. If I change it to an orthographic projection it's much happier:

      res@mpProjection = "Orthographic"
      res@mpCenterLatF =    30.
      res@mpCenterLonF =  -100.

temp_topo mpas

adamrher commented 3 years ago

@adamrher: can we overwrite height with refinement factor on the 3km intermediate cubed-sphere so we can map height (refinement factor) to the target grid and see if it is displaced or not?

Actually I had suggested that we just output the 3km cubed-sphere rrfac, since I've hacked my cube_to_target.F90 to do that:

If you want, you can parse through a version of cube_to_target.F90 that I have (/glade/work/aherring/grids/TopoCESM2/cube_to_target/cube_to_target.F90) where I added in a subroutine to write of the rrfac after its mapped to the 3km grid:

!+++ARH
!
! write netCDF file
!
subroutine wrt_cube(ncube,terr_cube,rrfac_cube,output_file)
 use shr_kind_mod, only: r8 => shr_kind_r8
 use shared_vars
 implicit none

You can search initials ARH to find relevant mods, and how I'm calling it in the script.

@MiCurry would you be willing to copy those mods into your cube_to_target.F90? If not, we could do what Peter suggested.

MiCurry commented 3 years ago

Here's a plot that uses grid_area to calculate the rrfac (now using a sqrt 😉). Min and max values are 1 and 7.6846.

temp_topo area

@adamrher I can give either that or Peter's suggestion a try.

The Topo software is still taking quite a while to run quite longer than 30 mins (more than a few hours), even with the ridge analysis turned off. Any idea why this might be happening?

adamrher commented 3 years ago

not sure why it's taking so long. I usually run on a login node so I can track where it's at. Do you know where it is stalling?

The area approach seems to be a more straightforward method, and gives marginally cleaner results. Setting rrfac_max=5 will clean up the refinement zone so it's all green.

adamrher commented 3 years ago

Hi @MiCurry, I think you should def. proceed with writing out rrfac on the intermediate cubed sphere grid via the mods I pointed you to ... but, to help debug, I made you a 1.5km intermediate cubed sphere dataset if you want to experiment with that as well: /glade/scratch/aherring/for_miles/gmted2010_bedmachine-ncube6000.nc

I'd set ncube_sph_smooth_coarse twice as large ~ 16 and nwindow_halfwidth to 16/sqrt(2) ~ 11.

This is with the new CAM topo, so the Antarctic peninsula issue is fixed. That means you also have to update the code as I merged some changes necessary to run this dataset last week.

MiCurry commented 3 years ago

Hi @adamrher, yes I think outputing rrfact and seeing if it looks correct is a good next step. And great! Thanks for creating that. I'll give that a try as well (perhaps next week).

If you point me to a branch, I can merge in your code changes.

MiCurry commented 3 years ago

I think I may have found the issue:

Even with lregional_refinment = .true. in experiment_settings.make, lregional_refinment at this point is .false.:

https://github.com/NCAR/Topo/blob/c36f768735c8e24105fde4197bbacdc4e0e9bfcf/cube_to_target/cube_to_target.F90#L200

The following is being reported in make:

NO refinement: RRFAC = 1. everywhere
 MINMAX RRFAC RAW MAPPED FIELD   1.0000000000000000        1.0000000000000000

Its simply not being set in the nlmain.nl and not being read in when nlmain.nl namelist is being read in. Here is the contents of nlmain.nl at that point:

&topoparams
intermediate_cubed_sphere_fname = '/shared/wmr/mcurry/cesmdata/topo/gmted2010_modis-ncube3000-stitch.nc'
lsmooth_on_cubed_sphere = .true.
grid_descriptor_fname = '/mmmtmp/mcurry/topo-temp/cube_to_target/inputdata/grid-descriptor-file/SCRIP_ne0np4CONUS.ne30x8.nc'
ncube_sph_smooth_coarse = 060
ncube_sph_smooth_fine = 001
lstop_after_smoothing = .true.
lfind_ridges=.false.
luse_multigrid = .true.
luse_prefilter = .true.
/

Here is the experiment_settings.make I'm using:

ifeq ($(case),mpas_60-3-cal_ridge)
  export ncube_sph_smooth_coarse=033
  export output_grid=mpas_60-3
  export grid_descriptor_fname=$(PWD)/cube_to_target/inputdata/grid-descriptor-file/mpasa60-3.california.SCRIP.nc
  export nwindow_halfwidth=023
  export rdgwin=_Nsw$(nwindow_halfwidth)
  export stitch=-stitch
  export ncube=3000
  export lregional_refinement=.true.
  export rrfac_max=020
  export rdgwin=_NoAnsio
  case_found=
endif

However, it looks like lregional_refinement is being output into final_CONUS_30_x8_nc3000_Co060_Fi001_MulG_PF_nullRR_v02_Nsw042.nl, but that namelist does not match the one that is read in above (and is not read in).

Does someone else want to confirm this?

adamrher commented 3 years ago

ruh roh. I'll try and run the arctic grid when I have time and update this issue w/ what I find. glad you found the issue!

MiCurry commented 3 years ago

@adamrher Actually I think I was wrong with that. Not sure why its not working for my current case.. But let me know if you see it work or not work for you.

MiCurry commented 3 years ago

Running the artic case I see that lregional_refinment is set to .true., things work fine, but then running my MPAS case lregional_refinment is set to .false..

MiCurry commented 3 years ago

Sorry about the delay getting back, I was taking PTO on Thursday and Friday. I re-plotted rrfac on the cube sphere and got the same as what I got before. I think what is occurring is that terrain is being interpolated to the cube sphere and then it is being limited to the specified max rrfac, 20. Hence, we see values of 20 on all the land masses, 0 on the ocean, and negative terrain values (e.g the Caspian sea). During the run, the following is reported:

 MINMAX RRFAC SMOOTHED  -27.022120715640462        5397.1788561107951
 MINMAX RRFAC FINAL  -27.000000000000000        20.000000000000000

Hence confirming my suspicions.. This function call is most likely incorrect:

https://github.com/NCAR/Topo/blob/c36f768735c8e24105fde4197bbacdc4e0e9bfcf/cube_to_target/cube_to_target.F90#L355-L364

rrfac-cube

For that run, I got the following terrain, which is what we have seen before:

ter-60-3km-conus-camTopo-rrfac

Out of curiosity I tried running the 60km - 3 km without lregional_refinemnet = True and I got something with better resolved terrain. ncube_sph_smooth_coarse=010 and nwindow_halfwidth=007 and with ncube_sphere_smooth_coarse=005 and nwindow_halfwidth=007 is even better:

ter-60-3km-conus-camTopo-norrfac-5sm-3nwin

For comparison, here is the interpolated terrain for the 60-3 km using standalone MPAS.

ter-60-3km-conus-standAlone-mpas

The global plots of the standalone MPAS and CAM Topo run with no regional regiment i.e. outside of the 3 km region.

MiCurry commented 3 years ago

Here are the global plots of the CAM topo tool with lregional_refinement off compared to standalone MPAS interpolation:

ter-60-3km-global-camTopo-norrfac

And a comparison stand-alone MPAS (the landmask of this file is not taken into account):

ter-60-3km-global-standAlone-mpas

adamrher commented 3 years ago

@MiCurry Ah, I didn't realize that the plot that looks like a land fraction plot is actually the rrfac. That went over my head last week. But I see how that can happen if rrfac is taken on the values of terrain height.

The way smooth_intermediate_topo_wrap is supposed to work for the code block you've highlighted, is it smoothes rrfac since rrfac is too choppy after we map it from the SCRIP grid to the 3km intermediate cube sphere. The subroutine takes the first argument ("rrfac_tmp") and smoothes it according to the smoothing radius ("NSCL_c"), with the radius modified by the last argument which is the refinement factor ("rrfac_tmp"). In my mind the problem is either (1) for some reason it is not running this routine and grabbing the smoothed .dat file of the terrain height instead, or (2) rrfac_tmp does not contain the refinement factor, but rather the terrain height for some reason.

What does the min/max value of the raw mapped rrfac give you in the logs (see relevant code block below) ? If it seems like it's the values of terrain height, then maybe it's not reading in rrfac from the SCRIP file correctly in shared_vars.F90, called target_rrfac.

  allocate( rrfac(ncube,ncube,6)  )
  if (lregional_refinement) then
    !--- remap rrfac to cube
    !-----------------------------------------------------------------
    do counti=1,jall
      i    = weights_lgr_index_all(counti)!!
      !
      ix  = weights_eul_index_all(counti,1)
      iy  = weights_eul_index_all(counti,2)
      ip  = weights_eul_index_all(counti,3)
      !
      ! convert to 1D indexing of cubed-sphere
      !
      ii = (ip-1)*ncube*ncube+(iy-1)*ncube+ix!
      !
      wt = weights_all(counti,1)
      !
      rrfac(ix,iy,ip) = rrfac(ix,iy,ip) + wt*(target_rrfac(i))/dA(ix,iy)
    end do
  else
     rrfac(:,:,:) = 1.
     write(*,*) " NO refinement: RRFAC = 1. everywhere "
  endif
  write(*,*) "MINMAX RRFAC RAW MAPPED FIELD",minval(rrfac),maxval(rrfac)
!---ARH