BigelowLab / fvcom

Convenience tools for accessing FVCOM datasets.
MIT License
5 stars 1 forks source link

Feature Req/Progress: Interpolation within FVCOM mesh triangles #7

Open adamkemberling opened 1 month ago

adamkemberling commented 1 month ago

Hi all!

I met with some of the Chen lab postdocs last week briefly and am now working on implementing some workflows in R based on what we chatted about. They're primarily a matlab group so I have some matlab code that I'd like to implement in R/python.

You may have already explored this functionality, but I am looking to add the ability to estimate/interpolate point values within triangles based on distance weights to the correct nodes. Ideally in a vectorized way.

They use this approach to reproject data from the newer meshes to the older GOM3 meshes, when looking to create a cohesive full timeline. It also could be applied for point extractions, or for re-projecting to a regular grid given equally spaced coordinates.

So far I am leveraging get_mesh_geometry() and st_join() to handle the triangle-to-point matching, and I'm working on calculating barycentric coordinates for the relevant nodes.

In my head the triangle matching and weights could/should probably be separate from pulling the proper nc variables at the correct time steps since the mesh coordinates don't change over time. Then maybe a second function that does the date/time node lookups and applies the weighted interpolation.

You may have already done this, so I wanted to flag it for potential collaboration/team problem-solving.

Cheers!

adamkemberling commented 1 month ago

This is the general what I was getting at. Need to write it up into generalized functions, but this was the goal: image

btupper commented 1 month ago

Perhaps a dumb question, but why not rasterize onto a regular grid?

adamkemberling commented 1 month ago

Not a dumb question.

I/We keep getting dragged into somewhat unproductive discussions around what resolution to rasterize it to, and how the regridding to raster should be performed. We have a project working at both nearshore and offshore scales and there are some tradeoffs between using a higher resolution nearshore to take advantage of the higher node densities there that might poorly represent the sparser coverage offshore. That and file size or computation speeds if we make a 4km resolution product for the whole shelf.

Then there was a second roundabout discussion on which method to use for interpolating between the points to: IDW, Kriging, what data to use to fit those, is that introducing biases that are inconsistent with the FVCOM data itself? etc.

I asked the Chen lab what methods they recommend/use when regridding and they responded by asking me why I would even want to do that. They shared that they use linear interpolation methods. And that they would interpolate GOM3 node locations for time periods covered by newer FVCOM versions to create a consistent timeseries: https://github.com/SiqiLiOcean/matFVCOM/blob/main/interp_2d_calc_weight.m

So I embarked on this journey to recreate what they use and similar triangle-based methods to avoid overthinking and needing to defend complex interpolation solutions. The interpolation from nearest nodes seems most true to the model outputs (doesn't prescribe some spatial covariance structure) and should scale to many points well. Though, I don't expect it to give meaningfully different results than rasterizing to a thoughtful resolution based on some project needs.

adamkemberling commented 1 month ago

Point of clarification*

The primary utility of this or other interpolation methods would be to extract a value or values that do not fall directly on the mesh. This can be extended to recreate a regularly spaced grid, as an alternative method to rasterizing, but primarily it would be for getting point values directly. This approach would avoid a need to first either fill that polygon or rasterize first.

btupper commented 1 month ago

So, using sf we can determine which polygon a point falls into, and then what? Use something like this? How do we go from cartesian to barycentric coordinates? If we can do that, then the interpolation looks easy-ish.

adamkemberling commented 1 month ago

Exactly, I have some messy code for doing either linear interpolation or barycentric that I need to clean up. The lead into either is the same use of sf::st_join(). The weights are method dependent and could be function options (linear, IDW, barycentric).

Then when applying the interpolation you could idealy select the relevant file or timestep and variable(s) from the nc connection, to get the interpolated values:

These names are a mess, sorry:

# Function to add the linear interpolation weights based on node coordinates
triangulation_linear_weighting <- function(pts_sf, fvcom_mesh){

    # Identify the triangles that overlap each point:
    # Use st_join to assign elem, p1, p2, p3 IDs to the points
    pts_assigned <- st_join(
      st_transform(pts_sf, st_crs(fvcom_mesh)),
      gom3_mesh, 
      join = st_within) %>% 
      drop_na(elem)

    # Iterate over the rows to add weights:
    pts_weighted <- pts_assigned %>% 
     base::split(., seq_len(nrow(.))) %>%
     purrr::map_dfr(function(pt_assigned){

      # Subset the relevant triangle from st_join info
      triangle_match <- fvcom_mesh[pt_assigned$elem,]

      # Build matrices for point to interpolate & of surrounding points:

      # Matrix for triangle
      # Use the triangles node coordinates from the sf geometries
      # Creates 3x3 matrix: row1 x coords, row 2, y coords, row three rep(1,3)
      node_vertices <- t(st_coordinates(triangle_match[1,])[1:3,1:3])

      # Make matrix from the points:
      # creates 3x1 matrix: x, y, 1
      point_coords <- matrix(
        c(st_coordinates(pt_assigned[1,]), 1), 
        nrow = 3)

      #### For Linear Interpolation:

      # Get inverse of the matrix
      inverse_coordmat <- solve(node_vertices)

      # Solve for the weights
      node_wts <- inverse_coordmat %*% point_coords %>%
        t() %>% 
        as.data.frame() %>% 
        setNames(c("p1_wt", "p2_wt", "p3_wt"))

      # Return with dataframe
      bind_cols(pt_assigned, node_wts)

    })
    # End Rowwise
    return(pts_weighted)
}

# Run for all points:
pts_weighted <- triangulation_linear_weighting(
  pts_sf = pts_df, 
  fvcom_mesh = gom3_mesh)

Then for interpolation:

####  Apply Interpolation Step  ####
interpolate_from_weights <- function(df_weighted, fvcom_nc, fvcom_varid, fvcom_siglev = 1, var_out = "interp_val"){

  # Get the values of the variable of interest as vector
  node_vals <- ncvar_get(
    nc = fvcom_nc, 
    varid = fvcom_varid)[,fvcom_siglev]

  df_weighted %>% 
    mutate(
      {{var_out}} := node_vals[p1] * p1_wt + node_vals[p2] * p2_wt + node_vals[p3] * p3_wt
  )

}

# Run the interpolation
pts_interp <- interpolate_from_weights(
  df_weighted = pts_weighted, 
  fvcom_nc = gom3_early, 
  fvcom_varid = "temp", 
  fvcom_siglev = 1,
  var_out = "surf_temp")
adamkemberling commented 1 month ago

For barycentric I was using geometry::cart2bary(). The input matrices are a little different than above, but can be constructed from the same inputs:

# For cart2bary you need this information as a "simplex", coordinates as a matrix basically
triangle_coord_test <- triangle_matches[1,] %>% st_coordinates()
triangle_coord_test <- triangle_coord_test[c(1:3), c(1:2)]

# Make the simplex for triangle, and another matrix for the the point coords
simplex_test <- as.matrix(triangle_coord_test)
point_test <- pts_assigned[1, c("lon", "lat")] %>% st_drop_geometry() %>% as.matrix()

# Get barycentric coordinates, label them
node_numbers <- triangle_matches[1,c("p1", "p2", "p3")] %>% st_drop_geometry()
as.data.frame(cart2bary(
  X = simplex_test,
  P = point_test)) %>% 
  setNames(node_numbers)

My function for this is a needlessly confusing mess ATM

btupper commented 1 month ago

Oh, nice!

Let's build on what you have and consider adding a new file in the R directory with perhaps the following functions.

cart_to_bary = function(cart, mesh)

bary_to_cart = function(bary)

interpolate_mesh = function(cart, mesh, method = "", return_form = c("bary", "cart")[2])

The casual user would only probably call interpolate_mesh() which by default returns the interpolate in cartesian coords. What do you think of that?

Do you have a fork of the repos?