datacarpentry / r-raster-vector-geospatial

Introduction to Geospatial Raster and Vector Data with R
https://datacarpentry.org/r-raster-vector-geospatial

Curriculum Advisory Committee Recommendations from 2022 Q1 Meeting (Help wanted!) #368

Open srappel opened 2 years ago

srappel commented 2 years ago

Hello maintainers and fellow Carpentries community members!

The Data Carpentry Geospatial Curriculum Advisory Committee had our 1st quarter meeting on March 29th and we have posted our minutes on the Curriculum Advisors repo.

You can view the minutes here, but I've copied the relevant recommendations below. We are calling for volunteers to help implement these important changes to the lessons. I will label this issue as "Help Wanted" and look forward to your contributions!

Please feel free to reach out to me or my co-chair Jeff Hollister (@jhollist) if you have questions about these recommendations or if you would like to bring something to our attention for future meetings of the CAC. We also encourage you to reach out to the maintainers of the lessons as you develop.

  1. Transition from PROJ and proj4strings
    • Lessons should be updated to reference coordinate systems by their EPSG codes from the EPSG Geodetic Parameter Dataset instead of using proj4strings. The Committee agreed that the alternative Well-known text (WKT) representation is unwieldy and unnecessary for most common use cases. WKT should be mentioned as an alternative to EPSG codes, especially where there is no existing EPSG standard. Lessons should include examples of converting between the EPSG and WKT representations.
  2. Deprecation of the sp, rgeos, and rgdal packages
    • The Committee agreed that the references to rgdal and rgeos be removed or replaced with references to the equivalent sf and terra functions as appropriate. This decision is closely paired with the rationale for choosing terra as the replacement for the raster package and aims to avoid code-breaking deprecations coming some time in 2023.
  3. Transition from raster to terra or stars
    • The terra package appears to be the most direct replacement for raster, as it uses language that is similar to raster and common to other GIS. The Committee recommends that terra be adopted as a replacement for raster. The stars package should be presented as an alternative to terra that may be faster in some cases or more appropriate for analyses with longitudinal elements.
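Recommendation 1 asks for examples of converting between the EPSG and WKT representations. A minimal sketch of what such an example might look like with sf is shown here; the EPSG code 32618 (WGS 84 / UTM zone 18N) and the variable names are chosen purely for illustration and are not taken from the lessons.

```r
library(sf)

# EPSG code -> CRS object -> WKT string.
# 32618 is an illustrative code, not one prescribed by the Committee.
crs_utm <- sf::st_crs(32618)
wkt_utm <- crs_utm$wkt
cat(wkt_utm, "\n")

# WKT string -> CRS object. When the WKT carries an authority ID,
# sf can usually report the matching EPSG code again.
crs_back <- sf::st_crs(wkt_utm)
crs_back$epsg
```

Where no EPSG code exists for a coordinate system, the WKT string can be passed to `sf::st_crs()` in the same way, and `$epsg` is then expected to be `NA`.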

gklarenberg commented 1 year ago

Is anyone already working on the raster to terra conversions? I can help with that (I have been doing the conversions for my own classes as well, and I am very familiar with the NEON data). And I support using terra here, not stars. The latter is good for more advanced users, but for an intro course, terra is appropriate.
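For readers unfamiliar with what a raster to terra conversion involves, here is a rough sketch of a few common one-to-one replacements. It uses a small in-memory raster with made-up extent values so that it runs without the lesson data; the raster equivalents named in the comments are the usual counterparts, not a list taken from the lessons.

```r
library(terra)

# Illustrative raster; with a file, terra::rast("some_file.tif")
# takes the place of raster::raster("some_file.tif").
r <- terra::rast(nrows = 10, ncols = 10,
                 xmin = 731000, xmax = 732000,
                 ymin = 4713000, ymax = 4714000,
                 crs = "EPSG:32618")
terra::values(r) <- seq_len(terra::ncell(r))

terra::crs(r, describe = TRUE)          # instead of raster::crs()
terra::ext(r)                           # instead of raster::extent()
terra::nlyr(r)                          # instead of raster::nlayers()
terra::global(r, "mean")                # instead of raster::cellStats()
r_ll <- terra::project(r, "EPSG:4326")  # instead of raster::projectRaster()
```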

jebyrnes commented 1 year ago

Happy to help! I’ve been working with terra a lot recently (and tidyterra - which would be worth weaving in as it means we can take away the painful step of having to convert to a data frame for plotting, etc. so one can use geom_spatraster() and geom_spatraster_rgb()) and really loving it. So fast! And just as easy (or easier with tidyterra https://dieghernan.github.io/tidyterra/ )

-Jarrett Byrnes
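
As a rough illustration of the tidyterra suggestion above, the sketch below plots a SpatRaster directly with geom_spatraster(), without first converting it to a data frame. The raster is a small in-memory example with made-up values and extent, not lesson data, and the colour scale is an arbitrary choice.

```r
library(ggplot2)
library(terra)
library(tidyterra)

# Illustrative raster standing in for a lesson dataset.
r <- terra::rast(nrows = 10, ncols = 10,
                 xmin = 731000, xmax = 732000,
                 ymin = 4713000, ymax = 4714000,
                 crs = "EPSG:32618")
terra::values(r) <- seq_len(terra::ncell(r))

# geom_spatraster() takes the SpatRaster itself as its data argument.
p <- ggplot() +
    tidyterra::geom_spatraster(data = r) +
    scale_fill_viridis_c(na.value = NA)
print(p)
```

The data frame route would instead call as.data.frame(r, xy = TRUE) and pass the result to geom_raster() with explicit x, y and fill aesthetics.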


srappel commented 1 year ago

It's so exciting to watch all these changes roll in!

albhasan commented 1 year ago

Good evening,

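The changes made so far addressed @srappel's issues 2 & 3. Regarding issue 1, Transition from PROJ and proj4strings, I took a look at the lesson's data and found the following (the checks are in the script below):

  • Re-projecting the rasters using their EPSG codes doesn't change their CRSs at all.
  • One vector file is missing its EPSG code.
  • Some vectors have Z coordinates, but all of them are 0.
  • The vectors' CRS WKT changes after re-projecting using EPSG codes.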

I guess we could replace the vector data with re-projected versions without the Z coordinates. That way, we could start looking at issue 1 knowing that the lesson's data isn't a source of PROJ strings. However, the lesson data isn't currently under version control, which raises a new question: could we also host the lesson's data alongside the lesson? Or, even better, could we build an R package with the data? It wouldn't need to be on CRAN; it could be hosted in a public git repository and installed using devtools::install_github.

Bests,

Alber

#!/usr/bin/Rscript --vanilla
###############################################################################
# Check the data for the Carpentries lesson r-raster-vector-geospatial 
###############################################################################

library(dplyr)
library(purrr)
library(sf)
library(terra)
library(tibble)
library(utils)

#---- Setup ----

data_dir <- "~/Documents/github/datacarpentry/r-raster-vector-geospatial/episodes/data"

stopifnot("Data directory not found!" = dir.exists(data_dir))

#---- Utilities ----

get_crs_wkt <- function(crs) {
    return(crs$wkt)
}

get_epsg <- function(crs) {
    return(crs$epsg)
}

get_input <- function(crs) {
    return(crs$input)
}

has_z <- function(obj_sf) {
    return(!is.null(sf::st_z_range(obj_sf)))
}

has_m <- function(obj_sf) {
    return(!is.null(sf::st_m_range(obj_sf)))
}

#---- Raster data ----

raster_tb <-
    data_dir %>%
    # list.files() expects a regular expression, not a glob.
    list.files(pattern = "\\.tif$",
               full.names = TRUE,
               recursive = TRUE) %>%
    tibble::as_tibble() %>%
    dplyr::rename(file_path = value) %>%
    dplyr::mutate(obj = purrr::map(file_path, terra::rast),
                  obj_crs = purrr::map_chr(obj, terra::crs),
                  obj_crs1 = purrr::map(obj_crs, sf::st_crs),
                  epsg = purrr::map_int(obj_crs1, get_epsg),
                  epsg = purrr::map2_chr("EPSG:", epsg, paste0),
                  new_obj = purrr::map2(obj, epsg, terra::project),
                  new_crs = purrr::map_chr(new_obj, terra::crs),
                  crs_diff = purrr::map2_dbl(obj_crs, new_crs, utils::adist))

print("NOTE: Re-projecting rasters using EPSG codes doesn't change their CRSs at all.")
raster_tb %>%
    dplyr::select(obj_crs, new_crs, crs_diff) %>%
    print(n = Inf)

#---- Vector data ----

vector_tb <-
    data_dir %>%
    # list.files() expects a regular expression, not a glob.
    list.files(pattern = "\\.shp$",
               full.names = TRUE,
               recursive = TRUE) %>%
    tibble::as_tibble() %>%
    dplyr::rename(file_path = value) %>%
    dplyr::mutate(obj = purrr::map(file_path, sf::read_sf),
                  obj_crs = purrr::map(obj, sf::st_crs),
                  crs_wkt = purrr::map(obj_crs, get_crs_wkt),
                  epsg = purrr::map_int(obj_crs, get_epsg),
                  has_z = purrr::map_lgl(obj, has_z),
                  has_m = purrr::map_lgl(obj, has_m),
                  obj_no_z = purrr::map(obj, sf::st_zm),
                  crs_input = purrr::map_chr(obj_crs, get_input))

print("NOTE: There is a vector missing its EPSG code.")
vector_tb %>%
    dplyr::filter(is.na(epsg)) %>%
    dplyr::select(file_path, epsg) %>%
    dplyr::mutate(file_path = basename(file_path)) %>%
    print(n = Inf)

print("NOTE: There are some vectors with Z coordinates, but all of them are 0s")
vector_tb %>%
    dplyr::filter(has_z) %>%
    dplyr::mutate(file_path = basename(file_path),
                  z_range = purrr::map(obj, sf::st_z_range)) %>%
    dplyr::select(file_path, has_z, z_range) %>%
    print(n = Inf) %>%
    dplyr::pull(z_range)

# Add missing EPSG by hand.
vector_tb <-
    vector_tb %>%
    # Use the integer literal 4326L so both if_else() branches share the
    # integer type of the epsg column (older dplyr versions require this).
    dplyr::mutate(epsg = dplyr::if_else(crs_input == "WGS 84" & is.na(epsg),
                                        4326L, epsg)) %>%
    # Re-project using EPSGs.
    dplyr::mutate(new_obj = purrr::map2(obj_no_z, epsg, sf::st_transform),
                  new_crs = purrr::map(new_obj, sf::st_crs),
                  new_crs_wkt = purrr::map(new_crs, get_crs_wkt),
                  crs_diff = purrr::map2_dbl(crs_wkt, new_crs_wkt,
                                             utils::adist))

print("NOTE: The CRSs' WKT change after projection using EPSG codes.")
vector_tb %>%
    dplyr::select(file_path, crs_diff)

tobyhodges commented 1 year ago

Thanks so much for the fantastic work here, @albhasan.

Regarding the versioning of the example data: the example dataset is published on FigShare, where there is the option of creating a new version of the record if and when the files change. I think the record is owned by NEON, but I would be happy to try to coordinate with them to publish a new version.

Finally, a suggestion: as you have addressed most of the points raised by the CAC, it might be best to close this issue and open a new one where the specific question of how to update the dataset can be discussed further. I'll be happy to re-post my comment there if you do.

albhasan commented 1 year ago

Hi @tobyhodges,

I'm following your suggestion and I opened #426. Can you please re-post your comment there?

Thanks,