r-spatial / discuss

a discussion repository: raise issues, or contribute!

FOSS4G 2022 r-spatial workshops ? #53

Open Bakaniko opened 2 years ago

Bakaniko commented 2 years ago

Hi,

The next FOSS4G will be held in Florence (Italy) on August 22-28, 2022.

In 2019, I set up an r-spatial workshop using teaching materials I already had, and it had some success. Before that, there had not been any R-related content at FOSS4G for years.

Last year there was an r-spatial panel that went great.

The call for workshop proposals is open (until ?). Workshops are 2 or 4 hours long.

I think we can propose 2:

What do you think? Who plans to attend the conference?

rsbivand commented 2 years ago

What is the format and proposal deadline (looks like 28 February)? Could we think of a 4-hour intro and 2 2-hour advanced (one as you suggest, another on links to OSGeo libraries or evolution generally or R/GRASS or QGIS processing or migration away from retiring packages)?

@edzer and I were considering being there to work on evolving the ecosystem (the less brutal expression for retiring rgdal/rgeos/maptools during 2023), and this of course is linked to evolving the R/GRASS interface https://github.com/rsbivand/rgrass/issues/34 and other issues on that repo.

Bakaniko commented 2 years ago

Yes, the submission deadline has been set to 28/02/2022.

Format is pretty open, materials can be provided in all sorts of ways (website, github repos). I think there will be breaks every 2 hours (I remember the 2019 workshops to be 3 hours long with a middle break).

I'm ok to propose an intro workshop (following r-spatial guidelines) but I don't think I can do advanced ones (I'll be happy to learn more about links to OSGeo libraries) so I'll leave it to more experienced people.

In 2019, I gathered the materials in a bookdown repo: https://github.com/Bakaniko/FOSS4G2019_Geoprocessing_with_R_workshop It can, of course, be improved considerably.

Where do you think we should put the materials? Under the r-spatial organisation? Elsewhere?

edzer commented 2 years ago

I'm also planning to attend FOSS4G. Workshops seem to be Aug 22+23.

rsbivand commented 2 years ago

A possible topic for a 2-hour workshop could be "Modernizing the R-GRASS interface: confronting barn-raised OSGeo libraries and the evolving R.*spatial package ecosystem" with an abstract something like:

"Introduces the current status of the rgrass package, after reviewing over 20 years from GRASS for pre-6 GRASS, spgrass6 for GRASS 6, rgrass7 for GRASS 7, and rgrass for GRASS 7.8 and subsequent versions. There are two modes of operation, starting R in a running GRASS session in an existing GRASS location, or starting a fresh GRASS session from R. This GRASS within R session may use an existing location, or instantiate a temporary location to make GRASS analytical tools available to R users from the R prompt. All GRASS commands and most GRASS extensions can be scripted from R, because the flags and parameters are published by each such command in XML, and are cached in the R interface package.

Since the legacy rgdal package used in spgrass6 and rgrass7 for making intermediate copies of spatial data being moved between the environments is being retired in 2023, rgrass needs to step away from this dependency, and find an alternative also providing GDAL file input-output for raster and vector data. Consequently, to understand the development path of the R-GRASS interface, some description of the package ecosystem is also needed."

@veroandreo does this seem viable? Any suggestions welcome!

veroandreo commented 2 years ago

I like the idea and the proposal! Since it's a workshop proposal, I'd add some examples of the exercises or demos that will be shown. For example: "We'll show different ways to connect R and GRASS, and different ways to read and write especially raster data via x, y and z, their advantages and disadvantages" (or so... I'm thinking of your new dev branch including terra and stars here).

rsbivand commented 2 years ago

Thanks! Suggestions very sensible - as things stand for https://github.com/rsbivand/rgrass/issues/42 it looks as though Suggests: terra is easiest, with coercion to and from "SpatVector" and "SpatRaster" up to the user; "SpatialGridDataFrame" stays as a special case for raster because the use of r.in.bin and r.out.bin is established.

Would you recommend OSGeoLive, or rather ask participants to pre-install GRASS, R and R packages? What is your experience?

veroandreo commented 2 years ago

Dealing with participants' various systems, versions and configurations is always painful. At some in-person FOSS4Gs in the past, OSGeo provided bootable OSGeoLive sticks with the latest stable versions of all OSGeo software. This edition of FOSS4G is supposed to be hybrid, so I'm not sure how they'll handle that. At last year's online FOSS4G, some workshop trainers sent us information to forward to participants, with instructions and/or links for downloading a VM or data. If, when submitting the proposal, you state the minimum versions of GRASS, R and R packages to be installed beforehand, it should be fine, I believe. Maybe even include a simple test so participants can make sure everything works in advance (start GRASS, open R within the terminal, run library(rgrass) or so). Most of the problems I've had while teaching GRASS were related to Windows, Python versions, and installing external Python dependencies, rather than to the R part. Make sure though, especially for the Windows case, that the connection with R works with both the standalone and OSGeo4W installers.
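A pre-workshop check along the lines suggested above could look something like the sketch below. It uses only base R. The minimum version numbers and the launcher name `grass` are placeholders (assumptions, not values from this thread) and would need to be replaced with whatever the workshop announces.

```r
# Hypothetical minimum versions; replace with those announced for the workshop
required <- c(rgrass = "0.3.0", terra = "1.5.0")

# Report whether a package is missing, too old, or fine
check_pkg <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return("missing")
  }
  if (utils::packageVersion(pkg) < min_version) "too old" else "ok"
}

status <- vapply(names(required),
                 function(p) check_pkg(p, required[[p]]),
                 character(1))
print(status)

# Is a GRASS launcher on the PATH? The executable name varies by platform
# and version, so "grass" here is only a guess.
grass_path <- Sys.which("grass")
if (nzchar(grass_path)) {
  message("GRASS launcher found at ", grass_path)
} else {
  message("No GRASS launcher named 'grass' found on the PATH")
}
```

Participants would run this once before the session and report any entry that is not "ok".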

rsbivand commented 2 years ago

Short workshop proposal submitted.

Bakaniko commented 2 years ago

This is a proposal for a 4-hour workshop aimed at beginners

title

duration: 4 hours

Abstract (10-50 words)

R is an open source language and environment for statistical computing and a major player in the field of data science. The R spatial ecosystem is very rich and provides tried and tested algorithms to work with vector or raster data. This workshop will teach you the basics of using R and the key points of handling spatial data within R.

Description (50 - 300 words)

This workshop is aimed at beginners. It will start with the basics of the language and the Tidyverse grammar. The first part will be on data frames and data handling. Then we will concentrate on the spatial libraries: how to load vector data and perform common spatial operations (spatial filtering, intersection, spatial joining and geometry operations such as buffers and centroids), and how to export and reproject vector data. Then we will produce maps with tmap.

The third part will be on raster data related libraries (stars? terra?).

@Nowosad, can you complete this, please?

rsbivand commented 2 years ago

Looks good! Just as a minor point, terra does not, I think, suit tidyverse, so avoiding tidyverse verbs and pipes will be less confusing for those wishing to use both sf/stars and terra. I'd go with 1 hour of base R (briefly explaining S3/S4 classes, as sf/stars are S3 and terra is S4), then 3 hours of R.spatial (where `.` is sf/terra inclusive). As terra is S4, the set of argument classes matters, and assuming that only the first argument class controls dispatch (as in tidyverse) is confusing.
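The S3/S4 point can be illustrated with toy classes in base R. This is only a sketch: the names `describe`, `overlay_demo`, `Vec` and `Ras` are made up for illustration and are not taken from sf, stars or terra.

```r
library(methods)

# S3: the method is chosen from the class of the first argument only
describe <- function(x, y) UseMethod("describe")
describe.default <- function(x, y) "S3 default method"
describe.vec <- function(x, y) "S3 method for class 'vec'"

v <- structure(list(), class = "vec")
r <- structure(list(), class = "ras")

s3_a <- describe(v, r)  # "S3 method for class 'vec'"
s3_b <- describe(r, v)  # "S3 default method": the class of y is ignored

# S4: a method is registered for a signature covering several arguments
setClass("Vec", representation(n = "numeric"))
setClass("Ras", representation(n = "numeric"))
setGeneric("overlay_demo", function(x, y) standardGeneric("overlay_demo"))
setMethod("overlay_demo", signature("Vec", "Ras"),
          function(x, y) "method for (Vec, Ras)")
setMethod("overlay_demo", signature("Ras", "Vec"),
          function(x, y) "method for (Ras, Vec)")

s4_a <- overlay_demo(new("Vec", n = 1), new("Ras", n = 1))
s4_b <- overlay_demo(new("Ras", n = 1), new("Vec", n = 1))

# No method was defined for (Vec, Vec), so S4 dispatch fails
s4_c <- tryCatch(overlay_demo(new("Vec", n = 1), new("Vec", n = 1)),
                 error = function(e) "no method for (Vec, Vec)")
```

Swapping the arguments changes which S4 method runs, whereas the S3 generic never looks at the second argument.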

Nowosad commented 2 years ago

Hi @rsbivand -- thanks for the comments. I agree that we should avoid using pipes for that kind of workshop.

Nowosad commented 2 years ago

@Bakaniko my edits:

Title

Duration: 4 hours

Abstract (10-50 words)

R is an open-source language and a major player in the field of data science. The R spatial ecosystem is very rich and provides tried and tested tools for working with spatial vector or raster data. This workshop will introduce R and its packages for handling spatial data.

Description (50 - 300 words)

This workshop is aimed at beginners. The first part will focus on data frames (table-like objects) and how to handle tabular data in R. We will start with the basics of the R language and the grammar of a family of additional R packages called tidyverse.

The second part of the workshop will concentrate on the spatial analysis of vector and raster data. It will start by showing how to create maps with the tmap package.

Then, several workflows related to spatial vector data will be discussed: how to read spatial vectors, perform common spatial operations (spatial filtering, intersections, spatial joining, and geometry operations - buffers, centroids), how to reproject vector data, and how to export it. Next, we explain an ecosystem of packages related to raster data handling and provide a few examples of spatial raster operations.

Finally, pointers to additional materials will ensure that participants know where to get help and how to take confident next steps after the workshop.
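As a rough idea of what the vector part described above might cover, here is a minimal sketch using sf and the North Carolina sample data that ships with the package. The dataset, the target CRS and the 10 km buffer distance are choices made only for this illustration, not part of the proposal.

```r
library(sf)

# Read the sample shapefile bundled with sf
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Reproject from geographic coordinates to a projected CRS in metres
nc_proj <- st_transform(nc, 32119)  # EPSG:32119, NAD83 / North Carolina

# Geometry operations: county centroids and a 10 km buffer around the first one
cent <- st_centroid(st_geometry(nc_proj))
buf <- st_buffer(cent[1], dist = 10000)

# Spatial filtering: counties that intersect the buffer
near <- st_filter(nc_proj, buf)

# Spatial join: attach county names to the centroid points
cent_sf <- st_sf(id = seq_along(cent), geometry = cent)
joined <- st_join(cent_sf, nc_proj["NAME"])

# Export the filtered counties to a GeoPackage in a temporary file
out <- tempfile(fileext = ".gpkg")
st_write(near, out, quiet = TRUE)
```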

edzer commented 2 years ago

I will propose something along these lines; happy to receive comments!

title

Analyzing large amounts of imagery with openEO, openEO Platform, R, stars, and gdalcubes.

duration: 2 hours

Abstract (10-50 words)

We present a number of novel open source software components for cloud-based analysis of large amounts of satellite imagery. In particular, the openEO API and processes, their implementation in openEO Platform, and several relevant R components (including packages stars, openeo, and gdalcubes) are discussed and demonstrated.

Description (50 - 300 words)

Very large imagery datasets, such as those arising from the Copernicus or Landsat programs, are often more easily analysed in the cloud than locally. Besides big-tech-backed platforms such as Google Earth Engine and Microsoft Planetary Computer, an independent and federated solution is growing around the open source openEO API and software ecosystem. We will discuss and demonstrate several components, including the now publicly available openEO Platform, the openeo (client) software, the ability to run user-defined functions on the platform(s), as well as the R packages stars, for handling and analyzing vector and raster datacubes in R, and gdalcubes, for creating datacubes from image collections, and the role of STAC (the spatiotemporal asset catalogue) in these efforts.

rsbivand commented 2 years ago

For the record, my proposal is:

Title: Modernizing the R-GRASS interface: confronting barn-raised OSGeo libraries and the evolving R.*spatial package ecosystem

Abstract: Introduces the current status of the R-GRASS GIS interface package "rgrass". There are two operation modes, starting R in a running GRASS session in an existing GRASS location, or starting a fresh GRASS session from R. All GRASS commands and most GRASS extensions can be scripted from R.

Description: We'll show different ways to connect R and GRASS, and different ways to read and write especially raster data via chosen R packages, their advantages and disadvantages. There have been many changes in the OSGeo library ecosystem used both by GRASS GIS and R.spatial packages (https://r-spatial.org, https://rspatial.org), so the tutorial will introduce the interface to those for whom it is new, and update users of the interface on ongoing changes related to changes both in the OSGeo library ecosystem and the R.spatial package ecosystem.

Examples will include using GRASS GIS functionality to extend what is feasible in R alone when handling and analysing spatial data, and using R as an alternative to Python for scripting GRASS GIS commands.

Information on the versions of GRASS, R and R packages which should be installed on participants' computers will be provided, together with a simple test to check that installation has succeeded. Data sets will also be provided.

Language: en

Main topics of the workshop: Desktop, Analysis

Level of the workshop: Intermediate

Requirements for the Attendees: Some knowledge of R and GRASS will be helpful, as in a short format it is not possible to start from the beginning.

Bakaniko commented 2 years ago

Hi,

Thanks @Nowosad and @rsbivand for your comments. Actually, I'm not very good at base R and I'm a heavy user of pipes. I also think that we should focus on one raster package so we can avoid confusion.

Maybe we should focus on one ecosystem, sf + stars, so that we present only S3 classes and can use the tidyverse grammar.

I plan to explain that, unlike Python, there are several paradigms (base R, tidyverse, data.table) and that they all have their qualities, but that in order to keep the workshop coherent we will show just one. In the same way, I propose to show only tmap but explain that there are other ways to draw maps.

On the other hand, if @edzer proposes a workshop with stars, we can diversify with terra.

edzer commented 2 years ago

The workshop I propose does not focus on stars, but on the problem of analyzing large image collections from an R perspective.

rsbivand commented 2 years ago

Tidyverse is your choice, but I know that it limits users strongly, binding them into a slow and hard-to-debug syntactic shim on top of base R. For pipes, native pipes and right-assignment can be OK, but debugging magrittr pipes is hard, because you don't see the intermediate dot objects. Users need to understand data structures, I think: https://github.com/rsbivand/ban421_h19.
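To make the pipe remark concrete, the toy sketch below runs the same computation three ways in base R: with the native pipe (which needs R >= 4.1.0), with named intermediate objects, and with right-assignment at the end of a pipeline. The data are invented for illustration.

```r
x <- c(4, 9, 16)

# Native pipe: compact, but the intermediate result is never named
res_pipe <- x |> sqrt() |> sum()

# Explicit intermediate objects: each step can be inspected while debugging
roots <- sqrt(x)
res_steps <- sum(roots)

# Right-assignment at the end of a native pipeline
x |> sqrt() |> sum() -> res_right

print(c(pipe = res_pipe, steps = res_steps, right = res_right))
```

All three give the same value; only the step-by-step version leaves `roots` behind for inspection.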

Bakaniko commented 2 years ago

Thanks for your comments and thanks @rsbivand for this resource, it is very rich, even for me!

I'll stick with Jakub's proposal. @Nowosad, what requirements should we ask for?

I propose:

I also propose recommending the installation of RStudio.

Maybe we can propose using OSGeoLive, since R and some spatial libraries are included? I can build a VirtualBox image based on it to share.

rsbivand commented 2 years ago

See the comment from @veroandreo on OSGeoLive. Note that Windows R 4.2 is UCRT; on that platform spatial packages seem to be OK (checking fine, worked with Tomas Kalibera last year), but your scripts will need checking.

Maybe also check scripts on macOS arm64, I can help with arm64 and UCRT checking. It also looks as though RStudio is going to support UCRT when R 4.2 is released. OSGeo4W is not UCRT-aware, and will risk problems soon, with Windows 11 coming soon.

luukvdmeer commented 2 years ago

FYI, we also submitted a proposal for a 2h workshop to introduce the sfnetworks package, focusing on attendees that already have at least basic knowledge of R and r-spatial, or work with spatial networks in Python and want to experience what R has to offer.