Closed tim-salabim closed 4 years ago
I've been calling it the sfverse - well trying not to call it that as I know the risks of pre-emptive labelling:
This era (which we refrain from labelling the sfverse with any seriousness, awaiting a better name!) clearly has the wind in its sails and is set to dominate future developments in R's spatial ecosystem for years to come.
But I think `sfverse` is better than your suggestions - too close to `pverse`, which some people think this idea is! `sverse` or `geoverse` are also good options I think.
Reflecting on the options I've seen/considered I think `geoverse` would be the best at communicating the intentions of such a move if it were to happen.
I don't see why the word should end in -verse, we're not subordinate to another verse are we?
Also, spatial (space) is wider than geo (Earth) - lots of spatial data is not Earth-bound.
True but it could be a useful HT (tip of the hat, but in a non macho, pro collaborative way) to the tidyverse which clearly inspired the idea.
I suspect people wishing to map the surface of Pluto would not be put off by the word geo if they are not put-off by the word Geographical in QGIS, as seems to be the case for at least one person.
But I do see that there are spatial applications, e.g. cell biology, which are not geographical. `spatialverse` would resolve that issue and be pretty unambiguous. BTW rapid prototyping @tim-salabim, nice!
@Robinlovelace as I say above, this was done a few months ago. In any case, I think it is more important to think about content first than to try to find a suitable name.
Although I recently watched Donald Knuth's keynote speech at UseR, in which he spoke of the importance of names. Discussing names can help clarify what the aim is, although I can see that it risks bikeshedding.
Thanks to Edzer for introducing me to that concept - only just discovered it has an interesting etymology!
Another dimension is the external lib dependencies (proj4, GDAL, GEOS, etc.). I've updated the first comment accordingly (bullet-point 3.)
Since you mentioned docker in bullet point 3, there is also rocker/geospatial. And thanks for starting this issue!
Heads-up: I've just registered https://github.com/geoverse before anyone (non R-y) does in the event that it's useful. Happy to pass ownership of this over to someone currently in r-spatial. I think `install.packages("geoverse")` followed by `library(geoverse)` to load a selection of R packages that work together nicely (e.g. without `dplyr::select` boshing `raster::select`) would be really useful.
@tim-salabim I believe you created r-spatial. 1 option would be to transfer stuff to geoverse - I have no issue with passing over the keys :key:
I suggest to first create these packages that work together nicely; see also https://github.com/tidyverse/tidyr/issues/360 .
Why do we need another organisation repo for a geoverse package?
We probably don't - you could just have `r-spatial/geoverse` I guess. Just a suggestion in the hope it would be useful. Agree with Edzer that the priority should be getting stuff that works. Hopefully it will be simpler and therefore less prone to issues than the tidyverse but that will be quite a mission.
What's in a name? A lot actually, but seeing as things have gravitated here I'm now thinking that it could cause more confusion than clarity to create a new org so I take back my proposal - orgs are easy to delete! On that note, what is the procedure by which pkgs go here? Could be useful to have an 'onboarding' procedure. The current pkgs certainly look like some of the building blocks of a geoverse.
I will not start thinking about an "onboarding" procedure before we have received five non-trivial requests for packages wanting to join the r-spatial org.
Note that rOpenSci has also just expanded their scope to explicitly include geospatial packages. Competing for packages shouldn't be a goal, so it would be worth thinking about how to harmonize with them if there are packages that rightly belong in rOpenSci and the geo/spatialverse (or whatever it ends up being called).
@ateucher being active in both communities, what would you suggest as to when a package rightly belongs in one or the other?
@edzer it's hard to say. rOpenSci's scope is definitely much wider - they aren't trying to create a 'verse in that their packages aren't explicitly designed to work together for a single purpose, whereas the tidyverse packages are. Is this the goal of the spatialverse as well? I.e., is it a suite of spatial packages that are designed to work really well together? Or is it a suite of packages that constitute the basic framework for spatial analysis regardless of style (E.g., would raster and sf coexist in the spatialverse, or would we be waiting for stars)?
I don't think that answers your question, but it's hard to say until we know what the spatialverse really represents...
Thanks! You're right, we're looking into the stars...
Here's a request for a pkg: a geoverse that would include, in the 1st instance, raster, and sf. I imagine stars would get added to that, as would plotting pkgs such as mapview, tmap and leaflet.
Sound like a plan? Should be a relatively lightweight pkg that could build on the experience of tidyverse, the core script of which seems to be this: https://github.com/tidyverse/tidyverse/blob/master/R/attach.R
Not sure if that is trivial or not but I'd be up for helping. @edzer would you recommend waiting to see how things pan out with stars and other things before proceeding with this and to everyone, do you think such a pkg would be best placed here or elsewhere?
Great to talk about things before doing them and think there is latent demand for this geoverse idea. If done well I think it could make methods for handling and plotting a variety of spatial data forms more accessible to more R users, an aim I'm sure we all share.
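For what it's worth, the attach step of such a lightweight pkg could be sketched roughly as below. This is only a minimal sketch and not the code in tidyverse's attach.R: the function name `attach_core()` is hypothetical, and the demo call uses the base package `tools` plus a deliberately non-existent package so that it runs without any spatial packages installed.

```r
# Hypothetical helper (name and behaviour are assumptions): attach a fixed
# set of packages, skipping those that are not installed.
attach_core <- function(pkgs) {
  # requireNamespace() returns FALSE (quietly) for packages that are missing
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  # Packages already on the search path do not need attaching again
  attached <- paste0("package:", pkgs) %in% search()
  to_attach <- pkgs[installed & !attached]
  for (pkg in to_attach) {
    suppressPackageStartupMessages(library(pkg, character.only = TRUE))
  }
  # Report what was attached and what could not be found
  invisible(list(attached = to_attach, missing = pkgs[!installed]))
}

# Demo with a base package and a package name that should not exist
res <- attach_core(c("tools", "notARealPackage123"))
print(res)
```

In a real metapackage the vector passed in would be whatever core set gets agreed on (e.g. sf and raster in the 1st instance), and the call would sit in `.onAttach()`.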
I would like to see it before I comment, so please go ahead, develop a package, and share with everyone why you think it is useful. Reasons why I would not do this now are (i) raster and sf don't work together, (ii) by calling something xxverse, what exactly would users reasonably expect, and (iii) is life really so complicated that users need such a package? I always forget what is in dplyr and what is in tidyr, there tidyverse helps, but is this the case for spatial?
Will users expect tidyverse-like behavior? For sf this can work to some extent, but for array data the underlying assumption that data consists of a simple sequence of records does not hold. @mdsumner 's tidync might have solutions. Raster's select and dplyr's select are incompatible.
Thanks for the comments and agree it would probably be worth waiting until sf and raster work together before doing it. The fact that there is a name clash between the `select` functions is precisely the kind of issue that I would hope such a metapackage could resolve.
In answer to the 3 points: (i) point taken - any news on latest thinking on this welcome, or where to keep an eye on developments in this area? (ii) users can expect the Earth (or the stars ; ) but I think reasonable expectations would include consistency between and clarity about functions (e.g. decide between `raster::crs` and `sf::st_crs` - I'd favour the latter), zero name clashes (or at least a way of dealing with and warning users about them consistently) and an effort to document how the different pkgs in the 'verse' can be used together; (iii) I don't think it's a matter of need so much as accessibility and user friendliness and trying to make life easy for people who are not experts in programming and namespace memorisation.
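To make the "warning users about name clashes" point a bit more concrete, here is a minimal sketch of how clashes could be detected. The helper `find_clashes()` is hypothetical, and the export lists are toy data for illustration, not the real exports of any package.

```r
# Hypothetical helper: report names exported by two or more packages.
# `exports` is a named list with one character vector of names per package.
find_clashes <- function(exports) {
  long <- data.frame(
    pkg = rep(names(exports), lengths(exports)),
    name = unlist(exports, use.names = FALSE),
    stringsAsFactors = FALSE
  )
  # Group package names by exported name, keep names seen more than once
  by_name <- split(long$pkg, long$name)
  by_name[lengths(by_name) > 1]
}

# Toy export lists (illustrative only)
toy <- list(
  pkgA = c("select", "filter", "mutate"),
  pkgB = c("select", "crs", "extract"),
  pkgC = c("extract", "st_crs")
)
clashes <- find_clashes(toy)
print(clashes)
```

For installed packages the input could be built with `getNamespaceExports()`, e.g. `find_clashes(lapply(c(stats = "stats", utils = "utils"), getNamespaceExports))`, and a metapackage could print the result when it is attached.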
I find the pipe a really useful feature of the tidyverse and, as illustrated in these slides that @Nowosad and I put together, it can help improve readability: http://robinlovelace.net/presentations/spatial-tidyverse.html#1
My default plan is to hold fire, wait for further info/feedback, if people agree would be useful, could hack together such a package and 'submit' it here. Of course the proposal may get shredded at that stage which is fine but the hope would be that in the shredding process improvements or an alternative to the proposed pkg solution will be found.
For reference, I just listened to the quarterly ROpenSci call where they talked about editor work and code reviews. One side-topic was related to their scope so I asked how narrow/wide their geospatial scope is. @sckott was so kind to point out a few of their key foci which he summed up here (3rd page, second last bullet).
There he also mentions a blog post with more detail on the geospatial efforts they undertook(-take).
Other points I gathered from the answer provided in the call:
One question I had was about naming prospective geoverse packages. I am considering creating an sf extension package that calculates polygon shape metrics. Should the package have `sf` in the name similar to `ggplot` extensions? Should functions have the `st_` prefix?
I am not dogmatic when it comes to package names, but both make some sense, yes.
Out of interest: @tim-salabim would you also be interested in creating such a pkg later down the line? Asking as you started this thread and seem to have a pretty clear vision of what it would be like. The diagram at the top of this thread is really useful for visioning the advantages of such a pkg and I'd be happy to contribute to such a vision/pkg as an alternative to hacking one together myself. On that note I've plenty of work to be getting on with rather than gazing at the stars, transitioning stplanr to support sf: 12 down 9 to go!
At some stage, yes. Though currently I feel that we're not ready for such a thing. Breaking it down into the two major data models, I think vector focused things are coming along nicely, but I feel that there is still a void regarding raster focused analysis tools (other than the raster package) - not talking about stars here, rather from my vis perspective (e.g. performant raster rendering is still too immature). One vision I had when calling for r-spatial was to get people like @bhaskarvk and @timelyportfolio on board to pitch in their expertise to extend the dev possibilities of us geo-folk. I think this has worked out ok so far. Yet, focussed effort towards proper and stable tools is hard with a bunch of part-time hackers.
Personally, I am very curious about what the stars project will evolve into as the full workflow from input to output/vis is in the scope right from the beginning and it is (I guess) partly embedded in a larger research project with quite some funding. @edzer correct me if I'm wrong...
In a nutshell, I am hesitant to think about a meta-package before we have a solid block of modules to constitute such a thing. Tying together loose ends is fine, but they should be ends, not unfinished "somewhere-in-the-middles".
In my opinion, things have progressed substantially since the last activity here (e.g. stars is way more mature, raster updates, terra on its way, all the recent developments by @SymbolixAU, ... the list goes on I'm sure). Still, I cannot clearly envision a meta-package that would bundle all these developments neatly and take away the burden of having a good overview of what is available from the user.
I feel we should leave this open, as the discussion here is very informative and a solid base to build upon. Who knows, maybe there will be some spark that leads to something substantial someday...
As a hint, one could think about releasing a `geoverse` package bundling certain core infrastructure packages such as
via the "Depends" section of a DESCR file like we do in the {mlr3verse} package.
Thanks for the hint! Technically realizing this is IMO the smallest problem, but this "verse" would only be meaningful if there were agreement on which packages go in, and which do not (your "etc."): raster? terra? RStoolbox, which builds on sp and rgdal? To answer this we need to identify
My suggestion is to start working on these questions before doing the package that suggests having the answer.
Raster and vector are the two most common data types so I think packages that handle those, work well together and are future proof would be key. It's still not crystal clear to me which raster processing package is most future-proof and compatible with sf, which I would see as the cornerstone of such a metapackage. Couple of questions about specific packages:
- `sf::gdal_utils()` has some raster processing capabilities and I think the GDAL components of `sf` overall supersede `rgdal`. What can `rgdal` do that `sf` cannot?
- `raster` is still the most feature complete and stable so my starting point would tentatively be just `sf` and `raster`, see how it goes and add additional dependencies later down the line.
- `stars`?
My suggestion is to start working on these questions before doing the package that suggests having the answer.
Definitely.
My thoughts:
Only put the common ones everybody needs for spatial tasks into DEPENDS (because they are auto-attached)
the extended core could list packages which should be installed when calling `install.packages("geoverse")` but are not automatically attached
I am somewhat concerned about packages with a large dependency chain such as {RStoolbox} or {sentinel2r}. When having such in IMPORTS, installation will also install all recursive dep. We can discuss which pkg still qualifies and which not but I just want to raise awareness here.
Such pkgs could go into "SUGGESTS" since they are not automatically installed when calling `install.packages()` but only for `install.packages(dependencies = TRUE)`.
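To illustrate the three sections, here is a rough sketch that writes such a DESCRIPTION file with `write.dcf()` and reads it back. Which packages sit in which field below is purely an assumption for illustration, not an agreed list, and the title and version are made up.

```r
# Sketch only: package placement, title and version are assumptions.
desc <- data.frame(
  Package = "geoverse",
  Title = "Hypothetical Meta-Package for Spatial Workflows",
  Version = "0.0.0.9000",
  Depends = "sf, stars",     # installed and attached by library(geoverse)
  Imports = "mapview",       # installed with the package, but not attached
  Suggests = "RStoolbox",    # only installed with dependencies = TRUE
  stringsAsFactors = FALSE
)

# Write the fields in DCF format (the format of DESCRIPTION files)
path <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = path)

# Read it back to check the fields
fields <- read.dcf(path)
print(fields[1, c("Depends", "Imports", "Suggests")])
```

Keeping heavy packages in Suggests means a plain `install.packages("geoverse")` would not pull in their recursive dependencies.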
On a side note, I'd like to mention that CRAN accepted the {mlr3verse} package with the label "exception". Therefore idk if a {geoverse} or {spatialverse} will be accepted. This might trigger more fields to create a wrapper package just for loading. However, if there is one "exception", everybody should be allowed to do so. And there is also the {tidyverse} pkg.
I'd say reducing duplication of functionality between `geoverse` packages should be a priority.
I'd say reducing duplication of functionality between geoverse packages should be a priority.
This is always welcome but a completely different point here.
I'd say if there is an overall interest to do this then @edzer @tim-salabim @rsbivand could maybe take the lead in narrowing down a list for the three sections of the DESCR file outlined above. To not lose overview, a gdoc or similar might be a better place to do this than a GH issue.
One point to consider here is that, at least in my understanding, tidyverse is to some degree aimed at novices coming to R. Though not officially stated, many design choices would suggest so. With this in mind I think the scope of such a meta package is much clearer than what we are envisioning here so far. I don't think we can serve everyone's interest with one-package-to-rule-them-all.
With this in mind I think the scope of such a meta package is much clearer than what we are envisioning here so far. I don't think we can serve everyone's interest with one-package-to-rule-them-all.
That is why I am unsure the package will be accepted by CRAN. Nevertheless it could at least live on GH. For {mlr3} we have a package suite that works together and even needs to be loaded in a combined way to actually use the functionality of single pkgs. Pkgs in the {geoverse} are actually standalone and do not really follow an overarching design philosophy.
`sf` and `stars` are not standalone and do follow an overarching design, and so do `sp`, `rgdal` and (to some extent) `raster`.
@jl5000 This is completely off-topic and has been discussed at several places already. If you can't find any, feel free to open an issue in the {mlr3} repo. (Please mark these comments as off-topic)
An interesting discussion, and one heard again and again outside the R world too. It is for sure beyond my R capabilities, but nevertheless some thoughts...
Of course there are some strong arguments for a streamlined architecture (which maybe finally leads to the click button "solve me"...)
In my experience and opinion, a large part of the mentioned problems and confusion results from the professional and technical ignorance of the users (not only the R users!) with regard to spatio-temporal concepts, as well as poor knowledge of possible, reasonable and resilient approaches to solve them. A "verse" approach like tidyverse is good and simple for established workflows but will not solve these shortcomings.
In fact, I see no reason to create a geo/spatio/whatever verse because even a lot of the underlying libraries, GI software packages and spatio-temporal concepts are not homogenized or comparable. The multitude of competing and established algorithms are not even mentioned here. Partly the above argumentation is struggling around that point why and what to integrate.
From my point of view they have not been and will not be homogenized (and thus integrated into a "verse") because beyond vanities there are a lot of good reasons to have, know and use different and competing concepts.
And for this it needs experience and knowledge.
If we want to support users, then it should be through comparative tutorials, workshops, and learning and teaching offers that take care of the technical implementation of conceptual and scientific solutions. I believe approaches like Geocomputation with R are more sustainable than a streamlined meta package.
My take is smaller packages means more flexibility and power. The monoliths do so much but at some point don't work and you are endlessly plumbing around their assumptions (though yes they work for most). Focus on tiny packages that do one thing, so others can choose for themselves.
I think in summary there are too many subgroups, and it is too complicated to decide which packages would belong to the r-spatial "core" and who makes the decisions/maintains such a package in the long run.
Probably too much overhead for the gain in the end? I'll close here for now, feel free to re-open if more discussion is wanted.
This idea has come up several times, most notably in https://github.com/r-spatial/rspatial_spark/issues/5 by @jhollist and in https://github.com/r-spatial/discuss/issues/11#issuecomment-310889249 by @Robinlovelace.
I think it's worth a separate issue, so here we go.
The idea is simple (and quite intriguing): to have an equivalent to the `tidyverse` meta-package for spatial analysis needs. Assuming we would consider creating such a meta-package, I think the most obvious and important question is:
I have made an earlier attempt at visualising this (heavily adapted from the tidyverse)
If we can answer this (i.e. if we can somehow agree on a core set of packages for spatial workflow) then there are obviously other questions that arise:
`spatial` enough as a common denominator?
I leave this here, as I think many more questions will arise but it is only feasible to think about those if there is any solid plan to introduce a `spverse`, `spidyverse`, ...