ropensci / osmplotr

Data visualisation using OpenStreetMap objects
https://docs.ropensci.org/osmplotr
135 stars 21 forks source link

Should load sf namespace when returning sf object #46

Closed dmurdoch closed 5 years ago

dmurdoch commented 5 years ago

This thread https://stat.ethz.ch/pipermail/r-package-devel/2019q1/003473.html shows a problem: the osmplotr::extract_osm_objects() function can return an object inheriting from class "sf" even when the sf package with methods for that class is not loaded. This means that methods are not found, and (in this example) subsetting the object produces a bad result. (The full example is in https://gist.github.com/brry/7728b9b2d35afad7f1fc5978c3315009).

I can understand that you might not want to load sf unconditionally, but I believe this problem would be solved if any function that returns "sf" objects first does something like

if (!requireNamespace("sf"))
  warning("'sf' package is not available so \"sf\" methods will not work.")

If the load is successful, the methods will be defined.

mpadge commented 5 years ago

Yeah, thanks, you're right. I implemented a similar warning in osmdata, but did not propagate it through to osmplotr - shall now do!

dmurdoch commented 5 years ago

I'm not sure the warning in osmdata is working if this is on the master branch. I just started a new session, installed it, and ran

library(osmdata)
hampi_sf <- opq ("hampi india") %>%
            add_osm_feature (key="historic", value="ruins") %>%
            osmdata_sf ()
methods(class = "sf")

This produces a message that no methods are defined. However, we have

> class(hampi_sf$osm_points)
[1] "sf"         "data.frame"

so there is an "sf" object in there. If I manually run requireNamespace("sf") it returns TRUE, and afterwards there are lots of methods for the class:

> methods(class = "sf")
 [1] [             [[<-          $<-           aggregate     as.data.frame cbind         coerce        identify      initialize   
[10] merge         plot          print         rbind         show          slotsFromS3  
see '?methods' for accessing help and source code
mpadge commented 5 years ago

It's only used, and only necessary, with trim_osmdata(), and uses this function (noting that requireNamespace only confirms that a package is installed, but what we need to is to check whether it's namespace has been loaded, and that requires the code shown there). I just need to copy that over to osmplotr and employ it in appropriate place.

dmurdoch commented 5 years ago

I think you are misunderstanding the R internals. When you call requireNamespace("sf"), that checks if it's loaded and if not attempts to load it, and returns TRUE if it is loaded at the end. FALSE indicates it is not loaded and the load attempt was unsuccessful, typically because the package is not installed, but there can be other reasons for it to fail.

Your is_sf_loaded function tests something different: it tests whether the package is on the search list, i.e. loaded and attached. That means that users can call methods from sf without any sf:: prefix.

In order for class "sf" methods to work, you only need the package loaded, not attached.

It may be that trim_osmdata needs sf on the search list. I'd recommend against requiring that, but that's up to you. (You can avoid putting it on the search list and use the sf:: prefix on every function you call from it.)

However, the issue that I reported is to fix problems for other users: since osmplotr::extract_osm_objects and osmdata::osmdata_sf return objects that contain things with the "sf" class, your users should expect them to work properly, and should be warned if they won't. They'll only work properly if sf is loaded, and that's what requireNamespace("sf") attempts to achieve.

mpadge commented 5 years ago

Sorry for any confusion in my previous response. I understand your point and agree with it in principle, but what I should have mentioned was that both osmplotr and osmdata are intended to function without sf dependency (osmdata merely Suggests: sf, osmplotr not at all). This is a strict principle because an sf dependency is a huge thing, and there are always cases in which / people for whom this is too burdensome. These packages are as lightweight as possible, and so internalize their own versions of much sf functionality. (And a lot of the tests in both cases ensure ongoing compliance with current versions of sf.)

This is why there can be no requireNamespace calls, but only checks to see whether sf is loaded, because things like sf subset calls only work within other packages with sf loaded (in turn because CRAN forbids explicit namespace addressing of internal functions - no :::). That means in this case that there is nothing that can be done about this problem, nor as far as I can see it at present is there actually any way to issue an appropriate message, because the subsetting operation is an sf operation and not at all related to any direct osmplotr functionality. Does this make sense?

dmurdoch commented 5 years ago

What you say makes sense given your assumptions, but your assumptions are incorrect.

  1. Using sf methods for base functions like subsetting with [] does not require that sf is on the search list. sf just needs to be loaded, and then R will automatically dispatch to the methods.

  2. You don't check if sf is loaded, you check if it is on the search list. A check for whether it is loaded is "sf" %in% loadedNamespaces(). But using requireNamespace() does this check for you.

  3. requireNamespace() is designed specifically for the case where you have a conditional dependency on a package. Normally you would list sf in Suggests:, and then test for it before each use. But it is acceptable to continue if sf isn't present, provided you can handle that case (which you can).

So what I'm suggesting you do is to make sure sf is in Suggests:, then try to load it using requireNamespace each time you create an object inheriting from "sf". If the load fails, then give the user a warning and continue on as now.

One possibility that this doesn't handle is a user who has sf installed, but for some reason doesn't want to load it. That could also be handled, but I don't really see the point. It's a pretty dangerous approach, since all those sf methods won't be found, and the generics will do the wrong thing. But if you want to allow this behaviour, it would be possible, you'd just need to work out the API to ask for it.

brry commented 5 years ago

For completeness: see also https://github.com/r-spatial/sf/issues/970

edzer commented 5 years ago

An example of sf's conditional dependence on dplyr, using this mechanism, is found here.

mpadge commented 5 years ago

Thanks @brry, @dmurdoch, @edzer - i agree that a package that produces sf objects should do it's best to ensure that sf methods work, so now have a packageStartupMessage if !requireNamespace (sf):

#> NOTE: 'sf' package must be loaded for 'sf' methods to work