richfitz / remake

Make-like declarative workflows in R

error calling function #74

Closed lukemac1 closed 8 years ago

lukemac1 commented 8 years ago

My apologies if this is an error in my programming, but I am new to using remake and am hopeful it will be a useful tool for me. I've created a simple makefile to run a function:

packages:
  - sp
  - rgeos
  - rgdal
  - ineq
  - raster

sources:
  - functions/gini.calc.R

targets:
  all:
    depends: outputs/gini-coefficients-all-land-by-Bioregion(Omernik).csv

  ownfix: 
    command: load(file = "d:/BoxSync/Ownership-Analysis/data/Clean.Ownership.Map-gBuffer.rdata")

  bioregions:
    command: readOGR(dsn = I("D:/BoxSync/Ownership-Analysis/data/bioregions"), layer= I("CA_Omernik(NAD83)")) 

  gini_list_bioregions:
    command: gini.calc.bioregion.LEVEL3_NAM(bioregions, ownfix)

  outputs/gini-coefficients-all-land-by-Bioregion(Omernik).csv:
    command: write.csv(gini_list_bioregions, file = "outputs/gini-coefficients-all-land-by-Bioregion(Omernik).csv")

It seems to be working fine up until the function gini.calc.bioregion.LEVEL3_NAM(), where I get an error:

> remake::make()
[  READ ]                                                              |  # loading sources
<  MAKE > all
[    OK ] bioregions
[    OK ] ownfix
[ BUILD ] gini_list_bioregions                                         |  gini_list_bioregions <- gini.calc.bioregion.LE...
[  READ ]                                                              |  # loading packages

 Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘extent’ for signature ‘"character"’ 

That function uses a data column named "LEVEL3_NAM" from a SpatialPolygonsDataFrame (see below), which I think is causing my problem. The column name is currently hard-coded in the function; previously I passed it in as a quoted character parameter. The function works both ways in regular R, but both ways return this error when run through remake::make(). The function is here:

gini.calc.bioregion.LEVEL3_NAM <- function(regionboundaries, ownership){ 
  #regionboundaries is a spatialpolygons file associated with regionboundaries
  #ownership is spatial data that is used in assessing the gini calculation

  regionnames <- unique(regionboundaries@data[,"LEVEL3_NAM"]) #I believe this may be causing the problem as well as further below

  gini.list <- as.data.frame(matrix(0, ncol = 2, nrow = length(regionnames)))
  gini.list[,1]<-regionnames
  row.names(gini.list) <- regionnames
  colnames(gini.list) <- c("Region","Gini.Coeff")

  #running through all the regions found in the data column titled "LEVEL3_NAM"
  for (i in regionnames){
    regionbound <- regionboundaries[regionboundaries@data[,"LEVEL3_NAM"] == i,]
    inter <- intersect(regionbound, ownership)
    inter.area.list <- gArea(inter, byid=TRUE)
    gini.list[i,2]  <- ineq(inter.area.list, type = "Gini")
  } 
  return(gini.list)
}

Any suggestions? I was thinking about having the function refer to the column of interest by number rather than by character name, since the error seems to take issue with that. Maybe I'll try that tomorrow.

Maybe this is muddying the waters, but I also found other strange problems, such as this error when I used file.path() in my readOGR call:

Error in FUN(X[[i]], ...) : While processing target 'bioregions':
    Unknown special function file.path

I worked around it in that case by using I("D:/filepathhere"), which I saw in another issue. I'm a bit unclear, though, on why remake doesn't accept the file.path() function when it uses the file path passed to load() without a problem.

richfitz commented 8 years ago

I think your problem is probably the line:

  ownfix: 
    command: load(file = "d:/BoxSync/Ownership-Analysis/data/Clean.Ownership.Map-gBuffer.rdata")

load() returns a character vector of the names of the objects stored in the rdata file and, as a side effect, loads those objects into the session. remake keeps only the return value as the target, so when make() is called in future sessions the objects themselves will not be loaded.

So when calling

  gini_list_bioregions:
    command: gini.calc.bioregion.LEVEL3_NAM(bioregions, ownfix)

the ownfix here is a character vector.
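To see this behaviour outside remake, here is a minimal sketch using only base R. The object name `ownership` and the temporary file are stand-ins invented for illustration, not the real ownership data.

```r
# Hypothetical stand-in for the ownership data, saved to a temporary file
path <- tempfile(fileext = ".rdata")
ownership <- data.frame(id = 1:3, area = c(2.5, 4.0, 1.5))
save(ownership, file = path)

# Load into a fresh environment so the return value and the
# loaded object are easy to tell apart
env <- new.env()
returned <- load(path, envir = env)

print(returned)                # the name of the stored object
print(class(returned))         # a character vector
print(class(env[[returned]]))  # the object itself is a data.frame
```

The value assigned from load() is just the string "ownership"; the data frame only exists as a side effect in `env`.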

You can see what remake thinks is going on by running:

bioregions <- remake::fetch("bioregions")
ownfix <- remake::fetch("ownfix")

or equivalently:

ownfix <- remake::make("ownfix")

So, I'd define a new function:

get_from_rdata <- function(filename) {
  v <- load(filename)
  if (length(v) != 1L) {
    stop("Expected exactly one object")
  }
  get(v)
}

and have

  ownfix: 
    command: get_from_rdata("d:/BoxSync/Ownership-Analysis/data/Clean.Ownership.Map-gBuffer.rdata")
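A quick way to check the helper before wiring it into remake is to run it against throwaway files. This is only a sketch: the helper is repeated from above so the block runs on its own, and the `ownership` and `extra` objects and the temporary files are made up for the example.

```r
# Same helper as above, repeated so this block runs on its own
get_from_rdata <- function(filename) {
  v <- load(filename)
  if (length(v) != 1L) {
    stop("Expected exactly one object")
  }
  get(v)
}

# Hypothetical stand-in for the ownership data
ownership <- data.frame(id = 1:3, area = c(2.5, 4.0, 1.5))
one_object <- tempfile(fileext = ".rdata")
save(ownership, file = one_object)

# The helper returns the stored object rather than its name
result <- get_from_rdata(one_object)
print(class(result))

# A file holding two objects trips the length check
extra <- 1:10
two_objects <- tempfile(fileext = ".rdata")
save(ownership, extra, file = two_objects)
err <- tryCatch(get_from_rdata(two_objects),
                error = function(e) conditionMessage(e))
print(err)
```

Because the target's value is now the object itself, a downstream command such as gini.calc.bioregion.LEVEL3_NAM(bioregions, ownfix) should receive spatial data rather than a character vector.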

(However, I would definitely not use an absolute path here, as it will work on no one's computer other than yours, and portability is one of the aims of remake. For very large files, as you get in GIS, I'm really not sure what the best option is, but I am thinking about it.)

Your other issue is unrelated:

Error in FUN(X[[i]], ...) : While processing target 'bioregions':
    Unknown special function file.path

What this means is that you have a target bioregions whose command contains the function file.path; remake is refusing to process that rule because it's too hard to reason about. The error message is a bit cryptic because there are some "special" functions that are allowed (for example, I() can be used to force arguments that are strings to be interpreted as strings rather than filenames), and file.path is not one of them. The issue here is that remake does a lot of reasoning about what sort of things the arguments to commands are likely to be. Logic, like your file.path use, should go into the functions that you source. This does tend to proliferate a lot of small functions, but it's the only way I can see to clearly indicate intent.
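One possible shape for such a small function is sketched below. The name `read_bioregions`, its arguments, and the file it would live in are assumptions made for illustration; only readOGR(), file.path(), and I() come from the thread. The remake target is shown as a comment, and whether that exact command suits your setup is untested here.

```r
# Hypothetical helper, e.g. saved as functions/read_bioregions.R and
# listed under `sources:` in the remake file. The file.path() logic
# lives here, so the remake command stays a single plain call.
read_bioregions <- function(data_dir, layer) {
  rgdal::readOGR(dsn = file.path(data_dir, "bioregions"), layer = layer)
}

# Assumed corresponding target in the remake file (YAML, shown as a
# comment); I() marks the strings as literal values, not filenames:
#
#   bioregions:
#     command: read_bioregions(I("D:/BoxSync/Ownership-Analysis/data"), I("CA_Omernik(NAD83)"))
```

Keeping the path construction inside the sourced function means remake only has to reason about one function call with literal arguments.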

lukemac1 commented 8 years ago

Nice! It looks like it's working. Thanks! This seems like a great tool and I look forward to getting more proficient with it.