zoonproject / zoon

The zoon R package
Other
61 stars 13 forks source link

copying old attributes breaks raster processing #357

Open goldingn opened 7 years ago

goldingn commented 7 years ago

I just spotted this in MESSMask whilst working on the paper.

In process modules that act on rasters the attributes of the raster are captured at the beginning of the module then replaced at the end of the module. Unfortunately some of the processes applied to the raster are stored in the attributes, such as which cells are NA. These changes are therefore undone before returning the module

MRE

```r # simple workflow work1 <- workflow(occurrence = UKAnophelesPlumbeus, covariate = UKBioclim, process = Background(n = 70), model = LogisticRegression, output = NoOutput) # current MESSMask LoadModule('MESSMask') # version that doesn't replace attributes MESSMask2 <- function (.data) { zoon:::GetPackage('dismo') # Dont allow attributes to be dropped org_attrs <- attributes(.data$ras) # Ignore warnings because they're just a bunch of NA issues suppressWarnings( mess <- dismo::mess(.data$ras, v = .data$df[, attr(.data$df, 'covCols')], full = FALSE) ) # Extreme values become NA values(mess)[values(mess) < 0] <- NA .data$ras <- mask(.data$ras, mess) # Add original attributes # attributes(.data$ras) <- c(attributes(.data$ras), org_attrs) return (.data) } # run both ras <- MESSMask(work1$process.output[[1]])$ras ras2 <- MESSMask2(work1$process.output[[1]])$ras # plot both par(mfrow = c(1, 2)) plot(ras[[1]]) plot(ras2[[1]]) ``` ![image](https://cloud.githubusercontent.com/assets/4450731/21297920/ac25835c-c5dc-11e6-857c-a3d24aca90f6.png)

I know we went through a couple of design options for making sure attributes are passed on, but I can't remember what they were.

Could we instead have a list of allowed attributes, where only these are enforced? We could store that in a helper function for module developers to make this clearer. E.g. listing what's allowed:

ZoonAttributes()
[1] "some attribute" "another allowed attribute" "etc."

getting the attributes

temp_attributes <- ZoonAttributes(.data)

and setting them

ZoonAttributes(.data) <- temp_attributes

what do you think @AugustT?

AugustT commented 7 years ago

I changed this feature in MESSmask specifically because the presence of the na.omit attribute caused errors further down the line, I didn't realise removing the attribute 'undid' the setting of NAs! Good to know! I can't now remember exactly what the problem was but it was something to do with the na attribute suggesting there was missing data when there wasn't , perhaps after a Chain? Sorry I cant be more help

AugustT commented 7 years ago

Seeing as I can't remember the original issue your solution sounds fine. But just watch out for times where the na attibute might cause problems.

goldingn commented 7 years ago

OK cool, if there was a specific MESSMask issue, I'll just see if I can fix that for now.

goldingn commented 7 years ago

This is fixed for MESSMask now. The solution I went with was to only put back into the attributes those which were missing after processing (in this case the call path):

  # replace any attributes lost during processing
  end_attrs <- attributes(.data$ras)
  which_missing <- which(!names(org_attrs) %in% names(end_attrs))
  attributes(.data$ras) <- c(end_attrs, org_attrs[which_missing])