zoonproject / zoon

The zoon R package

add weights and offsets to df #286

Open goldingn opened 8 years ago

goldingn commented 8 years ago

The Poisson point process model (PPM) approach to presence-only SDM is likely to become increasingly popular, since it allows MaxEnt-style handling of background data in all sorts of models. I'm working on an R package, ppmify, to facilitate this, though the guts could easily form a standalone module.

Fitting a PPM in glm-style software requires specification of an offset argument, with different offsets applied to presence and background data. Fitting these models would therefore require an offset column to be added to df.
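As a rough illustration of what a per-row offset looks like in practice, here is a minimal sketch using base R's `glm()`. The toy data, the `temp` covariate, the `total_area` and `eps` values and the offset values themselves are all assumptions made up for this example (loosely following a log-area convention); this is not ppmify's actual output or zoon's module interface.

```r
# Toy data (hypothetical): value is 1 for presence rows, 0 for background rows
set.seed(1)
n_pres <- 50
n_bg <- 500
df <- data.frame(
  value = c(rep(1, n_pres), rep(0, n_bg)),
  temp = c(rnorm(n_pres, mean = 1), rnorm(n_bg, mean = 0))
)

# Assumed offsets: log of a tiny area for presence rows, log of an equal
# share of the study area for background rows (illustrative values only)
total_area <- 100
eps <- 1e-6
df$offset <- ifelse(df$value == 1, log(eps), log(total_area / n_bg))

# Poisson GLM with the per-row offset; extra iterations are allowed because
# the very different offsets can make the default starting values poor
m <- glm(value ~ temp,
         family = poisson(),
         data = df,
         offset = df$offset,
         control = glm.control(maxit = 100))
coef(m)
```

The point is only that the offset has to travel with each row of df, which is why a column is a natural place for it.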

Similarly, likelihood weights are a useful tool for accounting for suspicious datapoints and for uncertainty in the spatial location of points. Adding a weights column to df would therefore be helpful too. The default values for these columns would be df$offset <- 0 and df$weight <- 1, which is what they would be set to if not otherwise defined.
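A minimal sketch of how those defaults could be filled in, assuming the column names `offset` and `weight` used above. The helper name `add_default_columns` is hypothetical and not part of zoon, and the example df columns are only assumed to resemble zoon's occurrence data.

```r
# Hypothetical helper (not part of zoon): add the default offset and weight
# columns to df only when they are not already present
add_default_columns <- function(df) {
  if (!"offset" %in% names(df)) {
    df$offset <- rep(0, nrow(df))  # offset of 0 leaves the linear predictor unchanged
  }
  if (!"weight" %in% names(df)) {
    df$weight <- rep(1, nrow(df))  # weight of 1 treats every row equally
  }
  df
}

# Example with an assumed minimal occurrence data frame
occ <- data.frame(
  longitude = c(-1.2, 0.4, 2.1),
  latitude = c(51.7, 52.3, 50.9),
  value = c(1, 1, 0)
)
occ <- add_default_columns(occ)
```

Columns that a user or an upstream module has already supplied are left untouched, so only the missing ones get defaults.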

Note there is also the suggestion of adding columns for species (#272) and CRS (#285), and it's not immediately obvious (to me) whether these should be required columns in df or optional ones. I'm tagging this as a question as I'd appreciate thoughts on this.

As in #285, I think it would be worthwhile making all four of these optional first, then deciding whether to enforce them in the definition of df.

AugustT commented 8 years ago

I think it would be great if we didn't have to require these columns, as it takes the weight off the user a bit. Instead, if a module needs a column and it is not there, it can throw an error, i.e. if the weight column is absent: "error: I need a weight column".
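Something like the following could do that check inside a module. This is only a sketch: the function name `require_columns` and its arguments are hypothetical, not an existing zoon function.

```r
# Hypothetical check (not part of zoon): stop with a clear message if df
# lacks any column that the calling module depends on
require_columns <- function(df, cols, module = "this module") {
  missing_cols <- setdiff(cols, names(df))
  if (length(missing_cols) > 0) {
    stop(module, " needs column(s) missing from df: ",
         paste(missing_cols, collapse = ", "),
         call. = FALSE)
  }
  invisible(df)
}

# Example: df has no weight column, so the check raises an error,
# which is captured here so the message can be inspected
occ <- data.frame(value = c(1, 0), offset = c(0, 0))
msg <- tryCatch(
  require_columns(occ, c("offset", "weight"), module = "MyModelModule"),
  error = function(e) conditionMessage(e)
)
```

A module that needs no extra columns would simply never call the check, so users who don't use weights or offsets never have to think about them.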

goldingn commented 8 years ago

Cool, I think that's the right approach. Also for CRS. Should we still add species though?

It might be worth spelling these auxiliary columns out clearly in the vignettes too.