PIFSC-Protected-Species-Division / LTabundR

R package for design-based line-transect density estimation
https://pifsc-protected-species-division.github.io/LTabundR-vignette/
Creative Commons Zero v1.0 Universal

Handling missing covariates for detection function fitting #11

Closed: ericmkeen closed this issue 1 week ago

ericmkeen commented 3 months ago

How should we handle missing values in numeric covariates, such as Bft (Beaufort sea state)?

Currently, LTabundR removes sightings with missing covariate data from detection function fitting, but only if the relevant covariate is specified as a candidate. Then, in abundance estimation, those sightings are assigned the average effective strip width (ESW) from the region-year for which density is being estimated.

Is this approach acceptable?
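For readers following along, the fallback described in this comment can be sketched in base R. This is a minimal illustration under stated assumptions, not LTabundR's code: the `sightings` table, its column names, and the ESW values are all made up, and in a real analysis the ESW for complete rows would come from the fitted detection function.

```r
# Hypothetical sightings table; column names and values are illustrative only.
sightings <- data.frame(
  region = c("A", "A", "A", "B", "B"),
  year   = c(2010, 2010, 2010, 2010, 2010),
  Bft    = c(2, NA, 4, 3, NA),
  esw    = c(2.0, NA, 3.0, 2.4, NA)  # NA where the covariate is missing
)

# Mean ESW per region-year, computed from the sightings that have an ESW.
# ave() returns a vector aligned with the rows of `sightings`.
stratum_mean <- ave(
  sightings$esw, sightings$region, sightings$year,
  FUN = function(x) mean(x, na.rm = TRUE)
)

# Fall back to the region-year mean only where no ESW could be predicted.
missing_esw <- is.na(sightings$esw)
sightings$esw_used <- ifelse(missing_esw, stratum_mean, sightings$esw)

print(sightings)
```

Rows 2 and 5 lack Bft, so they receive the mean ESW of their own region-year; the other rows keep their own values.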

amandalbradford commented 3 months ago

While we generally don't want to exclude sightings from the focal survey from detection function estimation, if sightings are being used only for detection function estimation and are missing covariate data, we could just remove them. If we need to retain them for some reason (e.g., they are from the focal survey, although I suspect this would be very uncommon), what about trying to address/interpolate the missing covariate values instead of using the average ESW?

ericmkeen commented 1 week ago

As noted in this thread, our new approach to such questions is: never fill in data, but make it easier for users to find and address gaps in data.

Therefore, any sightings with missing data in covariate columns that will be referenced in detection function fitting will be removed from the dataset (both for detection function fitting and for abundance estimation). These changes occur within the code for lta().
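The removal rule above can be illustrated with a short base R sketch. It is not the code inside lta(); the `sightings` table, its columns, and the `covariates` vector are hypothetical names chosen for the example.

```r
# Hypothetical sightings table; column names are illustrative only.
sightings <- data.frame(
  id         = 1:5,
  Bft        = c(2, NA, 4, 3, 1),
  group_size = c(12, 7, NA, 21, 9),
  ship       = c("X", "Y", "X", NA, "Y")
)

# Only covariates that will be referenced in detection function fitting matter.
covariates <- c("Bft", "group_size")

# Keep rows that are complete in those columns; other columns are ignored.
keep <- complete.cases(sightings[, covariates, drop = FALSE])
sightings_used <- sightings[keep, ]

print(sightings_used$id)
```

Row 4 is kept even though `ship` is missing, because `ship` is not among the covariates being referenced in this example.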

Such rows with missing covariates will be flagged in the new function, lta_checks(), so that these issues can be fixed if desired before running lta() or similar functions.
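As a rough idea of what such flagging could look like, here is a base R sketch. The helper `flag_missing_covariates()` is hypothetical and written only for this example; it is not lta_checks() and makes no claim about that function's arguments or output.

```r
# Hypothetical helper: report which rows have missing values in which covariates.
flag_missing_covariates <- function(df, covariates) {
  # Logical matrix: TRUE where a covariate value is missing.
  is_missing <- is.na(df[, covariates, drop = FALSE])
  rows <- unname(which(rowSums(is_missing) > 0))
  data.frame(
    row = rows,
    missing = vapply(
      rows,
      function(i) paste(covariates[is_missing[i, ]], collapse = ", "),
      character(1)
    ),
    stringsAsFactors = FALSE
  )
}

# Illustrative data; column names are assumptions.
sightings <- data.frame(
  id         = 1:5,
  Bft        = c(2, NA, 4, 3, NA),
  group_size = c(12, 7, NA, 21, NA)
)

flags <- flag_missing_covariates(sightings, c("Bft", "group_size"))
print(flags)
```

A report like this lets the user decide whether to repair the flagged rows before fitting, rather than having values filled in silently.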

The change has been applied; I am working on lta_checks() and QA/QC now.