BiologicalRecordsCentre / sparta

Species Presence/Absence R Trends Analyses
http://biologicalrecordscentre.github.io/sparta/index.html
MIT License
region_aggs = "named list()" #220

Closed drnickisaac closed 2 weeks ago

drnickisaac commented 3 years ago

I've just come across a curious case where the region_aggs object has become corrupted somehow. It's an example from the new Caddisfly (Trichoptera) runs; in fact, it is the final species:

datadir <- "/data-s3/occmods/Trichoptera/occmod_outputs/2021_Ellcur"
modfiles <- list.files(datadir)
mod <- readRDS(file.path(datadir, modfiles[length(modfiles)]))
mod$regions
mod$region_aggs

You'll see from running this code that there are only two regions, "WALES" and "South.West", reflecting the narrow distribution of this species. The region_aggs object was originally defined as c("ENGLAND", "GB"), but these have been lost. This looks like unexpected behaviour from this piece of code: https://github.com/BiologicalRecordsCentre/sparta/blob/6c8c993887523545cb13f6f878d0e7c70f149a6a/R/occDetFunc.r#L500

AugustT commented 3 years ago

Is it the 'any' on line 492? Should that be 'all'? I.e. remove a region aggregate if all of its regions are zero, not if only one is zero?

DylanCarbone commented 4 months ago

Hi @AugustT,

Are you referencing this line:

https://github.com/BiologicalRecordsCentre/sparta/blob/2337f64266f01d8d91608a9b747b15eca1f45cc6/R/occDetFunc.r#L553C1-L555C57

I can't reproduce this error without the data, but the section does seem to remove aggregates in which any region has zero sites, not only those in which all regions do.

AugustT commented 4 months ago

Hi Dylan, this was so long ago that I am not sure. It looks like the code has changed a bit since the issue was raised. I also don't have access to the data to replicate the issue, but maybe you could simulate some data to replicate the conditions?
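
One way to simulate the conditions is sketched below. It assumes a site-by-region table of 0/1 codes with the site name in the first column; the site names, the extra regions "North" and "SCOTLAND", and the membership of each aggregate are all hypothetical and are not taken from the Trichoptera run. The sketch only builds the inputs and reports which aggregates contain an empty region; it does not call any sparta function.

```r
# Hypothetical site-by-region table: every site falls in WALES or South.West,
# so the remaining two regions end up with no sites at all.
sites <- paste0("site", 1:10)
regional_codes <- data.frame(
  site       = sites,
  WALES      = rep(c(1, 0), each = 5),
  South.West = rep(c(0, 1), each = 5),
  North      = 0,
  SCOTLAND   = 0
)

# Hypothetical aggregate definitions (membership is illustrative only).
region_aggs <- list(
  ENGLAND = c("South.West", "North"),
  GB      = c("WALES", "South.West", "North", "SCOTLAND")
)

# Count sites per region and pick out the regions with none.
sites_per_region <- colSums(regional_codes[, -1])
empty_regions <- names(sites_per_region)[sites_per_region == 0]

# Aggregates that include at least one empty region.
touches_empty <- vapply(
  region_aggs,
  function(members) any(members %in% empty_regions),
  logical(1)
)
affected_aggs <- names(region_aggs)[touches_empty]

print(empty_regions)
print(affected_aggs)
```

With a narrowly distributed species, every aggregate is likely to include at least one empty region, which is the situation in which the original report lost both aggregates.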

DylanCarbone commented 4 months ago

Hi both,

I can replicate the issue, and it's caused by the parameter rem_aggs_with_missing_regions defaulting to TRUE. The function therefore removes region aggregates that contain ANY region with no sites, rather than only those in which ALL regions have no sites. I've changed the default to FALSE, and I've cleaned the code up slightly in the relevant section where the function filters region aggregates.
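
The difference between the two rules can be illustrated with a minimal sketch. This is not the code in occDetFunc.r: filter_aggs and its drop_if argument are hypothetical names invented for the illustration, and the site counts and aggregate membership are made up.

```r
# Hypothetical site counts per region; two regions have no sites.
sites_per_region <- c(WALES = 12, South.West = 5, North = 0, SCOTLAND = 0)

# Hypothetical aggregate definitions (membership is illustrative only).
region_aggs <- list(
  ENGLAND = c("South.West", "North"),
  GB      = c("WALES", "South.West", "North", "SCOTLAND")
)

# Hypothetical helper, not a sparta function: drop an aggregate when
# "any" or "all" of its member regions have zero sites.
filter_aggs <- function(aggs, sites, drop_if = c("any", "all")) {
  drop_if <- match.arg(drop_if)
  test <- if (drop_if == "any") any else all
  keep <- vapply(
    aggs,
    function(members) !test(sites[members] == 0),
    logical(1)
  )
  aggs[keep]
}

strict  <- filter_aggs(region_aggs, sites_per_region, drop_if = "any")
lenient <- filter_aggs(region_aggs, sites_per_region, drop_if = "all")

print(strict)   # every aggregate contains an empty region, so none survive
print(lenient)  # every aggregate has an occupied region, so both are kept
```

Under the "any" rule the result is an empty list that still carries a names attribute, which R prints as named list(), matching the issue title. Under the "all" rule both aggregates are retained.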

I will include these changes in the next pull request.

Thanks