trias-project / occ-cube-alien

🗺 Occurrence cubes for non-native taxa in Belgium and Europe
MIT License

Occurrences from new zenodo cube all over the place #44

Closed SanderDevisscher closed 9 months ago

SanderDevisscher commented 10 months ago

Observations of Lithobates catesbeianus for the year 2018:

vs

When I look at be_alientaxa_cube.csv from Zenodo, the number of occupied grid cells for Lithobates catesbeianus appears to be inflated compared to the previous version.

library(readr)
library(dplyr)

# Column specification shared by both cube versions
cube_col_types <- cols(
  year = col_double(),
  eea_cell_code = col_character(),
  taxonKey = col_double(),
  n = col_double(),
  min_coord_uncertainty = col_double()
)

# Newer cube (Zenodo record 10058400)
df_new <- read_csv(
  file = "https://zenodo.org/records/10058400/files/be_alientaxa_cube.csv?download=1",
  col_types = cube_col_types,
  na = ""
)

# Rows for the taxon discussed here (Lithobates catesbeianus)
test_new <- df_new %>% filter(taxonKey == 2427091)

# Older cube (Zenodo record 5819028)
df_old <- read_csv(
  file = "https://zenodo.org/records/5819028/files/be_alientaxa_cube.csv?download=1",
  col_types = cube_col_types,
  na = ""
)

# Same taxon in the older cube, kept under a separate name so both can be compared
test_old <- df_old %>% filter(taxonKey == 2427091)
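To put a number on the difference, one could count the distinct grid cells per year in each version. This is only a minimal sketch: the helper name `count_cells_per_year()` and the toy data frames `cube_old` and `cube_new` are hypothetical and stand in for the two Zenodo files.

```r
library(dplyr)

# Hypothetical helper: distinct occupied cells and total occurrences
# per year for one taxon in a cube.
count_cells_per_year <- function(cube, taxon_key) {
  cube %>%
    filter(taxonKey == taxon_key) %>%
    group_by(year) %>%
    summarise(
      n_cells = n_distinct(eea_cell_code),
      n_occs = sum(n),
      .groups = "drop"
    )
}

# Toy data (invented for illustration, not taken from the real cubes)
cube_old <- data.frame(
  year = c(2018, 2018),
  eea_cell_code = c("1kmE3924N3102", "1kmE3925N3102"),
  taxonKey = 2427091,
  n = c(3, 2)
)
cube_new <- data.frame(
  year = rep(2018, 5),
  eea_cell_code = c("1kmE3924N3102", "1kmE3950N3120", "1kmE3890N3075",
                    "1kmE4010N3140", "1kmE3975N3060"),
  taxonKey = 2427091,
  n = rep(1, 5)
)

# Side-by-side comparison: same number of occurrences, different number of cells
comparison <- full_join(
  count_cells_per_year(cube_old, 2427091),
  count_cells_per_year(cube_new, 2427091),
  by = "year",
  suffix = c("_old", "_new")
)
print(comparison)
```

If the occurrence totals match between versions while the cell counts diverge strongly, that points at the cell assignment rather than at the underlying occurrence data.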

PS: I'm not certain this issue belongs here!

SanderDevisscher commented 10 months ago

This issue is probably not limited to Lithobates catesbeianus.

qgroom commented 10 months ago

Something has gone wrong! I think it should be posted here: https://github.com/trias-project/occ-cube-alien

I think this is a good example of why the cube should have a link back in each row to the occurrence ID that was actually used. The current lack of provenance makes debugging such errors much more difficult.
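As an illustration of what such provenance could look like, the aggregation step might carry the contributing occurrence IDs along with each cube row. This is a sketch under assumptions, not the project's pipeline: the column names `occurrence_id` and `occurrence_ids`, the `|` separator, and the toy data are all hypothetical.

```r
library(dplyr)

# Toy occurrences already assigned to grid cells (invented for illustration)
occs <- data.frame(
  occurrence_id = c("occ-1", "occ-2", "occ-3"),
  year = c(2018, 2018, 2018),
  eea_cell_code = c("1kmE3924N3102", "1kmE3924N3102", "1kmE3930N3110"),
  taxonKey = c(2427091, 2427091, 2427091)
)

# Aggregate to cube rows while keeping the IDs that contributed to each row
cube_with_provenance <- occs %>%
  group_by(year, eea_cell_code, taxonKey) %>%
  summarise(
    n = n(),
    occurrence_ids = paste(occurrence_id, collapse = "|"),
    .groups = "drop"
  )
print(cube_with_provenance)
```

A delimited string keeps the cube in a single flat file; for large cubes a separate lookup table keyed on year, cell, and taxon may be the more practical design.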

damianooldoni commented 10 months ago

@SanderDevisscher: thanks for reporting this. Sorry for not having had time to check this properly. I will try to find time this week, probably on Wednesday. My first thought: the random assignment algorithm used for generating the cube can change the number of occupied cells if the occurrences have a high coordinate uncertainty. I will also try to move the issue as suggested by @qgroom.
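The effect of coordinate uncertainty on random assignment can be sketched as follows. This is an illustration only and not the code used in the pipeline: it assumes coordinates in metres in a projected CRS (ETRS89-LAEA, EPSG:3035, is assumed here), a uniform draw within the uncertainty circle, and a `1kmE…N…` cell code built from the kilometre coordinates of the cell's lower-left corner. The function name `assign_cell()` is hypothetical.

```r
# Hypothetical sketch: draw a random point within the uncertainty circle
# around (x, y) and return the code of the 1 km cell that contains it.
# x, y: coordinates in metres (assumed EPSG:3035)
# uncertainty: coordinate uncertainty in metres
assign_cell <- function(x, y, uncertainty) {
  r <- uncertainty * sqrt(runif(1))  # sqrt gives a uniform density over the disc
  theta <- runif(1, 0, 2 * pi)
  x_new <- x + r * cos(theta)
  y_new <- y + r * sin(theta)
  paste0("1kmE", floor(x_new / 1000), "N", floor(y_new / 1000))
}

set.seed(42)
# Same location assigned 100 times with zero and with 5 km uncertainty
precise <- replicate(100, assign_cell(3924500, 3102500, uncertainty = 0))
coarse <- replicate(100, assign_cell(3924500, 3102500, uncertainty = 5000))

length(unique(precise))  # a single cell when there is no uncertainty
length(unique(coarse))   # several cells when the uncertainty exceeds the cell size
```

With zero uncertainty the occurrence always lands in the same cell, whereas a 5 km uncertainty spreads repeated draws over many 1 km cells, so two runs of the pipeline can legitimately differ in the cells they report for imprecise records.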

damianooldoni commented 10 months ago

I have only now seen the screenshots. Sorry! It does indeed look very strange. I will check it on Wednesday for sure!

SanderDevisscher commented 10 months ago

I would move the issue, but I cannot 😞

damianooldoni commented 10 months ago

Yes, @SanderDevisscher. I can reproduce the unexpected behavior you encountered. It is probably a random assignment issue, as it also happens for years with only one occurrence.
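Years with a single occurrence are a convenient test case, because the old and new cubes can be compared cell by cell. A minimal sketch of such a check, with a hypothetical helper `single_occurrence_rows()` and invented toy data in place of the real cubes:

```r
library(dplyr)

# Hypothetical helper: keep only rows from years in which a taxon
# has exactly one occurrence in the cube.
single_occurrence_rows <- function(cube) {
  cube %>%
    group_by(taxonKey, year) %>%
    filter(sum(n) == 1) %>%
    ungroup()
}

# Toy data (invented for illustration)
cube_old <- data.frame(
  year = c(2001, 2005, 2018, 2018),
  eea_cell_code = c("1kmE3924N3102", "1kmE3930N3110",
                    "1kmE3924N3102", "1kmE3925N3102"),
  taxonKey = 2427091,
  n = c(1, 1, 3, 2),
  min_coord_uncertainty = c(100, 50, 10, 10)
)
cube_new <- data.frame(
  year = c(2001, 2005, 2018, 2018),
  eea_cell_code = c("1kmE3924N3102", "1kmE3988N3150",
                    "1kmE3924N3102", "1kmE3925N3102"),
  taxonKey = 2427091,
  n = c(1, 1, 3, 2),
  min_coord_uncertainty = c(100, 50, 10, 10)
)

# Single-occurrence years whose cell differs between the two versions
moved <- inner_join(
  single_occurrence_rows(cube_old),
  single_occurrence_rows(cube_new),
  by = c("taxonKey", "year"),
  suffix = c("_old", "_new")
) %>%
  filter(eea_cell_code_old != eea_cell_code_new)
print(moved)
```

Keeping `min_coord_uncertainty` in the output helps judge whether a shift is plausible: a small uncertainty combined with a jump of many kilometres would be hard to attribute to random assignment alone.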

Note that the problem is limited to two cubes: the Belgian cube in occ-cube-alien (linked to the Zenodo version you mention) and the Belgian cube produced in the occ-cube repo (linked to https://zenodo.org/records/10074895), which starts from the same interim SQLite database.

In other words, there are no issues with the species-level cubes of other countries, nor with the cube of modelling species at European level contained in the same Zenodo package as the Belgian one. So the code is correct. I remember I had some problems while running the 3_assign_grid.R pipeline (my laptop stopped), and instead of starting again from scratch I tried to fix them directly in the SQLite database to save time. I am sorry for this. I will rebuild the BE cube from scratch and update the Zenodo repository afterwards. No change in code should be required.

damianooldoni commented 9 months ago

@SanderDevisscher: the issue was closed automatically. I checked the result myself before pushing to main and publishing a new version on Zenodo (see the 20240118 version), but I would like you to double-check it. Thanks.

SanderDevisscher commented 9 months ago

I will