trias-project / occ-cube-alien

🗺 Occurrence cubes for non-native taxa in Belgium and Europe
MIT License

Add create_db.Rmd #5

Closed damianooldoni closed 5 years ago

damianooldoni commented 5 years ago

This PR adds the pipeline create_db.Rmd, which creates a SQLite database file with two tables:

  1. occ_be_all: an exact copy of the text file with GBIF occurrences, with ALL columns
  2. occ_be: a subset of columns AND occurrences filtered on occurrenceStatus and issues

The reason for having two tables is that there is no way to DROP columns in SQLite. Moreover, removing the big table after creating the smaller one would mean running the VACUUM command, which takes a lot of disk space and computation time.
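As a rough illustration of the two-table layout (not the actual code in create_db.Rmd, which is written in R), here is a minimal sketch using Python's built-in sqlite3 module. The column list, the sample rows, and the filter values ('ABSENT', 'ZERO_COORDINATE') are hypothetical and only stand in for the real GBIF columns and filter rules.

```python
import sqlite3

# In-memory database for illustration; the pipeline writes a file instead.
conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the full GBIF occurrence text file (ALL columns).
conn.execute(
    """
    CREATE TABLE occ_be_all (
        gbifID INTEGER,
        scientificName TEXT,
        occurrenceStatus TEXT,
        issues TEXT,
        recordedBy TEXT
    )
    """
)
rows = [
    (1, "Taxon A", "PRESENT", "", "observer 1"),
    (2, "Taxon B", "ABSENT", "", "observer 2"),
    (3, "Taxon C", "PRESENT", "ZERO_COORDINATE", "observer 3"),
]
conn.executemany("INSERT INTO occ_be_all VALUES (?, ?, ?, ?, ?)", rows)

# Second table: a subset of columns AND filtered rows, built in one statement.
# The filter values are assumptions made for this sketch.
conn.execute(
    """
    CREATE TABLE occ_be AS
    SELECT gbifID, scientificName, occurrenceStatus, issues
    FROM occ_be_all
    WHERE occurrenceStatus != 'ABSENT'
      AND issues NOT LIKE '%ZERO_COORDINATE%'
    """
)
conn.commit()

occ_be_columns = [r[1] for r in conn.execute("PRAGMA table_info(occ_be)")]
occ_be_ids = [r[0] for r in conn.execute("SELECT gbifID FROM occ_be ORDER BY gbifID")]
all_count = conn.execute("SELECT COUNT(*) FROM occ_be_all").fetchone()[0]
```

The big table is left untouched, so no VACUUM is needed; the cost is that the database file holds both tables.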

Once the bug in inborutils::csv_to_sqlite() is solved (see https://github.com/inbo/inborutils/issues/42), I can use a single table and filter occurrences with DELETE. For now, I think we can accept this solution and move on.
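A minimal sketch of that single-table alternative, again in Python's sqlite3 rather than the R used in the pipeline. It assumes the table already contains only the columns of interest; the column names, sample rows, and filter values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single table that already holds only the needed columns.
conn.execute(
    "CREATE TABLE occ_be (gbifID INTEGER, occurrenceStatus TEXT, issues TEXT)"
)
conn.executemany(
    "INSERT INTO occ_be VALUES (?, ?, ?)",
    [
        (1, "PRESENT", ""),
        (2, "ABSENT", ""),
        (3, "PRESENT", "ZERO_COORDINATE"),
    ],
)

# Filter in place: delete the rows that should not be kept.
# The filter values are assumptions made for this sketch.
cur = conn.execute(
    """
    DELETE FROM occ_be
    WHERE occurrenceStatus = 'ABSENT'
       OR issues LIKE '%ZERO_COORDINATE%'
    """
)
deleted = cur.rowcount
conn.commit()

remaining_ids = [r[0] for r in conn.execute("SELECT gbifID FROM occ_be ORDER BY gbifID")]
```

Note that DELETE alone does not shrink the database file; the freed pages are reused by later inserts unless VACUUM is run.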

damianooldoni commented 5 years ago

I am closing this PR as we have started working on PR #6.