As mentioned by @damianooldoni in riparias/gbif-alert#40, data filtering should happen as much as possible at the "GBIF download" stage rather than in the import script itself.
We should review the existing filtering (records without at least a year, a location, ...) and the planned filtering (absence records, ...), and see whether it can be pushed to the "generate GBIF download" step.
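To make the idea concrete, here is a rough sketch of what pushing those filters into the download request could look like. It assumes the download is requested through the GBIF occurrence download API with a JSON predicate; the function name `build_download_predicate` and the taxon keys are hypothetical, and the exact predicate keys should be checked against the GBIF documentation before relying on them.

```python
import json


def build_download_predicate(taxon_keys):
    """Build a GBIF download predicate that mirrors the import-side filters.

    Hypothetical helper: the predicate keys below are assumptions based on
    the GBIF occurrence download API and should be verified.
    """
    return {
        "type": "and",
        "predicates": [
            # Restrict the download to the species we care about.
            {"type": "in", "key": "TAXON_KEY", "values": [str(k) for k in taxon_keys]},
            # Existing filter: records must have a location.
            {"type": "equals", "key": "HAS_COORDINATE", "value": "true"},
            # Existing filter: records must have at least a year.
            {"type": "isNotNull", "parameter": "YEAR"},
            # Planned filter: drop absence records.
            {"type": "equals", "key": "OCCURRENCE_STATUS", "value": "PRESENT"},
        ],
    }


if __name__ == "__main__":
    # Placeholder taxon keys, for illustration only.
    print(json.dumps(build_download_predicate([1111111, 2222222]), indent=2))
```

With something like this, the import script would only need to double-check what the download already guarantees.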
Some related things we should think about:
- Should we make the import script more defensive in case a user attempts to import a "less filtered" DwC-A file via the `--source-dwca` option of the `import_occurrences` management command?
- Should we keep a more detailed log/report in the `DataImport` model (i.e. list exactly which records were skipped and why)?
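As a starting point for discussion, both points above could be covered by a small check that returns a skip reason per record. This is only a sketch: it assumes each record is available as a dict keyed by Darwin Core term names (`year`, `decimalLatitude`, `decimalLongitude`, `occurrenceStatus`, `gbifID`), and the function names `skip_reason` and `partition_records` are hypothetical, not part of the current import script.

```python
from typing import Dict, List, Optional, Tuple


def _blank(record: Dict[str, Optional[str]], term: str) -> bool:
    """True when the term is missing, None, or only whitespace."""
    return not (record.get(term) or "").strip()


def skip_reason(record: Dict[str, Optional[str]]) -> Optional[str]:
    """Return why a record should be skipped, or None if it can be imported."""
    if _blank(record, "year"):
        return "missing year"
    if _blank(record, "decimalLatitude") or _blank(record, "decimalLongitude"):
        return "missing coordinates"
    if (record.get("occurrenceStatus") or "").strip().upper() == "ABSENT":
        return "absence record"
    return None


def partition_records(
    records: List[Dict[str, Optional[str]]],
) -> Tuple[List[Dict[str, Optional[str]]], List[Tuple[str, str]]]:
    """Split records into importable ones and (gbifID, reason) skip entries."""
    kept, skipped = [], []
    for record in records:
        reason = skip_reason(record)
        if reason is None:
            kept.append(record)
        else:
            skipped.append((record.get("gbifID") or "unknown", reason))
    return kept, skipped
```

The `skipped` list is the kind of detail that could be stored alongside the `DataImport` entry, so a user importing a less filtered archive can see exactly what was dropped and why.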