gbif / pipelines

Pipelines for data processing (GBIF and LivingAtlases)
Apache License 2.0

Adapt clustering to handle event date ranges #1030

Closed: MattBlissett closed this 4 months ago

MattBlissett commented 5 months ago

Clustering reads the event date from the HDFS table. That value may now be an interval (a date range) rather than a single date.

Fixing the SQL to read a date would be easy, but checking that it still works well (finds the correct clusters) deserves more effort. If a quick fix is needed, reading eventDateGte would probably restore the previous behaviour.
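
To make the two options concrete, here is a minimal Python sketch. It is not the pipelines code, which reads the HDFS table through SQL. It assumes the interval is serialized as an ISO 8601 `start/end` string with full `YYYY-MM-DD` dates, which is an assumption about the stored format. The helper names `parse_event_date`, `lower_bound` and `ranges_overlap` are hypothetical. `lower_bound` mirrors the quick fix (use only the start of the range, roughly what reading eventDateGte would give), while `ranges_overlap` is one possible way to compare two records when either may carry a range.

```python
from datetime import date
from typing import Optional, Tuple


def parse_event_date(value: str) -> Optional[Tuple[date, date]]:
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD' into (start, end).

    Assumption: intervals use '/' as the separator. A single date is
    treated as a one-day range. Anything unparseable yields None.
    """
    if not value:
        return None
    parts = value.split("/")
    if len(parts) > 2:
        return None
    try:
        # Keep only the date portion in case a time component follows.
        bounds = [date.fromisoformat(p.strip()[:10]) for p in parts]
    except ValueError:
        return None
    start, end = bounds[0], bounds[-1]
    if end < start:
        return None
    return start, end


def lower_bound(value: str) -> Optional[date]:
    """Quick fix: use only the start of the range."""
    parsed = parse_event_date(value)
    return parsed[0] if parsed else None


def ranges_overlap(a: str, b: str) -> bool:
    """Fuller check: two event dates match if their ranges intersect."""
    pa, pb = parse_event_date(a), parse_event_date(b)
    if pa is None or pb is None:
        return False
    return pa[0] <= pb[1] and pb[0] <= pa[1]
```

With the quick fix, a record dated `2021-05-01/2021-05-31` and one dated `2021-05-15` would compare as different dates, whereas the overlap check treats them as compatible. Whether overlap is the right rule for clustering is exactly the part that needs checking against real clusters.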