gbif / pipelines

Pipelines for data processing (GBIF and LivingAtlases)
Apache License 2.0

`livingatlas` Change DwCA to verbatim to produce multiple shards #1008

Open djtfmartin opened 8 months ago

djtfmartin commented 8 months ago

The dwca-avro step currently produces a single AVRO file, which is proving problematic for large datasets.
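
To illustrate the idea of multiple shards, here is a minimal sketch in plain Java. It is not the pipelines code and it does not use the Avro or Beam APIs: the class `ShardPlanner`, the methods `partition` and `shardName`, the `verbatim` prefix, and the shard size are all hypothetical names chosen for illustration. It only shows how a record list could be split into fixed-size chunks and how each chunk could get its own file name (the `-SSSSS-of-NNNNN` pattern is borrowed from the naming style commonly seen in Beam output, as an assumption).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ShardPlanner {

    // Splits records into consecutive chunks of at most maxPerShard elements.
    // Each chunk stands in for the contents of one output file (hypothetical).
    public static <T> List<List<T>> partition(List<T> records, int maxPerShard) {
        if (maxPerShard <= 0) {
            throw new IllegalArgumentException("maxPerShard must be positive");
        }
        List<List<T>> shards = new ArrayList<>();
        for (int start = 0; start < records.size(); start += maxPerShard) {
            int end = Math.min(start + maxPerShard, records.size());
            // Copy the sub-list so each shard is independent of the input list.
            shards.add(new ArrayList<>(records.subList(start, end)));
        }
        return shards;
    }

    // Assumed naming scheme: <prefix>-SSSSS-of-NNNNN.avro, zero-padded to 5 digits.
    public static String shardName(String prefix, int index, int total) {
        return String.format(Locale.ROOT, "%s-%05d-of-%05d.avro", prefix, index, total);
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            records.add(i);
        }
        // 10 records with at most 4 per shard gives shards of size 4, 4 and 2.
        List<List<Integer>> shards = partition(records, 4);
        for (int i = 0; i < shards.size(); i++) {
            System.out.println(
                shardName("verbatim", i, shards.size()) + " -> " + shards.get(i).size() + " records");
        }
    }
}
```

In a real implementation the chunking would more likely be left to the runner or the Avro writer configuration rather than done by hand; the sketch is only meant to make the single-file versus multi-shard difference concrete.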