Open romulogoncalves opened 6 years ago
Another tuning point is to have one task per executor/worker. To do that, we need to set each task to take as many cores as the worker has; for that we use `spark.task.cpus`:

> `spark.task.cpus` (default: `1`) — Number of cores to allocate for each task.
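For example, on 8-core workers (the core count here is an assumption; adjust to your hardware), a `spark-defaults.conf` fragment like the following would make every task claim a whole executor:

```properties
# Assumed 8-core workers; change 8 to match your machines.
spark.executor.cores 8
# Each task claims all of the executor's cores, so only one task
# runs per executor at a time.
spark.task.cpus 8
```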
Another solution is to create one partition per LAS/LAZ file. An executor will then only run one partition at a time.
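A minimal sketch of that idea, assuming the file list is distributed with a plain `SparkContext` (the actual GeoTrellis reader may differ): passing the list's length as the number of slices to `parallelize` yields exactly one file per partition.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object OnePartitionPerFile {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("one-partition-per-file"))

    // Hypothetical list of LAZ paths; in practice this would come
    // from listing the input directory or bucket.
    val lazFiles = Seq("s3://bucket/a.laz", "s3://bucket/b.laz", "s3://bucket/c.laz")

    // One slice per file => one partition per LAS/LAZ file, so an
    // executor running one task at a time processes one file at a time.
    val filesRdd = sc.parallelize(lazFiles, numSlices = lazFiles.length)

    println(filesRdd.getNumPartitions) // equals lazFiles.length

    sc.stop()
  }
}
```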
Geotrellis point cloud uses the system's `/tmp/` directory to download the LAZ files needed to execute a pipeline. Before we reduce the number of files to be downloaded, we need to set this path to something else.

Solution:
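One possible workaround (an assumption on my side, not necessarily the intended solution) is to redirect the JVM's temporary directory to a larger scratch disk, assuming the download path is derived from `java.io.tmpdir`. The scratch path and jar name below are placeholders:

```shell
# Assumes the LAZ downloads honor java.io.tmpdir; point it at a
# larger scratch disk on both the driver and the executors.
spark-submit \
  --conf "spark.driver.extraJavaOptions=-Djava.io.tmpdir=/data/scratch/tmp" \
  --conf "spark.executor.extraJavaOptions=-Djava.io.tmpdir=/data/scratch/tmp" \
  my-pipeline.jar
```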