This change associates each InputSplit with the node name from the hostfile. Combined with a very high value for spark.locality.wait, this ensures that Spark respects data locality when scheduling tasks, so that every node is queried in a setup where the GenomicsDB array has been partitioned across multiple nodes.
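
To illustrate the idea, here is a toy sketch in Python. It is not the code in this change: `ToySplit`, `parse_hostfile`, and `splits_from_hostfile` are hypothetical names, and the hostfile format (one node name per line) and the one-partition-per-host mapping are assumptions made for the example. It only shows how a split that reports its node name gives the scheduler a locality preference, in the same role that Hadoop's `InputSplit.getLocations()` plays.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ToySplit:
    """Hypothetical stand-in for an input split covering one array partition."""
    partition: int
    host: str

    def get_locations(self) -> List[str]:
        # Plays the role of Hadoop's InputSplit.getLocations(): the hosts
        # on which the scheduler should prefer to run this split's task.
        return [self.host]


def parse_hostfile(text: str) -> List[str]:
    """Assumed format: one node name per line; blank lines and '#' comments are skipped."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            hosts.append(line)
    return hosts


def splits_from_hostfile(text: str) -> List[ToySplit]:
    # Assumption: partition i of the array lives on the i-th host listed.
    return [ToySplit(i, host) for i, host in enumerate(parse_hostfile(text))]


# Hypothetical three-node setup.
HOSTFILE = """\
# one node name per line
node0
node1
node2
"""

splits = splits_from_hostfile(HOSTFILE)
for split in splits:
    print(split.partition, split.get_locations())
```

On the Spark side, the locality preference only holds if the scheduler waits long enough for the preferred node instead of falling back to another one. That is what the large spark.locality.wait is for, for example `--conf spark.locality.wait=3600s` on spark-submit (the value here is illustrative, not taken from this change).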