Open rbavery opened 4 months ago
A Spark executor would be a great addition. I just added some notes about implementing a new executor in #498, if you're interested in having a go at this, @rbavery?
I'm definitely interested, thanks for adding notes. It's possible I won't make quick (or any) progress because of other responsibilities 😬
Could Spark be added as a supported executor?
Maybe `RDD.map` or `RDD.mapPartitions` would be the right way to map a function, similar to `map_unordered` in the Lithops executor: https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.RDD.mapPartitions.html#pyspark.RDD.mapPartitions
To support this, a guess would need to be made up front about the memory reserved for Python UDFs. It sounds like this would currently be done globally, but maybe later it could be done on a per-operator basis?
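For reference, Spark does expose settings for reserving memory for the Python worker processes up front, e.g. `spark.executor.memory` and `spark.executor.pyspark.memory`, but they are fixed when the session is created, so this would indeed be a global setting rather than per-operator. A minimal sketch, with purely illustrative values and application name:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")                     # memory settings mainly matter on a real cluster
    .appName("cubed-spark-executor-sketch")  # hypothetical app name
    # JVM heap per executor
    .config("spark.executor.memory", "4g")
    # Memory set aside for the Python workers that run the UDFs (Spark 2.4+);
    # illustrative value, would need to be derived from Cubed's projected per-task memory
    .config("spark.executor.pyspark.memory", "2g")
    .getOrCreate()
)
```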