This issue discusses how to expose an interface in Larky for parallel processing of data. Larky is currently single-threaded, but many batch-processing files lend themselves to parallelism.
The interface for multiprocessing.map would be something like multiprocessing.map(iterator, transformer), where transformer is a lambda that takes each element along with the ctx and returns the output of the transform.
Assume the input is a stream-like object for SFTP files or an HTTP object for HTTP requests.
multiprocessing.map would be an interface to an execution framework such as Spark, which would execute the lambda using the number of processes that the customer has provisioned.
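
To make the proposed shape concrete, here is a minimal sketch in plain Python rather than Larky. Everything in it is an assumption for discussion: the name parallel_map, the ctx and num_workers parameters, and the use of a thread pool as a stand-in for the real execution framework (Spark or otherwise). How ctx actually reaches the transformer is not decided in this issue; the sketch simply passes it as a second argument.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(iterator, transformer, ctx=None, num_workers=4):
    """Hypothetical stand-in for the proposed multiprocessing.map.

    iterator:    any iterable of elements (e.g. lines read from a stream)
    transformer: callable taking (element, ctx) and returning the output
    ctx:         context handed to every transformer call (assumed shape)
    num_workers: stands in for the number of provisioned processes
    """
    # A thread pool is used here only for illustration; the real backend
    # would be whatever execution framework the interface delegates to.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(lambda element: transformer(element, ctx), iterator))


# Example: upper-case the first field of each record in a small "file".
lines = iter(["alice,42", "bob,7", "carol,19"])
ctx = {"delimiter": ","}

result = parallel_map(
    lines,
    lambda element, ctx: element.split(ctx["delimiter"])[0].upper(),
    ctx=ctx,
    num_workers=2,
)
print(result)
```

Results come back in input order in this sketch; whether the real interface should guarantee ordering, or return a lazy iterator instead of a list, is an open design question.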