This PR closes #46 and #47 by documenting which platforms can execute parallel ETL flows implemented with pygrametl. Currently, pygrametl supports executing parallel ETL flows using Jython, and using CPython on platforms that start new processes using `fork`. Thus, executing a parallel ETL flow natively on Microsoft Windows using CPython is not supported, and macOS must be configured to use `fork` by calling `multiprocessing.set_start_method('fork')` due to the issues with macOS's `fork` implementation documented in CPython Issue 77906 (Thanks to @mFeigeInvia). An attempt to support `spawn` was made; however, it became clear that this would require major changes to pygrametl. This is primarily due to limitations of `pickle` and the additional requirements imposed by `spawn` and `forkserver` compared to `fork`. As CPython generally does not perform well when executing parallel ETL flows compared to Jython, @chrthomsen, @fromm1990, and I agreed to prioritize other improvements to pygrametl.
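For readers on macOS, the configuration step mentioned above can be sketched as follows. This is a minimal illustration using only the standard library, not code from pygrametl: the helper name `configure_fork_start_method` is hypothetical, and passing `force=True` is an assumption made so the call also succeeds when a start method has already been selected.

```python
import multiprocessing
import sys


def configure_fork_start_method():
    """Select the 'fork' start method if the platform offers it.

    Hypothetical helper, not part of pygrametl.
    Returns True if 'fork' was selected and False if it is unavailable.
    """
    if "fork" not in multiprocessing.get_all_start_methods():
        # For example, CPython running natively on Microsoft Windows.
        return False
    # force=True (an assumption of this sketch) overrides a start method
    # that was already set earlier in the process.
    multiprocessing.set_start_method("fork", force=True)
    return True


if __name__ == "__main__":
    if not configure_fork_start_method():
        sys.exit("The 'fork' start method is not available on this platform.")
    # A parallel ETL flow would be defined and started after this point.
    print(multiprocessing.get_start_method())
```

The start method should be set once, early in the main module, before any processes are created.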