EntilZha / PyFunctional

Python library for creating data pipelines with chain functional programming
http://pyfunctional.pedro.ai
MIT License
2.41k stars, 132 forks

lazy_parallelize having trouble with function context? #164

Open larroy opened 3 years ago

larroy commented 3 years ago

I'm passing a function defined in the current file to pseq, and it seems to error out because it cannot find other functions it references, or even simple types like Dict. The same code works fine with seq.

I think the problem is with pickling the target function in lazy_parallelize:

    partitions = split_every(partition_size, iter(result))
    packed_partitions = (pack(func, (partition,)) for partition in partitions)
    for pool_result in pool.imap(unpack, packed_partitions):
        yield pool_result
    pool.terminate()

I ran the function with pool.imap myself and it works fine.

Wouldn't it be better not to use pickling, to avoid these kinds of problems?
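
For context on why calling pool.imap directly can behave differently: multiprocessing sends tasks to workers with the standard pickle module, and standard pickle stores a module-level function by reference (module name plus function name) rather than by value. The sketch below only illustrates that by-reference behavior, using json.dumps as a stand-in function. It does not confirm the root cause of the error reported here.

```python
import json
import pickle

# Standard pickle records only where to find the function, not its code.
payload = pickle.dumps(json.dumps)
has_module_name = b"json" in payload
has_function_name = b"dumps" in payload

# Loading imports the module and looks the name up again, so the restored
# function is the very same object, with its module globals intact.
restored = pickle.loads(payload)
same_object = restored is json.dumps
```

Because the function is looked up again by name, anything it references at module level (other functions, imported types) resolves through the module as usual.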

EntilZha commented 3 years ago

Thanks for the issue report. The reason for using dill/cloudpickle is that there are quite a few types that Python's built-in pickle can't serialize but that dill/cloudpickle can. I think the solution here is probably to make it easy to specify which pickler to use (including a no-op "pickler"). I'd be open to a PR that implements this and leaves the current defaults.
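
A minimal sketch of what a selectable pickler with a no-op option might look like. The pack and unpack names come from the snippet quoted above, but the serializer parameter, the signatures shown here, and the NoOpSerializer class are assumptions made for illustration, not PyFunctional's actual API.

```python
import pickle


class NoOpSerializer:
    """Hypothetical "pickler" that passes objects through unchanged."""

    @staticmethod
    def dumps(obj, protocol=None):
        return obj

    @staticmethod
    def loads(data):
        return data


def pack(func, args, serializer=pickle):
    """Bundle a function and its arguments with the chosen serializer."""
    return serializer.dumps((func, args))


def unpack(packed, serializer=pickle):
    """Restore the bundle and call the function on its arguments."""
    func, args = serializer.loads(packed)
    return func(*args)


def double_all(partition):
    return [2 * x for x in partition]


# With the no-op serializer the (func, args) tuple is passed along as is.
packed = pack(double_all, ([1, 2, 3],), serializer=NoOpSerializer)
result = unpack(packed, serializer=NoOpSerializer)
```

With a no-op serializer the library would not serialize anything itself. If the packed tuple is then handed to a multiprocessing pool, the pool still pickles it with the standard pickle module when sending it to a worker, so the function would need to be picklable by standard pickle.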

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

larroy commented 2 years ago

Just getting back to this one: I found a very nasty bug / interaction with Spark on Python 3.7 caused by PyFunctional loading dill.
https://issues.apache.org/jira/browse/SPARK-36476?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=17397126#comment-17397126

I will work on a PR to make the pickler configurable. Can you expand on how a no-op pickler option would work? As a quick hack, I swapped in import pickle as serializer. What kind of tests would you suggest to make sure that other picklers work well?

Thank you.
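
One possible shape for such tests, offered as a sketch rather than the project's test suite: round-trip a function and its arguments through every available serializer and check that the result is unchanged. The helper names available_serializers and round_trip are hypothetical. dill and cloudpickle are included only if they happen to be installed.

```python
import importlib
import pickle


def available_serializers():
    """Return standard pickle plus any optional serializers that import."""
    serializers = [pickle]
    for name in ("dill", "cloudpickle"):
        try:
            serializers.append(importlib.import_module(name))
        except ImportError:
            pass  # optional dependency not installed; skip it
    return serializers


def round_trip(serializer, func, args):
    """Serialize (func, args), restore it, and call the restored function."""
    packed = serializer.dumps((func, args))
    restored_func, restored_args = serializer.loads(packed)
    return restored_func(*restored_args)


# Every serializer should give the same answer as calling sorted directly.
results = {
    serializer.__name__: round_trip(serializer, sorted, ([3, 1, 2],))
    for serializer in available_serializers()
}
```

In a real suite the same check could be parametrized over lambdas, closures, and functions that reference other module-level names, since those are the cases where serializers tend to differ.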