Closed huangxichen1 closed 4 years ago
This is interesting... why wasn't it a problem before? I also believe this is why Preston used the pathos multiprocessing library instead of Python's built-in which has to pickle everything it calculates. There's a library in that package called dill that will do the same thing as pickle but works on lambdas you could use instead of getting rid of the pickling if we need it
This is interesting... why wasn't it a problem before? I also believe this is why Preston used the pathos multiprocessing library instead of Python's built-in which has to pickle everything it calculates. There's a library in that package called dill that will do the same thing as pickle but works on lambdas you could use instead of getting rid of the pickling if we need it
The reason that we didn't see this problem before is that we try our best to avoid lower level class calls. For example, I made some change to the Liquid Legion code to make the LiquidLegion pickle-able.
Yet, it becomes more difficult when we do the frequency evaluation. The frequency estimators may depend on the cardinality estimators and use get_sketch_factory method of the cardinality estimators, which will cause the issue that the class is called on a lower level. For example, the stratified sketch uses cardinality_sketch_factory
. Let's say we use VoC, and the pickle will raise AttributeError: Can't pickle local object 'VectorOfCounts.get_sketch_factory.<locals>.f'
.
Let me try the dill
, and report back if it is possible.
This is interesting... why wasn't it a problem before? I also believe this is why Preston used the pathos multiprocessing library instead of Python's built-in which has to pickle everything it calculates. There's a library in that package called dill that will do the same thing as pickle but works on lambdas you could use instead of getting rid of the pickling if we need it
The reason that we didn't see this problem before is that we try our best to avoid lower level class calls. For example, I made some change to the Liquid Legion code to make the LiquidLegion pickle-able.
Yet, it becomes more difficult when we do the frequency evaluation. The frequency estimators may depend on the cardinality estimators and use get_sketch_factory method of the cardinality estimators, which will cause the issue that the class is called on a lower level. For example, the stratified sketch uses
cardinality_sketch_factory
. Let's say we use VoC, and the pickle will raiseAttributeError: Can't pickle local object 'VectorOfCounts.get_sketch_factory.<locals>.f'
.Let me try the
dill
, and report back if it is possible.
Joey: I just tested that the dill
library can pickle the configs, which is promising.
So now there are two options:
pickle
to dill
, in case we need the serialized objects in the future.pickle
, as it is not used by any other modules in the framework.@all, please let me know which one you prefer.
Reasons to remove pickle:
factory
methods a lot, and it is easy to get errors. For example, the stratified sketch VoC and stratified sketch ADBF use sketch_factory, so they are not compatible with pickle. The error message looks like:where
<locals>
is a class referred by theget_sketch_factory
method.