Closed pooya-mohammadi closed 2 years ago
From your code example, the part that fails is trying to pickle a record. Can you clarify whether it's records or shapes that fail to pickle?
Either way, the error looks like an issue of highly recursive class structures, though I'm not sure why the Shape class would be recursive; it simply contains flat-list attributes and a few string/int attributes.
For your use case, however, I would just suggest dumping to a JSON string instead of pickling, which is easier to work with and can easily be turned back into a Shape instance if needed:
import json
import shapefile  # pyshp

reader = shapefile.Reader(...)  # open your shapefile
shape = reader.shape(0)
shape_geojson_string = json.dumps(shape.__geo_interface__)
# send to multiprocessing
# ...
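A minimal sketch of that round trip, using a hand-made GeoJSON-style dict in place of `shape.__geo_interface__` (no shapefile is opened here, so the geometry values are invented):

```python
import json

# Stand-in for shape.__geo_interface__ -- a real Shape exposes a
# GeoJSON-like dict of this general form.
geojson = {"type": "Point", "coordinates": [12.5, 41.9]}

payload = json.dumps(geojson)   # a plain string: safe to pass to workers
restored = json.loads(payload)  # back to a plain dict on the other side
```

Since the payload is just a string, it pickles trivially and can be sent through any multiprocessing queue or pool.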
@karimbahgat shapes are fine, but the records are not picklable. I don't pickle them directly; the multiprocessing library pickles each task and sends it to a worker.
Not sure why the Record instances don't pickle, but this is a common problem with multiprocessing, since not all Python classes can be pickled. A common solution is to convert your data to a simpler intermediate format before sending it to multiprocessing. So in your case, don't send the Record instances as input tasks to multiprocessing; instead convert them to some other data structure such as a dict, e.g. record.as_dict(). Sending a dict to multiprocessing should be unproblematic. Remember to also update the worker script that does the actual work in multiprocessing so that it expects the same data structure.
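A minimal sketch of that pattern. The field names (`id`, `xmin`, etc.) and the bounding-box computation are invented for illustration; in practice the dicts would come from `[r.as_dict() for r in reader.records()]`:

```python
import multiprocessing

def bbox_area(rec):
    # Worker: operates on a plain dict, which pickles cleanly
    # (unlike the Record objects themselves).
    return rec["id"], (rec["xmax"] - rec["xmin"]) * (rec["ymax"] - rec["ymin"])

def process_records(rec_dicts):
    # The pool pickles each dict task and sends it to a worker.
    with multiprocessing.Pool(processes=2) as pool:
        return dict(pool.map(bbox_area, rec_dicts))

if __name__ == "__main__":
    # Hand-made dicts stand in for record.as_dict() output so the
    # sketch is self-contained.
    recs = [
        {"id": 0, "xmin": 0.0, "xmax": 2.0, "ymin": 0.0, "ymax": 3.0},
        {"id": 1, "xmin": 1.0, "xmax": 4.0, "ymin": 1.0, "ymax": 2.0},
    ]
    print(process_records(recs))  # {0: 6.0, 1: 3.0}
```

The worker function must live at module level so it can itself be pickled by the pool.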
Records that are created using shapefile.Reader are picklable. I want to do a series of processes with multiprocessing, but the shapes do not become picklable objects and the process fails.
Error message: