AIM-Harvard / foundation-cancer-image-biomarker

Code and evaluation repository for the paper
https://aim-harvard.github.io/foundation-cancer-image-biomarker/
MIT License

AttributeError: Can't pickle local object 'get_transforms.<locals>.<lambda>' #293

Closed: mitch-parker closed this issue 2 months ago

mitch-parker commented 3 months ago

My images are loading correctly based on running visualize_seed_point. However, when I run get_features with the proper CSV as input, I get the error below. The environment is Python 3.10 on an M2 Mac with 16 GB of memory (macOS Ventura 13.0). I installed FMCIB via pip. Thank you in advance for your help!


```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[8], line 1
----> 1 feature_df = get_features(in_path)
      2 feature_df.head()

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/site-packages/fmcib/run.py:55, in get_features(csv_path, weights_path, spatial_size, precropped)
     52 logger.info("Running inference over batches ...")
     54 model.eval()
---> 55 for batch in tqdm(dataloader, total=len(dataloader)):
     56     feature = model(batch.to(device)).detach().cpu().numpy()
     57     feature_list.append(feature)

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/site-packages/tqdm/std.py:1181, in tqdm.__iter__(self)
   1178 time = self._time
   1180 try:
-> 1181     for obj in iterable:
   1182         yield obj
   1183         # Update and possibly print the progressbar.
   1184         # Note: does not call self.update(1) for speed optimisation.

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/site-packages/torch/utils/data/dataloader.py:439, in DataLoader.__iter__(self)
    437     return self._iterator
    438 else:
--> 439     return self._get_iterator()

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/site-packages/torch/utils/data/dataloader.py:387, in DataLoader._get_iterator(self)
    385 else:
    386     self.check_worker_number_rationality()
--> 387     return _MultiProcessingDataLoaderIter(self)

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/site-packages/torch/utils/data/dataloader.py:1040, in _MultiProcessingDataLoaderIter.__init__(self, loader)
   1033 w.daemon = True
   1034 # NB: Process.start() actually take some time as it needs to
   1035 # start a process and pass the arguments over via a pipe.
   1036 # Therefore, we only add a worker to self._workers list after
   1037 # it started, so that we do not call .join() if program dies
   1038 # before it starts, and __del__ tries to join but will get:
   1039 #     AssertionError: can only join a started process.
-> 1040 w.start()
   1041 self._index_queues.append(index_queue)
   1042 self._workers.append(w)

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/multiprocessing/process.py:121, in BaseProcess.start(self)
    118 assert not _current_process._config.get('daemon'), \
    119        'daemonic processes are not allowed to have children'
    120 _cleanup()
--> 121 self._popen = self._Popen(self)
    122 self._sentinel = self._popen.sentinel
    123 # Avoid a refcycle if the target function holds an indirect
    124 # reference to the process object (see bpo-30775)

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/multiprocessing/context.py:224, in Process._Popen(process_obj)
    222 @staticmethod
    223 def _Popen(process_obj):
--> 224     return _default_context.get_context().Process._Popen(process_obj)

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/multiprocessing/context.py:284, in SpawnProcess._Popen(process_obj)
    281 @staticmethod
    282 def _Popen(process_obj):
    283     from .popen_spawn_posix import Popen
--> 284     return Popen(process_obj)

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/multiprocessing/popen_spawn_posix.py:32, in Popen.__init__(self, process_obj)
     30 def __init__(self, process_obj):
     31     self._fds = []
---> 32     super().__init__(process_obj)

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/multiprocessing/popen_fork.py:19, in Popen.__init__(self, process_obj)
     17 self.returncode = None
     18 self.finalizer = None
---> 19 self._launch(process_obj)

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/multiprocessing/popen_spawn_posix.py:47, in Popen._launch(self, process_obj)
     45 try:
     46     reduction.dump(prep_data, fp)
---> 47     reduction.dump(process_obj, fp)
     48 finally:
     49     set_spawning_popen(None)

File /opt/anaconda3/envs/fmcib_env/lib/python3.10/multiprocessing/reduction.py:60, in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)

AttributeError: Can't pickle local object 'get_transforms.<locals>.<lambda>'
```

github-actions[bot] commented 3 months ago

Hello @mitch-parker, thank you for your interest in our work!

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue; otherwise we cannot help you.

surajpaib commented 3 months ago

Thanks for opening this issue!

@dosu-bot Can you troubleshoot this for me?

dosubot[bot] commented 3 months ago

Hey there, @surajpaib! Great to see you back. Hope you're doing awesome! 🚀

The error "AttributeError: Can't pickle local object 'get_transforms..'" you're encountering is likely due to the use of lambda functions within objects that are being pickled, such as when using multiprocessing or when objects need to be serialized/deserialized. However, the provided code snippet does not explicitly show the use of lambda functions or the direct implementation of pickling. The issue might be indirectly related to how data loaders or models are being handled, especially if they involve operations that require serialization (like multiprocessing in PyTorch's DataLoader).

In PyTorch, a DataLoader with num_workers > 0 uses worker processes to load data, which requires pickling the dataset (including its transforms) to transfer it to each worker. If your get_dataloader function, the preprocessing pipeline, or the model definition uses lambda functions or other local objects, they cannot be pickled, which produces exactly this error.
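
To make this concrete, here is a minimal, self-contained sketch (independent of fmcib, with a made-up transform) showing why a lambda returned from inside a function cannot be pickled, which is exactly what spawn-based DataLoader worker startup requires:

```python
import pickle

def get_transforms():
    # A lambda created inside a function is a "local object": pickle
    # records functions by qualified name, and
    # 'get_transforms.<locals>.<lambda>' cannot be re-imported
    # inside a freshly spawned worker process.
    return lambda x: x * 2

transform = get_transforms()
pickle.dumps(transform)
# AttributeError: Can't pickle local object 'get_transforms.<locals>.<lambda>'
```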

To resolve this issue, you can try the following approaches:

  1. Avoid Lambda Functions in Pickle Contexts: If lambda functions are used in the dataset's transformations or anywhere else they might be pickled (e.g., in get_dataloader), replace them with module-level functions or functools.partial objects. A lambda defined inside another function cannot be pickled because pickle serializes functions by reference to their qualified name, and a local object like 'get_transforms.<locals>.<lambda>' cannot be looked up by name in a new process (see the sketch after this list).

  2. Use dill Instead of pickle for Serialization: If you control the serialization step, the dill library can serialize more types of objects than pickle, including lambdas. However, this approach is not applicable when the serialization happens inside library code that you cannot change, as it does here.

  3. Debug Serialization: Temporarily disable multiprocessing by setting num_workers=0 in your DataLoader. If the error goes away, that confirms the problem is pickling for multiprocessing.
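
Here is a sketch of approach 1; the transform and its parameter are hypothetical stand-ins, not the actual fmcib transforms:

```python
from functools import partial

def scale_intensity(x, factor):
    # Module-level function: picklable, because pickle can find it
    # by its qualified name when the worker process unpickles it.
    return x * factor

def get_transforms(factor=2.0):
    # partial(...) of a module-level function pickles cleanly,
    # unlike `lambda x: x * factor` defined at this point.
    return partial(scale_intensity, factor=factor)
```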

Since the provided code does not include the implementation of get_dataloader or the model internals (such as the LoadModel class or the fmcib_model function), I cannot pinpoint the exact lambda causing the issue. You'll need to inspect these components for lambdas or other non-picklable objects used where serialization is required.

For more details, you can refer to the PyTorch DataLoader documentation and the Python pickle documentation.

surajpaib commented 3 months ago

Hi @mitch-parker, it seems like this is a multiprocessing issue on the Mac.

I've added the ability to pass extra keyword arguments to the get_features function. You should be able to set num_workers=0, which disables multiprocessing. Could you check this and let me know if it's fixed? Unfortunately, this will make inference much slower, so I'll look into better solutions once we confirm this is the issue.
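
Something like this should work (assuming the extra keyword arguments are forwarded to the underlying DataLoader; the CSV path below is a placeholder for the one you used above):

```python
from fmcib.run import get_features

in_path = "your_input.csv"  # same CSV you passed to get_features above

# num_workers=0 keeps data loading in the main process, so the
# transforms never have to be pickled for worker startup.
feature_df = get_features(in_path, num_workers=0)
feature_df.head()
```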

I think the MHub implementation would be the recommended way for now. We can discuss more soon.

LennyN95 commented 3 months ago

Hey @mitch-parker, you can find an MHub-packaged version of this model at https://mhub.ai/models/fmcib_radiomics. If you have Docker Desktop installed, running the model takes no more than one command. You'll find all the details on the model page, but if you have any questions, please don't hesitate to comment here!

mitch-parker commented 3 months ago

Thanks! It works now with num_workers=0. It took ~5 seconds per file.