**Open** · aaawork opened this issue 9 months ago
```
G:\awh\pmf_cvpr22-main\engine.py:115: UserWarning: The structure of <datasets.get_bscd_loader.<locals>._Loader object at 0x000002117F54D2E0> is not recognizable.
  warnings.warn(f'The structure of {data_loaders} is not recognizable.')
Traceback (most recent call last):
  File "test_bscdfsl.py", line 116, in <module>
    main(args)
  File "test_bscdfsl.py", line 62, in main
    test_stats = evaluate(data_loader_val, model, criterion, device, seed=1234, ep=5)
  File "G:\awh\pmf_cvpr22-main\engine.py", line 116, in evaluate
    return _evaluate(data_loaders, model, criterion, device, seed)
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\site-packages\torch\autograd\grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "G:\awh\pmf_cvpr22-main\engine.py", line 134, in _evaluate
    for ii, batch in enumerate(metric_logger.log_every(data_loader, 10, header)):
  File "G:\awh\pmf_cvpr22-main\utils\deit_util.py", line 141, in log_every
    for obj in iterable:
  File "G:\awh\pmf_cvpr22-main\datasets\__init__.py", line 178, in _loader_wrap
    for x, y in novel_loader:
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\site-packages\torch\utils\data\dataloader.py", line 359, in __iter__
    return self._get_iterator()
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\site-packages\torch\utils\data\dataloader.py", line 305, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\site-packages\torch\utils\data\dataloader.py", line 918, in __init__
    w.start()
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\multiprocessing\context.py", line 326, in _Popen
    return Popen(process_obj)
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x0000020E0E743040>: attribute lookup <lambda> on datasets.cdfsl.CropDisease_few_shot failed
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\dhs\.conda\envs\Hlbl\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
```
Hi, have you solved this issue yet? I ran into the exact same issue as yours.
Hi guys, I guess it is probably due to the Python or PyTorch version: my code might violate some assumptions in `multiprocessing`, for example. Here is an answer from ChatGPT:
The error you're encountering involves a failure to pickle a function. This is a common issue in multiprocessing, especially when trying to share functions or objects that are not easily serializable.
Let's address the specific issues:
1. **Can't Pickle Function**:
The error `Can't pickle <function at ...>: attribute lookup on datasets.cdfsl.CropDisease_few_shot failed` suggests that the function you're trying to pickle cannot be found or properly referenced. This often happens with nested functions, lambdas, or functions defined inside classes.
2. **EOFError**:
This is a secondary error likely caused by the failure to pickle the function, resulting in incomplete data being sent to the subprocess.
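The failure mode in the traceback can be reproduced with `pickle` alone, independent of the repo's code. This is a minimal sketch showing why a named top-level function pickles while a lambda does not:

```python
import pickle

def top_level(x):
    # A module-level function: pickle stores only its qualified name,
    # so a worker process can re-import and resolve it.
    return x * 2

# Succeeds: the name 'top_level' can be looked up in this module.
blob = pickle.dumps(top_level)
assert pickle.loads(blob)(21) == 42

# Fails: a lambda has no importable name, so the attribute lookup fails,
# producing the same kind of PicklingError as in the traceback above.
try:
    pickle.dumps(lambda x: x * 2)
    raised = False
except pickle.PicklingError:
    raised = True
print("lambda pickling raised:", raised)
```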
### Solutions
#### 1. Ensure Functions are Top-Level
Ensure that the function you are trying to use with multiprocessing is defined at the top level of a module. Functions defined within other functions, methods within classes, or lambdas often cannot be pickled.
Example:
```python
# Incorrect: nested functions cannot be pickled
class MyClass:
    def method(self):
        def inner_function():
            pass

# Correct: a top-level function can be pickled
def my_function():
    pass
```
#### 2. Use `multiprocessing.Manager` or `multiprocessing.Pool`
Using `multiprocessing.Manager` or `multiprocessing.Pool` can help manage the processes more effectively. Managers provide a way to create shared objects that can be passed between processes.
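As a sketch of the `Manager` route (the helper `record_square` is illustrative, not a name from the repo): a manager process owns a shared list that worker processes append to.

```python
import multiprocessing

def record_square(shared_list, value):
    # Appends to a Manager-backed list that is shared across processes.
    shared_list.append(value * value)

def main():
    with multiprocessing.Manager() as manager:
        results = manager.list()
        procs = [multiprocessing.Process(target=record_square, args=(results, v))
                 for v in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return sorted(results)

if __name__ == '__main__':
    print(main())
```

Note that `record_square` must itself be a top-level function for the same pickling reasons discussed above.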
#### 3. Example with `multiprocessing.Pool`
Here's an example demonstrating the use of `multiprocessing.Pool` to avoid pickling issues with functions:
```python
import multiprocessing

def crop_disease_few_shot(data):
    # Your processing function here
    return data * 2  # Example operation

if __name__ == '__main__':
    data = [1, 2, 3, 4, 5]
    with multiprocessing.Pool() as pool:
        results = pool.map(crop_disease_few_shot, data)
    print(results)
```
#### Example with a Top-Level Function

```python
import multiprocessing

# Ensure the function is defined at the top level
def crop_disease_few_shot(data):
    # Process the data here
    return data * 2  # Example operation

def main():
    data = [1, 2, 3, 4, 5]
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(crop_disease_few_shot, data)
    print(results)

if __name__ == '__main__':
    main()
```
### Additional Tips
- **Avoid Lambdas**: Lambdas are not pickleable. Replace them with named functions.
- **Avoid Nested Functions**: Move nested functions to the top level of the module.
- **Check Imports**: Ensure all necessary imports are available in the module where the top-level function is defined.

#### Example for Functions Inside a Class
If you need to use a method inside a class, make sure to use `staticmethod` or move the method outside the class:

```python
import multiprocessing

class CropDisease:
    @staticmethod
    def few_shot(data):
        # Your processing function here
        return data * 2  # Example operation

def main():
    data = [1, 2, 3, 4, 5]
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(CropDisease.few_shot, data)
    print(results)

if __name__ == '__main__':
    main()
```
By ensuring functions are at the top level, avoiding lambdas and nested functions, and using `multiprocessing.Pool` correctly, you should be able to avoid the `EOFError` and pickling issues. If you provide more details or code snippets, I can give more specific advice.
Hi, thank you very much for your reply and many thanks for sharing this repo! I believe that I solved this issue by replacing the lambda function with a properly defined function. But I have another question: normally, how long will `test_bscdfsl.py` run on the ChestX dataset? I have used:
!python test_bscdfsl.py --test_n_way 5 --n_shot 5 --device cuda:0 --arch dino_small_patch16 --deploy finetune --output outputs/dino_small_cifar_1 --resume outputs/dino_small_cifar_1/best.pth --cdfsl_domains ChestX --ada_steps 100 --ada_lr 0.0001 --aug_prob 0.9 --aug_types color translation
Do you have any idea roughly how long this will run on an RTX 4090? Many thanks.
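For anyone hitting the same error: the fix mentioned above (swapping the lambda for a named function) can be checked with plain `pickle`, since Windows' spawn start method delivers the dataset, including its transform, to DataLoader workers by pickling it. `FewShotDataset` and `identity_transform` below are hypothetical stand-ins, not names from this repo:

```python
import pickle

def identity_transform(x):
    # Named replacement for a transform like `lambda x: x`.
    return x

class FewShotDataset:
    # Hypothetical stand-in for a dataset object that DataLoader
    # workers must receive via pickle under the spawn start method.
    def __init__(self, transform):
        self.transform = transform

# A dataset holding a named, top-level function round-trips cleanly;
# the same dataset holding a lambda would raise PicklingError instead.
ds = pickle.loads(pickle.dumps(FewShotDataset(identity_transform)))
print(ds.transform(3))
```

If the round-trip succeeds for your dataset object, the worker spawn on Windows should succeed as well.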
It shouldn't be too long :) But I was running experiments on A40. I'll try to find my logs and let you know tomorrow.
Thank you! It has now been running for 2 hours and still keeps running...