zhu-xlab / SSL4EO-S12

SSL4EO-S12: a large-scale dataset for self-supervised learning in Earth observation
Apache License 2.0

Can't pickle local object 'make_lmdb.<locals>.<lambda>' #4

Closed vaasuSS closed 1 year ago

vaasuSS commented 1 year ago

When I try to run src/benchmark/pretrain_ssl/datasets/SSL4EO/ssl4eo_dataset.py, I get the following error:

File "ssl4eo_dataset.py", line 258, in <module>
AttributeError: Can't pickle local object 'make_lmdb.<locals>.<lambda>'

I am using the code to prepare an LMDB file from the ssl4eo-s12_100patches dataset.

wangyi111 commented 1 year ago

Hi @vaasuSS, could you provide the detailed error message?

Just in case, below is one example that should work using the classes in ssl4eo_dataset.py:

# Root directory containing the downloaded example patches.
root = './example_100_patches/'
# Load all three modalities: Sentinel-1 (s1), Sentinel-2 L2A (s2a), Sentinel-2 L1C (s2c).
train_dataset = SSL4EO(root=root, normalize=False, mode=['s1','s2a','s2c'], dtype='uint8')

# Serialize the dataset into a single LMDB file using two worker processes.
make_lmdb(train_dataset, './example_100_patch.lmdb', num_workers=2, mode=['s1','s2a','s2c'])

vaasuSS commented 1 year ago

Hi, thanks for the suggestion. I am already running the code the same way you suggest. I am currently on Windows with Python 3.8.15, PyTorch 1.13.0, and lmdb 1.4.0. Could you list the intended versions of these libraries?

This is the command I am using to run ssl4eo_dataset.py, without any changes to the script:

python ssl4eo_dataset.py --root "pretrain_ssl\ssl4eo-s12_100patches" --save_path "pretrain_ssl\sample.lmdb" --make_lmdb_file --num_workers 2

This is the error:

Traceback (most recent call last):
  File "ssl4eo_dataset.py", line 261, in <module>
    make_lmdb(train_subset, args.save_path, num_workers=args.num_workers,
  File "ssl4eo_dataset.py", line 200, in make_lmdb
    loader = InfiniteDataLoader(dataset, num_workers=num_workers,
  File "ssl4eo_dataset.py", line 189, in __init__
    self.iterator = super().__iter__()
  File "lib\site-packages\torch\utils\data\dataloader.py", line 435, in __iter__     
    return self._get_iterator()
  File "lib\site-packages\torch\utils\data\dataloader.py", line 381, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "lib\site-packages\torch\utils\data\dataloader.py", line 1034, in __init__    
    w.start()
  File "lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'make_lmdb.<locals>.<lambda>'

  File "<string>", line 1, in <module>
  File "lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "lib\multiprocessing\spawn.py", line 126, in _main     
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

wangyi111 commented 1 year ago

Hi @vaasuSS, thanks for the information. I see the problem now. The code was developed on Linux and wasn't tested on Windows, and I reproduced the error on a Windows machine. It comes from a multiprocessing issue: on Windows, worker processes are started with "spawn", which must pickle their arguments, and that fails for the lambda defined inside make_lmdb. A quick, if ugly, workaround is to avoid multiprocessing by setting num_workers to 0. Also consider reducing map_size in env = lmdb.open(lmdb_file, map_size=109951162777) if you run it on a "small" machine.
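
A minimal sketch of the workaround, reusing the example above (the reduced map_size of roughly 10 GB is an assumption; size it to your dataset):

# Single-process LMDB creation: num_workers=0 sidesteps the Windows "spawn"
# pickling of the local lambda inside make_lmdb.
train_dataset = SSL4EO(root='./example_100_patches/', normalize=False,
                       mode=['s1','s2a','s2c'], dtype='uint8')
make_lmdb(train_dataset, './example_100_patch.lmdb',
          num_workers=0, mode=['s1','s2a','s2c'])

# Inside make_lmdb, a smaller memory map for a "small" machine:
env = lmdb.open(lmdb_file, map_size=10 * 1024**3)

If you do want worker processes on Windows, a hypothetical root-cause fix (assuming the offending lambda is something like a collate_fn) is to hoist it to module level so it can be pickled:

# Hypothetical fix: a module-level function is picklable under "spawn",
# unlike a lambda defined inside make_lmdb.
def identity_collate(batch):
    return batch

loader = InfiniteDataLoader(dataset, num_workers=num_workers,
                            collate_fn=identity_collate)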

PS: For a better experience, I recommend using Linux :)

vaasuSS commented 1 year ago

Hi @wangyi111, thanks very much for the clarification. Setting num_workers to 0 and reducing the map_size worked on Windows.

I have a follow-up question: what is the difference between ssl4eo_dataset.py and ssl4eo_dataset_lmdb.py? I am trying to modify and train the SSL network (pretrain_moco_v2_s2c.py) using both s1 and s2c. Which script should I use to prepare the lmdb dataset, and do you have any general suggestions on how to train with both s1 and s2c?

Thanks very much

wangyi111 commented 1 year ago

Hi @vaasuSS,

To use both s1 and s2c, you need to modify the dataloader to load both modalities and adjust the encoder to match your fusion strategy. For example, you can create one lmdb file containing both modalities and load batches of paired s1 and s2c patches. For early fusion, concatenate the two modalities into one input and increase the encoder's input channel count; training then proceeds exactly as in the single-modality case.
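
For instance, a minimal early-fusion sketch (the band counts, patch size, and ResNet50 encoder are assumptions; adjust them to your setup):

import torch
import torch.nn as nn
import torchvision.models as models

S1_BANDS, S2C_BANDS = 2, 13  # assumed: S1 VV/VH, S2 L1C with 13 bands

# Early fusion: concatenate the two modalities along the channel axis.
s1 = torch.randn(8, S1_BANDS, 224, 224)    # batch of Sentinel-1 patches
s2c = torch.randn(8, S2C_BANDS, 224, 224)  # co-registered S2 L1C patches
x = torch.cat([s1, s2c], dim=1)            # shape (8, 15, 224, 224)

# Widen the encoder's first conv to accept the fused channel count.
encoder = models.resnet50()
encoder.conv1 = nn.Conv2d(S1_BANDS + S2C_BANDS, 64,
                          kernel_size=7, stride=2, padding=3, bias=False)
features = encoder(x)  # training then proceeds as for a single modality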

PS: Currently our code for MoCo_v2 only supports multi-GPU training, as in https://github.com/facebookresearch/moco. If you want to train on a single GPU, have a look at https://colab.research.google.com/github/facebookresearch/moco/blob/colab-notebook/colab/moco_cifar10_demo.ipynb, which simulates multi-GPU behavior with SplitBatchNorm.
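
For reference, a sketch of the SplitBatchNorm idea paraphrased from that notebook (check the demo for the exact implementation): in training mode, each of num_splits chunks of the batch is normalized with its own statistics, mimicking the per-GPU BatchNorm of multi-GPU training.

import torch.nn as nn
import torch.nn.functional as F

class SplitBatchNorm(nn.BatchNorm2d):
    """Normalize each of `num_splits` batch chunks independently during
    training, approximating per-GPU BatchNorm statistics on one GPU."""
    def __init__(self, num_features, num_splits, **kwargs):
        super().__init__(num_features, **kwargs)
        self.num_splits = num_splits

    def forward(self, x):
        N, C, H, W = x.shape
        if self.training or not self.track_running_stats:
            # Fold the splits into the channel axis so batch_norm computes
            # separate statistics for each split.
            mean = self.running_mean.repeat(self.num_splits)
            var = self.running_var.repeat(self.num_splits)
            out = F.batch_norm(
                x.view(-1, C * self.num_splits, H, W), mean, var,
                self.weight.repeat(self.num_splits),
                self.bias.repeat(self.num_splits),
                True, self.momentum, self.eps).view(N, C, H, W)
            # Average the per-split statistics back into the shared buffers.
            self.running_mean.data.copy_(mean.view(self.num_splits, C).mean(dim=0))
            self.running_var.data.copy_(var.view(self.num_splits, C).mean(dim=0))
            return out
        return F.batch_norm(x, self.running_mean, self.running_var,
                            self.weight, self.bias, False,
                            self.momentum, self.eps)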