Closed vaasuSS closed 1 year ago
Hi @vaasuSS, could you provide the detailed error message?
Just in case, below is one example that should work using the classes in ssl4eo_dataset.py:
root = './example_100_patches/'
train_dataset = SSL4EO(root=root, normalize=False, mode=['s1','s2a','s2c'], dtype='uint8')
make_lmdb(train_dataset,'./example_100_patch.lmdb',num_workers=2,mode=['s1','s2a','s2c'])
Hi. Thanks for the suggestion. Actually, I am running the code with the same idea as you suggested. I am currently on Windows, using Python 3.8.15, PyTorch 1.13.0, and lmdb 1.4.0. Is it possible for you to list the intended versions of these libraries?
This is the command I am using to run the script ssl4eo_dataset.py, without any changes to the script:
python ssl4eo_dataset.py --root "pretrain_ssl\ssl4eo-s12_100patches" --save_path "pretrain_ssl\sample.lmdb" --make_lmdb_file --num_workers 2
This is the error:
Traceback (most recent call last):
  File "ssl4eo_dataset.py", line 261, in <module>
    make_lmdb(train_subset, args.save_path, num_workers=args.num_workers,
  File "ssl4eo_dataset.py", line 200, in make_lmdb
    loader = InfiniteDataLoader(dataset, num_workers=num_workers,
  File "ssl4eo_dataset.py", line 189, in __init__
    self.iterator = super().__iter__()
  File "lib\site-packages\torch\utils\data\dataloader.py", line 435, in __iter__
    return self._get_iterator()
  File "lib\site-packages\torch\utils\data\dataloader.py", line 381, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "lib\site-packages\torch\utils\data\dataloader.py", line 1034, in __init__
    w.start()
  File "lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'make_lmdb.<locals>.<lambda>'

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
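For context, the error in the first traceback is a general Windows behavior: DataLoader workers are started with the `spawn` method, which pickles everything handed to the workers, and a lambda defined inside a function (here, inside `make_lmdb`) cannot be pickled. A minimal sketch reproducing the issue (the names `build_collate` and `collate_passthrough` are illustrative, not the repository's code):

```python
import pickle

def build_collate():
    # A function-local lambda, like the one inside make_lmdb,
    # cannot be pickled by multiprocessing's spawn start method.
    return lambda batch: batch

def collate_passthrough(batch):
    # Moving the callable to module level makes it picklable,
    # so it can be handed to DataLoader workers on Windows.
    return batch

try:
    pickle.dumps(build_collate())
    picklable_lambda = True
except (AttributeError, pickle.PicklingError):
    # AttributeError: Can't pickle local object 'build_collate.<locals>.<lambda>'
    picklable_lambda = False

print(picklable_lambda)  # False: reproduces the error above
print(pickle.loads(pickle.dumps(collate_passthrough))([1, 2]))  # [1, 2]
```

So an alternative to disabling workers would be to replace the local lambda with a module-level function, which is picklable on all platforms.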
Hi @vaasuSS, thanks for the information. I see the problem. The code was developed on Linux and wasn't tested on Windows. I did reproduce the error on a Windows machine; it seems to come from a multiprocessing issue. One ugly but quick solution is to not use multiprocessing by setting num_workers to 0. Consider also reducing map_size in env = lmdb.open(lmdb_file, map_size=109951162777) if you run it on a "small" machine.
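Concretely, the workaround could look like the sketch below (it assumes the SSL4EO/make_lmdb calls from the example above; the 16 GB value is illustrative, any size comfortably above your dataset works):

```python
# Windows workaround sketch: num_workers=0 disables worker processes,
# and a smaller map_size avoids reserving ~102 GB of address space.
GB = 1024 ** 3
map_size = 16 * GB  # was 109951162777 (~102 GB) by default

# train_dataset = SSL4EO(root='./example_100_patches/', normalize=False,
#                        mode=['s1', 's2a', 's2c'], dtype='uint8')
# make_lmdb(train_dataset, './example_100_patch.lmdb',
#           num_workers=0, mode=['s1', 's2a', 's2c'])

print(map_size < 109951162777)  # True: well below the original default
```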
PS: For a better experience, I recommend using Linux :)
Hi @wangyi111, thanks very much for the clarification. Setting num_workers to 0 and reducing the map_size worked on Windows.
I have a follow-up question: what is the difference between ssl4eo_dataset.py and ssl4eo_dataset_lmdb.py? I am basically trying to modify and train the SSL network (pretrain_moco_v2_s2c.py) using both s1 and s2c. Which script should I use to prepare the lmdb dataset, and do you have any general suggestions on how to train using both s1 and s2c?
Thanks very much.
Hi @vaasuSS,
ssl4eo_dataset.py is used to load data from the raw GeoTIFF files; it also includes the utility functions to create the lmdb file. I use lmdb to avoid reaching inode limits on my server, and it also accelerates data loading. ssl4eo_dataset_lmdb.py is used to load data from the lmdb file.

To use both s1 and s2c, you should modify the dataloader to load both, and also adjust the encoder depending on the strategy you would like to use for modality fusion. For example, you can create one lmdb file with both modalities and load a batch of s1 and s2c pairs. Then, if you want to do early fusion, you can concatenate the two modalities into one unified input and increase the number of input channels; further training is the same as for a single modality.
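The early-fusion step can be sketched as a channel-wise concatenation. The shapes below are assumptions for illustration (2 S1 polarization channels, 13 S2 L1C bands, 264x264 patches), not values taken from the repository:

```python
import numpy as np

# Hypothetical patch shapes: S1 with 2 channels (VV/VH),
# S2 L1C with 13 spectral bands, both 264x264 pixels.
s1 = np.zeros((2, 264, 264), dtype=np.float32)
s2c = np.zeros((13, 264, 264), dtype=np.float32)

# Early fusion: concatenate along the channel axis so the encoder
# sees one 15-channel input (its first conv needs in_channels=15).
fused = np.concatenate([s1, s2c], axis=0)
print(fused.shape)  # (15, 264, 264)
```

The same concatenation applies per batch in the dataloader; the only encoder change for early fusion is widening the first convolution to accept the combined channel count.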
PS: Currently our code for MoCo_v2 only supports multiple GPUs, as in https://github.com/facebookresearch/moco. If you want to train on a single GPU, you may have a look at https://colab.research.google.com/github/facebookresearch/moco/blob/colab-notebook/colab/moco_cifar10_demo.ipynb. This demo simulates multi-GPU training with SplitBatchNorm.
(Original issue description) When I try to run src/benchmark/pretrain_ssl/datasets/SSL4EO/ssl4eo_dataset.py, I get the error shown above. I am using the code to prepare an lmdb file from the ssl4eo-s12_100patches dataset.