xinntao / EDVR

Winning Solution in NTIRE19 Challenges on Video Restoration and Enhancement (CVPR19 Workshops) - Video Restoration with Enhanced Deformable Convolutional Networks. EDVR has been merged into BasicSR and this repo is a mirror of BasicSR.
https://github.com/xinntao/BasicSR

create_lmdb.py = Not enough disk space? #134

Open laurentlasalle opened 4 years ago

laurentlasalle commented 4 years ago

While preparing DIV2K dataset in LMDB format to run train.py with options/train_ESRGAN.yml, I am getting a "not enough space on the disk" error from create_lmdb.py under Windows 10.

EDVR and all other files are stored on C, which has 56.2 GB free. Am I wrong in expecting the database to be much smaller?

Reading image path list ...
data size per image is:  691200
Traceback (most recent call last):
    File "create_lmdb.py", line 411, in <module>
        main()
    File "create_lmdb.py", line 39, in main
        general_image_folder(opt)
    File "create_lmdb.py", line 105, in general_image_folder
        env = lmdb.open(lmdb_save_path, map_size=data_size * 10)
lmdb.Error: ../../datasets/DIV2K/DIV2K800_sub.lmdb: There is not enough space on the disk.
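For reference, this is roughly how create_lmdb.py arrives at the map_size it requests (the per-image size and the ×10 multiplier come from the log and traceback above; the dataset path and glob pattern are assumptions). On Windows, LMDB typically allocates the full map_size on disk up front, so lmdb.open() can fail immediately even though the final database would be much smaller:

# Sketch of the size calculation (assumed path/glob; the per-image size and
# the *10 factor are taken from the log and traceback above).
import glob
import os

dataset_dir = '../../datasets/DIV2K/DIV2K800_sub'
img_list = sorted(glob.glob(os.path.join(dataset_dir, '*.png')))

data_size_per_img = 691200                      # bytes per image from the log (presumably 480*480*3)
data_size = data_size_per_img * len(img_list)   # total raw bytes across the dataset
map_size = data_size * 10                       # what lmdb.open() is asked to reserve

print(f'{len(img_list)} images, ~{data_size / 2**30:.2f} GiB of raw data')
print(f'requested map_size: ~{map_size / 2**30:.2f} GiB')

If that requested map_size exceeds the free space on C:, the open fails before any image is written, regardless of the BATCH setting.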
TigerZing commented 4 years ago

I think 56.2 GB free is not enough. Have you tried modifying the BATCH size in the script?

laurentlasalle commented 4 years ago

I think 56.2 GB free is not enough. Have you tried modifying the BATCH size in the script?

I tried BATCH = 500 instead of 5000, and even 5, but both attempts failed with the same error.

ryul99 commented 4 years ago

I think BATCH in create_lmdb.py only controls how many images are loaded into memory at once, so making it smaller just uses less memory. I also think 56.2 GB free is not enough. In the case of the REDS dataset, the LMDB for train_sharp (32 GiB) is 61.9 GiB. I'm not sure, but I think the LMDB ends up almost double the size of the original data.
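As a rough back-of-the-envelope check on that ratio (the 32 GiB and 61.9 GiB figures are the REDS numbers quoted above; treating the ratio as a general rule of thumb is an assumption):

# Rough space budget using the REDS ratio above (32 GiB raw -> 61.9 GiB LMDB).
# The ~1.9x factor is an empirical observation, not a property of LMDB itself.
RATIO = 61.9 / 32.0   # ~1.93

def estimated_lmdb_gib(raw_dataset_gib: float) -> float:
    """Estimate the on-disk LMDB size from the raw dataset size, in GiB."""
    return raw_dataset_gib * RATIO

print(f'{estimated_lmdb_gib(32.0):.1f} GiB')   # ~61.9 GiB, matching train_sharp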

TigerZing commented 4 years ago

You're right @ryul99. I tried modifying it before. The script loads BATCH images into memory at once, writes them out to the LMDB file, then frees the memory and continues loading the next BATCH images from the dataset.
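A minimal sketch of the batching pattern described in the last two comments (simplified, not the exact create_lmdb.py code; the paths are assumptions, the per-image size is from the log above). Data for up to BATCH images is buffered in the open write transaction before each commit, so BATCH bounds memory use but has no effect on the size of the resulting database:

# Simplified sketch of the write loop: commit (and free) every BATCH images.
import glob
import os

import cv2
import lmdb

BATCH = 5000
dataset_dir = '../../datasets/DIV2K/DIV2K800_sub'            # assumed input folder
lmdb_save_path = '../../datasets/DIV2K/DIV2K800_sub.lmdb'    # output path from the traceback

img_paths = sorted(glob.glob(os.path.join(dataset_dir, '*.png')))
data_size = 691200 * len(img_paths)                # per-image bytes from the log above
env = lmdb.open(lmdb_save_path, map_size=data_size * 10)

txn = env.begin(write=True)
for i, path in enumerate(img_paths, 1):
    img = cv2.imread(path, cv2.IMREAD_UNCHANGED)   # load one image into memory
    key = os.path.splitext(os.path.basename(path))[0]
    txn.put(key.encode('ascii'), img.tobytes())    # stage it in the open transaction
    if i % BATCH == 0:
        txn.commit()                               # flush the batch to disk and free it
        txn = env.begin(write=True)                # start a fresh transaction
txn.commit()
env.close()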