KomputerMaster64 opened this issue 2 years ago
I altered line 45 from

img_path = os.path.join(ffhq_img_path, '%05d.png' % i)

to

img_path = os.path.join(ffhq_img_path, '%05d.jpg' % i)

since the Kaggle FFHQ 256x256 resized dataset has .jpg image files.
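As an aside, instead of hard-coding the extension, line 45 could resolve it at runtime so the script handles both the original .png dump and the Kaggle .jpg resize. A minimal sketch; resolve_img_path is a hypothetical helper of mine, not part of create_ffhq_lmdb.py:

import os

def resolve_img_path(ffhq_img_path, i, exts=('.jpg', '.png')):
    # Hypothetical helper: try each extension in turn and return the
    # first path that exists on disk.
    for ext in exts:
        path = os.path.join(ffhq_img_path, '%05d%s' % (i, ext))
        if os.path.exists(path):
            return path
    raise FileNotFoundError('image %05d not found in %s' % (i, ffhq_img_path))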
With that change, the command

!python create_ffhq_lmdb.py --ffhq_img_path=$DATA_DIR/ffhq/resized/ --ffhq_lmdb_path=$DATA_DIR/ffhq/ffhq-lmdb --split=train

gives the following output:
100
200
300
400
500
600
700
800
900
1000
1100
1200
...
I cross-checked the files that were unzipped. There should be 70k files, but after repeated unzipping operations I was able to extract only 50k or 52k images, even though the cell output shows the last file unzipped was 69999.jpg.
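To verify the actual count on the Drive rather than trusting the unzip log, a quick check can be run in a Colab cell. A sketch, assuming the DATA_DIR path quoted later in this issue:

import glob
import os

data_dir = '/content/drive/MyDrive/Repositories/NVAE/dataset_nvae'  # DATA_DIR
files = glob.glob(os.path.join(data_dir, 'ffhq', 'resized', '*.jpg'))
print('found %d files' % len(files))  # expected 70000

# Report how many of 00000.jpg .. 69999.jpg are actually missing.
present = set(os.path.basename(f) for f in files)
missing = ['%05d.jpg' % i for i in range(70000) if '%05d.jpg' % i not in present]
print('missing %d files' % len(missing))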
A Google Colab notebook and Google Drive were used for the implementation.
Command used: !unzip images1024x1024.zip -d $DATA_DIR/ffhq/
Last few lines of the output of the cell:
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69990.jpg
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69991.jpg
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69992.jpg
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69993.jpg
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69994.jpg
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69995.jpg
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69996.jpg
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69997.jpg
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69998.jpg
inflating: /content/drive/MyDrive/Repositories/NVAE/dataset_nvae/ffhq/resized/69999.jpg
Google Drive contents after the operation:
After executing the command

!python create_ffhq_lmdb.py --ffhq_img_path=$DATA_DIR/ffhq/resized/ --ffhq_lmdb_path=$DATA_DIR/ffhq/ffhq-lmdb --split=train

I get the following output, which suggests that the training set has been converted into the LMDB dataset:

48600 48700 48800 48900 49000 ... 62800 62900 63000
added 63000 items to the LMDB dataset.
HOWEVER, after about 2 minutes, the above output changes to the following:

48600 48700 48800 48900 49000 49100 ...
    main(args.split, args.ffhq_img_path, args.ffhq_lmdb_path)
  File "create_ffhq_lmdb.py", line 55, in main
    print('added %d items to the LMDB dataset.' % count)
lmdb.Error: mdb_txn_commit: Disk quota exceeded
This behaviour is not observed for the validation set. Could you please guide me?
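Since the failure is mdb_txn_commit: Disk quota exceeded, the free space on the Drive mount seems worth checking before re-running. A small standard-library sketch (note: the FUSE mount may not report the real Drive quota exactly):

import shutil

total, used, free = shutil.disk_usage('/content/drive')
print('free: %.1f GB of %.1f GB' % (free / 1e9, total / 1e9))

One possible workaround, assuming the local Colab disk is large enough, is to write the LMDB to a local path such as /content/ffhq-lmdb and copy it to Drive only after the commit succeeds.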
Thank you for sharing the scripts for dataset preparation. I am trying to implement the DDGAN model on the FFHQ 256x256 dataset. I have used the FFHQ 256x256 resized dataset from Kaggle, since the FFHQ 1024x1024 dataset has a size of 90 GB, which exceeds the limits of my resources.
The Kaggle dataset ships as archive.zip, which contains a directory "resized" holding the 70k .jpg files. The file structure is as follows:

archive.zip
└── resized
    └── (70k images)
I am using Google Drive and Colab notebooks for the implementation, with the file setup

CODE_DIR = "/content/drive/MyDrive/Repositories/NVAE"
DATA_DIR = "/content/drive/MyDrive/Repositories/NVAE/dataset_nvae"
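The setup cell itself looks roughly like this (a sketch; exporting DATA_DIR into the environment is what lets $DATA_DIR in the ! commands resolve):

import os
from google.colab import drive

drive.mount('/content/drive')   # mount Google Drive into the Colab VM
CODE_DIR = '/content/drive/MyDrive/Repositories/NVAE'
DATA_DIR = '/content/drive/MyDrive/Repositories/NVAE/dataset_nvae'
os.environ['DATA_DIR'] = DATA_DIR   # so shell lines can use $DATA_DIR
os.chdir(CODE_DIR)                  # run scripts from the repository root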
When I try to run the command

!python create_ffhq_lmdb.py --ffhq_img_path=$DATA_DIR/ffhq/resized/ --ffhq_lmdb_path=$DATA_DIR/ffhq/ffhq-lmdb --split=train

I get the following error message: