csjliang / LPTN

Official implementation of the paper 'High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network' in CVPR 2021
Apache License 2.0

Train on custom unpaired dataset #32

Closed Lauenburg closed 2 years ago

Lauenburg commented 3 years ago

I am currently trying to test the method on a custom unpaired dataset (a maps dataset containing satellite and Google Maps images). I have two folders, testA and testB, each containing multiple images from the satellite or the Google Maps domain, respectively.

I have two issues:

  1. At the start, nothing worked until I made sure that for each file in testA there exists a corresponding file in testB with the same name. Why is this pairing needed? I thought the method works with unpaired data! (See the sketch after my questions for how I expected unpaired sampling to work.)

  2. After training the model for 7.5 hours, the results look as follows (source left, target right, generated image in the middle): result_lptn

What am I missing?
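
For context on point 1, here is roughly how I expected unpaired sampling to work. This is a minimal sketch I wrote for illustration (the class name and arguments are mine, not the repository's UnPairedImageDataset): the two domains are indexed independently and the second image is drawn at random, so filenames never need to match.

```python
import os
import random

from PIL import Image
from torch.utils.data import Dataset

# Minimal illustrative loader, not the repo's UnPairedImageDataset:
# each domain is indexed independently, so filenames never need to match.
class SimpleUnpairedDataset(Dataset):
    def __init__(self, root_a, root_b):
        self.paths_a = sorted(os.path.join(root_a, f) for f in os.listdir(root_a))
        self.paths_b = sorted(os.path.join(root_b, f) for f in os.listdir(root_b))

    def __len__(self):
        return len(self.paths_a)

    def __getitem__(self, idx):
        img_a = Image.open(self.paths_a[idx]).convert('RGB')
        # draw the other domain at random instead of by matching filenames
        img_b = Image.open(random.choice(self.paths_b)).convert('RGB')
        return img_a, img_b
```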

I adapted the train_FiveK.yml as follows:

# general settings
name: Maps
model_type: LPTNModel
num_gpu: 1  # set num_gpu: 0 for cpu mode
manual_seed: 10

# dataset and data loader settings
datasets:
  train:
    name: Maps
    type: UnPairedImageDataset
    # (for disk)
    dataroot_gt: ../../datasets/maps/trainB
    dataroot_lq: ../../datasets/maps/trainA
    # (for lmdb)
#    dataroot_gt: datasets/FiveK/FiveK_train_target.lmdb
#    dataroot_lq: datasets/FiveK/FiveK_train_source.lmdb
    filename_tmpl: '{}'
    io_backend:
      type: disk
#       type: lmdb

    if_fix_size: true # training will be slower if the data shape is not fixed (then both num_gpu and batch_size need to be 1)
    gt_size: 256 # training size
    use_flip: true
    use_rot: true

    # data loader
    use_shuffle: true
    num_worker_per_gpu: 16
    batch_size_per_gpu: 32
    dataset_enlarge_ratio: 100
    prefetch_mode: cuda
    pin_memory: true

  val:
    name: Maps
    type: UnPairedImageDataset
    dataroot_gt: ../../datasets/maps/testB
    dataroot_lq: ../../datasets/maps/testA
#    dataroot_gt: datasets/FiveK/FiveK_test_target.lmdb
#    dataroot_lq: datasets/FiveK/FiveK_test_source.lmdb
    io_backend:
      type: disk
#      type: lmdb

# network structures

**From here on, everything is left the same as in train_FiveK.yml.**
csjliang commented 2 years ago

Sorry for the late reply. The map translation task relies on transforming the high-frequency components of the image, which is much more complicated and needs more parameters/FLOPs. In our paper, the aim is to accelerate translation by avoiding heavy computation on the high-frequency components, so tasks that rely more on low-frequency components, such as illumination and color, are better suited. Please refer to our paper for more information. Thanks!
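
To make the frequency argument concrete, here is a rough illustration (not code from this repository; it just uses OpenCV's pyrDown/pyrUp) of the Laplacian pyramid decomposition that LPTN is built on. The deep translation network only processes the small low-frequency base, while the full-resolution high-frequency residuals receive only lightweight refinement, so translations whose content lives in those residuals, like map structure and edges, are a poor fit.

```python
import cv2
import numpy as np

# Illustration only (not this repository's code): a 3-level Laplacian pyramid.
# img is an HxWxC numpy array; returns 3 high-frequency residuals plus the base.
def laplacian_pyramid(img, num_levels=3):
    pyramid = []
    current = img.astype(np.float32)
    for _ in range(num_levels):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)  # high-frequency residual at this scale
        current = down
    pyramid.append(current)  # low-frequency base, e.g. 64x64 for a 512x512 input
    return pyramid
```

For a 512x512 input, the heavy computation happens only on the 64x64 base, which is where the speedup comes from, and also why tasks dominated by high-frequency structure need more capacity than this design provides.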