Open szymek1 opened 2 years ago
Have you solved this problem? I think issue #54 describes the same problem as yours, so you may want to check it out.
BTW, I'm trying to train on my own dataset as well, and I'm confused about the edge.flist (i.e. what you used in your config): I'm not sure which data I should use in each training stage. Could you please share some tips on it?
Hello, I keep running into this issue: whichever of the available models I try to train, training stops immediately after it starts. I set up an environment which I guess should be fine:
I set the batch size to 1 because I thought the problem might be a batch size that is too big, as I have only 1 GPU. My guess is that on the Colab VM the NVIDIA drivers, CUDA and cuDNN are much newer than what was used back in 2019. Nevertheless, here is my configuration as well as the output. Please help me, guys!
MODE: 1  # 1: train, 2: test, 3: eval
MODEL: 2  # 1: edge model, 2: inpaint model, 3: edge-inpaint model, 4: joint model
MASK: 3  # 1: random block, 2: half, 3: external, 4: (external, random block), 5: (external, random block, half)
EDGE: 1  # 1: canny, 2: external
NMS: 1  # 0: no non-max-suppression, 1: applies non-max-suppression on the external edges by multiplying by Canny
SEED: 10  # random seed
GPU: [0]  # list of gpu ids
DEBUG: 0  # turns on debugging mode
VERBOSE: 1  # turns on verbose mode in the output console

TRAIN_FLIST: xxxx
VAL_FLIST: xxxx
TEST_FLIST: xxxx

TRAIN_EDGE_FLIST: ./datasets/places2_edges_train.flist
VAL_EDGE_FLIST: ./datasets/places2_edges_val.flist
TEST_EDGE_FLIST: ./datasets/places2_edges_test.flist

TRAIN_MASK_FLIST: xxxx
VAL_MASK_FLIST: xxxx
TEST_MASK_FLIST: xxxx

LR: 0.001  # learning rate
D2G_LR: 0.1  # discriminator/generator learning rate ratio
BETA1: 0.0  # adam optimizer beta1
BETA2: 0.9  # adam optimizer beta2
BATCH_SIZE: 1  # input batch size for training
INPUT_SIZE: 256  # input image size for training, 256 for original size
SIGMA: 2  # standard deviation of the Gaussian filter used in Canny edge detector (0: random, -1: no edge)
MAX_ITERS: 2  # maximum number of iterations to train the model

EDGE_THRESHOLD: 0.5  # edge detection threshold
L1_LOSS_WEIGHT: 1  # l1 loss weight
FM_LOSS_WEIGHT: 10  # feature-matching loss weight
STYLE_LOSS_WEIGHT: 250  # style loss weight
CONTENT_LOSS_WEIGHT: 0.1  # perceptual loss weight
INPAINT_ADV_LOSS_WEIGHT: 0.1  # adversarial loss weight

GAN_LOSS: nsgan  # nsgan | lsgan | hinge
GAN_POOL_SIZE: 0  # fake images pool size

SAVE_INTERVAL: 2  # how many iterations to wait before saving model (0: never)
SAMPLE_INTERVAL: 2  # how many iterations to wait before sampling (0: never)
SAMPLE_SIZE: 24  # number of images to sample
EVAL_INTERVAL: 2  # how many iterations to wait before model evaluation (0: never)
LOG_INTERVAL: 1  # how many iterations to wait before logging training status (0: never)
start training...
Training epoch: 1
End training....
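
One thing that may be worth ruling out before blaming the drivers: the configuration above sets MAX_ITERS: 2, which by its own comment is the maximum number of training iterations, so a run that ends right after "Training epoch: 1" could simply have reached that limit. An empty or missing file list could plausibly have a similar effect. The sketch below is only a sanity check under stated assumptions, not part of the repository: the helper names (parse_simple_config, count_flist_entries, find_warnings) are hypothetical, the threshold of 100 iterations is arbitrary, the config is assumed to be flat `KEY: value  # comment` lines, and each flist is assumed to be a plain-text file with one path per line.

```python
import os


def parse_simple_config(text):
    """Parse flat `KEY: value  # comment` lines into a dict of strings.

    Assumption: no nesting and no '#' inside values.
    """
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config


def count_flist_entries(path):
    """Return the number of non-empty lines in a file list, or None if it is not a file."""
    if not os.path.isfile(path):
        return None
    with open(path) as f:
        return sum(1 for line in f if line.strip())


def find_warnings(config, min_iters=100):
    """Collect settings that could make training end almost immediately.

    `min_iters` is an arbitrary threshold chosen for this sketch.
    """
    warnings = []
    # int(float(...)) also accepts values written like "2e6"
    max_iters = int(float(config.get("MAX_ITERS", "0")))
    if max_iters < min_iters:
        warnings.append(
            f"MAX_ITERS is {max_iters}: training ends after that many iterations"
        )
    for key in ("TRAIN_FLIST", "TRAIN_EDGE_FLIST", "TRAIN_MASK_FLIST"):
        path = config.get(key)
        if path is None:
            continue
        entries = count_flist_entries(path)
        if entries is None:
            warnings.append(f"{key}: not a readable file: {path}")
        elif entries == 0:
            warnings.append(f"{key}: file list is empty: {path}")
    return warnings


if __name__ == "__main__":
    sample = "MAX_ITERS: 2  # maximum number of iterations\nBATCH_SIZE: 1\n"
    for message in find_warnings(parse_simple_config(sample)):
        print("warning:", message)
```

To try it on a real run, read the config file into a string, pass it to parse_simple_config, and print whatever find_warnings returns. If nothing is reported, the iteration limit and the file lists are probably not the cause and the environment is the next thing to look at.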