phamnam95 opened 6 years ago
What's your "random range"?
I am using seismic images. After normalizing the data by using (data-mean(data))/std(data), the range is from -10 to 10.
The range is bounded, it seems? Then you can just similarly normalize to the [-1, +1] range, without using the dataset mean and std.
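For example, a minimal sketch of that rescaling, assuming fixed (possibly estimated) bounds; the ±10 here is just a placeholder taken from your description:

```python
import torch

def rescale_to_unit_range(x, lo=-10.0, hi=10.0):
    """Linearly map values from [lo, hi] to [-1, +1]."""
    x = x.clamp(lo, hi)                  # guard against outliers beyond the assumed bounds
    return 2.0 * (x - lo) / (hi - lo) - 1.0

data = torch.randn(1, 256, 256) * 3.0    # stand-in for a normalized seismic slice
scaled = rescale_to_unit_range(data)
print(scaled.min().item(), scaled.max().item())  # stays within [-1, 1]
```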
Is it necessary to use tanh as the activation function in the last layer? Can I use relu, lrelu, or no activation function at all? I tried normalizing the input to [-1, 1], but the result is not as good as normalizing with mean and std and using no tanh activation in the last layer.
tanh makes the result bounded; relu can potentially give you unbounded outputs.
@phamnam95: So do you use tanh in the last layer? What kind of normalization do you prefer? I want to use (data - mean(data)) / std(data), but it seems that PyTorch has no layer to normalize to zero mean and unit variance.
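The closest I found is torchvision's transforms.Normalize, which applies exactly (data - mean) / std per channel; a minimal sketch of how I'd use it, with the statistics computed from the training split only (the variable names are made up):

```python
import torch
from torchvision import transforms

# Compute dataset statistics once, from the training split only
train_tensor = torch.randn(100, 1, 64, 64)           # stand-in for the training data
mean = train_tensor.mean().item()
std = train_tensor.std().item()

# transforms.Normalize performs (x - mean) / std channel-wise on a (C, H, W) tensor
normalize = transforms.Normalize(mean=[mean], std=[std])
sample = normalize(train_tensor[0])
print(sample.mean().item(), sample.std().item())     # roughly 0 and 1
```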
Here is my data: it is 2D, not images, and each feature has a different range. If I did not rescale my data, then G_loss was usually at 1000 or more. After rescaling my data into [-1, 1], G_loss went down to 0.8, and D_loss stayed mostly at 0.5. Here are my questions:
1. It seems the bias between generated data and ground truth is sensitive to the range. However, unlike images, where I know the range is exactly [0, 255], the range of my data comes from a sampling approximation. So, should I concentrate on getting a better approximation of the range?
2. Is D_loss staying mostly at 0.5 a good result?
3. Can you give me more advice for this kind of data?
Thanks anyway. @SsnL
Generators in this repo are designed for data in the range [-1, 1], as you can see from the tanh at the end, so data with a larger range will definitely fail. Moreover, it is generally a good idea to normalize data in DL, as it empirically stabilizes training.
In GAN training, D loss is often not a very good metric for quality. You should look at the results and see how they are, or evaluate them with domain-specific metrics if you have any.
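Schematically, the end of the generator that enforces this bound looks like the following (paraphrased from networks.py, with illustrative channel sizes, not a verbatim copy):

```python
import torch.nn as nn

# Last layers of a ResNet-style generator: the final Tanh bounds outputs to [-1, 1]
head = nn.Sequential(
    nn.ReflectionPad2d(3),
    nn.Conv2d(64, 3, kernel_size=7, padding=0),  # ngf=64 -> output_nc=3, illustrative
    nn.Tanh(),
)
```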
Thanks. I used CycleGAN and chose to save the model with the smaller X2Y and Y2X generator losses, where generator loss = adversarial loss + cycle loss + identity loss. However, I found that my identity loss always converges to 1, and so does my cycle loss.
The shape of my X data is 13 and the shape of Y is 203; to build the generators easily, I expanded my X by copying it 20 times, so the variety in X is smaller than that in Y.
I tried giving the discriminators soft labels (recommended here: https://github.com/soumith/ganhacks, item 6), but the generators did not improve.
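For reference, a minimal sketch of the soft-label trick I tried (one-sided label smoothing with 0.9 for real labels; the function and variable names are just illustrative):

```python
import torch
import torch.nn.functional as F

def d_loss_soft(d_real_logits, d_fake_logits, real_label=0.9):
    """Discriminator BCE loss with one-sided label smoothing (ganhacks item 6)."""
    real_targets = torch.full_like(d_real_logits, real_label)  # soft label instead of 1.0
    fake_targets = torch.zeros_like(d_fake_logits)             # fake labels stay at 0.0
    loss_real = F.binary_cross_entropy_with_logits(d_real_logits, real_targets)
    loss_fake = F.binary_cross_entropy_with_logits(d_fake_logits, fake_targets)
    return 0.5 * (loss_real + loss_fake)
```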
Any tips for me? @SsnL
> Generators in this repo are designed for data in the range [-1, 1], as you can see from the tanh at the end, so data with a larger range will definitely fail. Moreover, it is generally a good idea to normalize data in DL, as it empirically stabilizes training.
@SsnL How would you suggest normalizing the data to the range [-1, 1], and "de-normalizing" it back to the original values during testing? I'm currently doing `normalized_data = (data - data_avg) / data_avg`, where `data_avg = (data.min() + data.max()) / 2`, to normalize the values to [-1, 1]. But because my data type doesn't have a fixed minimum and maximum value, I just take the minimum and maximum of the training dataset to determine the average. Is this correct? Or should I be normalizing the data based on each individual image's minimum and maximum values? Also, how can I "de-normalize" the values back to the original?
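Here's a sketch of the scheme I have in mind, with the algebraic inverse for de-normalization. Note that dividing by `data_avg` lands exactly on [-1, 1] only when the minimum is 0, as with [0, 255] images, so this sketch divides by the half-range instead (names are illustrative):

```python
import torch

def fit_stats(train_data):
    """Estimate center and half-range from the training set only."""
    lo, hi = train_data.min(), train_data.max()
    center = (lo + hi) / 2
    half_range = (hi - lo) / 2
    return center, half_range

def normalize(data, center, half_range):
    return (data - center) / half_range   # maps [min, max] -> [-1, +1]

def denormalize(norm, center, half_range):
    return norm * half_range + center     # exact algebraic inverse

train = torch.rand(1000) * 50 + 10        # stand-in data in an arbitrary range
c, h = fit_stats(train)
x = normalize(train, c, h)
assert torch.allclose(denormalize(x, c, h), train)
```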
Hello. I am training CycleGAN with 2D data, but the range is not from 0 to 255 as in image data. I saw in the code that the data is normalized by (data - 127.5) / 127.5 to lie between -1 and 1. The output of the generator is passed through a tanh activation to stay between -1 and 1, and the result is converted back to RGB by (output * 127.5) + 127.5. I am wondering, if I train with 2D data in a random range, how can I train and output the result? Thanks.