Closed LPZliu closed 9 months ago
Thank you for bringing up the issue. I would appreciate it if you wrote in English, since I can't read Chinese. When I trained the model, at epoch 200 the loss had decreased to 0.37, and the validation accuracies were as follows:
- Val PSNR: 42.004
- Val Crop Acc: 0.989
- Val Scale Acc: 0.981
- Val MJPEG Acc: 0.993

At the moment, I'm afraid I can't help address the problem, since it needs further investigation. If you prefer, send me your email address so I can send you the pretrained model, which was trained with 32-bit data.
Thanks! My email: kevin_ailover@163.com
This is my question. I don't know how to address it. Can you help me?
Can you provide more details about the training parameters, such as data_dim, seq_len, batch_size, learning rate, etc.?
Could you adjust the batch size to 12 and retrain the model? Given the learning rate, setting the batch size too high could interfere with the model's convergence. If you want to set the batch size to 64, try increasing the learning rate.
Note: I used a batch size of 12 with a learning rate of 0.0005 during my training.
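To make the batch-size/learning-rate relationship concrete, here is a minimal sketch of the linear-scaling heuristic implied above. The helper name and the linear rule itself are illustrative assumptions, not part of this repository; only the reference point (batch size 12, learning rate 0.0005) comes from the comment above.

```python
# Illustrative helper: scale the learning rate linearly with the batch size,
# anchored at the reference settings from the comment above (batch_size=12,
# lr=0.0005). The linear-scaling rule is a common heuristic, not something
# this repo prescribes; treat the result as a starting point, not a recipe.
def scaled_lr(batch_size, base_batch_size=12, base_lr=0.0005):
    return base_lr * batch_size / base_batch_size

print(scaled_lr(12))  # 0.0005   -> the setting used in this training run
print(scaled_lr(64))  # ~0.00267 -> a starting point for batch size 64
```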
In fact, I've already tried that. I modified not only the batch size but also the learning rate, but it didn't solve the problem.
The quick solution I can offer is the official code below. Try replacing the current make_pair function with this one:
```python
def make_pair(frames, data_dim, use_bit_inverse=True, multiplicity=1):
    # Add multiplicity to further stabilize training.
    frames = torch.cat([frames] * multiplicity, dim=0).cuda()
    data = torch.zeros((frames.size(0), data_dim)).random_(0, 2).cuda()
    # Add the bit-inverse to stabilize training.
    if use_bit_inverse:
        frames = torch.cat([frames, frames], dim=0).cuda()
        data = torch.cat([data, 1.0 - data], dim=0).cuda()
    return frames, data
```
I modified this function because when the hidden data is 32 bits of all 0s or all 1s, the model couldn't retrieve the data well (i.e., low accuracy). So I had the model train on each bit pattern together with its inverse, which increased the accuracy.
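For reference, a minimal sanity check of the modified make_pair. This is illustrative only: it assumes a CUDA device is available (the function calls .cuda()), and the frame tensor shape is a placeholder since the real shape depends on the data loader.

```python
import torch

# Dummy batch of 4 "frames"; (N, C, H, W) is only a placeholder shape here.
frames = torch.randn(4, 3, 128, 128)

frames_out, data = make_pair(frames, data_dim=32, use_bit_inverse=True)

# With use_bit_inverse=True the batch is doubled, and the second half of
# `data` is the bit-inverse of the first half, so paired rows sum to 1.
assert frames_out.size(0) == 2 * frames.size(0)
assert torch.all(data[:4] + data[4:] == 1.0)
```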
Thanks for the reply. Can you please send me all the files generated by your training, such as metrics.tsv?
Sure, I will send it via email.
These are the files generated during training.
Sorry, I didn't see the attached file. Can you send me a new copy? Thanks.
Thanks for your help, I've solved the problem. But I'd still like to discuss some details with you.
Since the problem has been solved, I will close this issue.
I ran the model, downloaded the data, and trained for 300 epochs, but the loss stayed at 1.3 and the accuracy was low, only around 0.6. How should I tune things to fix this? Even after changing the learning rate, the loss wouldn't come down.