ghost opened this issue 2 years ago
@Woodley-Griffith Nice suggestion, and I also agree with what you're saying. Is there any update on this issue? I cannot find a commit about it.
@sunutf Not updated or tested yet.
@Woodley-Griffith Hi, thank you very much for finding this problem in my code.

1. The function `torchgeometry.warp_perspective` transforms the source image using the specified transformation matrix. It seems that small fluctuations can make the transformed result overflow the range [0, 1]. I agree that it is necessary to add a `torch.clamp`, which is more careful and reasonable. Merge requests to make the code work better are very welcome.
2. I don't remember ever encountering this situation, which is strange. So, according to what you say, sometimes training succeeds, and sometimes the bit accuracy fluctuates around 0.5 and training fails? Perhaps the initialization of the model weights is to blame. I think your solution can fix the problem temporarily, but a more reasonable solution still needs to be found.
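For reference, a minimal sketch of the fix being discussed, assuming images normalized to [0, 1] and a warp done via `torchgeometry.warp_perspective` (the helper name `warp_and_clamp` is illustrative, not from the repository):

```python
import torch
import torchgeometry as tgm

def warp_and_clamp(image: torch.Tensor, M: torch.Tensor) -> torch.Tensor:
    """image: (B, C, H, W) in [0, 1]; M: (B, 3, 3) perspective matrix."""
    h, w = image.shape[-2:]
    warped = tgm.warp_perspective(image, M, dsize=(h, w))
    # Clamp so small overshoots from the warp cannot push pixel
    # values outside the valid [0, 1] range before decoding.
    return torch.clamp(warped, 0.0, 1.0)
```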
The accuracy problem is directly related to the weight initialization and the spatial transformer block. I just removed them, and it works well. However, we still need to deal with the warping.
@lschirmer Hi there, your reply was very helpful. Just to clarify, did you mean you removed the entire spatial transformer block, like removing this?
```python
# in class StegaStampDecoder(nn.Module):
self.stn = SpatialTransformerNetwork()
...
transformed_image = self.stn(image)
```
The original paper seems to emphasize using a spatial transformer: "A spatial transformer network [24] is used to develop robustness against small perspective changes that are introduced while capturing and rectifying the encoded image."
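For context, here is a minimal sketch of what a typical spatial transformer block looks like in PyTorch. Only the class name comes from the snippet above; the architecture here is my own assumption, not the repository's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTransformerNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        # Localization network: predicts the 6 parameters of a 2x3 affine matrix.
        self.localization = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(10 * 4 * 4, 32), nn.ReLU(),
            nn.Linear(32, 6),
        )
        # Initialize the last layer to the identity transform so the block
        # starts as a no-op; a random init here can scramble the image.
        self.localization[-1].weight.data.zero_()
        self.localization[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.localization(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```

If the repository's version lacks such an identity initialization, that might explain both the unstable bit accuracy and why removing the block helps, consistent with the weight-initialization point above.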
Hello, I've been studying your code these days. I am very grateful to you because the code has been very helpful to me, but I found a small problem in the training process:

model.py
https://github.com/JisongXie/StegaStamp_pytorch/blob/184642e61d2fa95bcaf4131e64963202267c875d/model.py#L264-L290

These lines can make the pixel values of `encoded_image` exceed the boundary (0, 1), which makes the image show noise when displayed. Since training already allows a certain amount of color-gamut distortion in the encoded image, I think it is necessary to add a `torch.clamp` statement to bring the values back within (0, 1) before the image is fed into the decoder network. If you agree, I would like to submit a merge request for this.
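To make the proposed placement concrete, a hedged sketch of a training-step excerpt; the `training_step`, `encoder`, `decoder`, and `secret` names and the step structure are illustrative assumptions, not the repository's actual code:

```python
import torch

def training_step(encoder, decoder, image, secret):
    # Hypothetical excerpt: encoder/decoder stand in for the StegaStamp modules.
    encoded_image = encoder(image, secret)
    # The encoder's residual (and any subsequent warping) can push pixel
    # values outside (0, 1); clamp before the image enters the decoder.
    encoded_image = torch.clamp(encoded_image, 0.0, 1.0)
    decoded_secret = decoder(encoded_image)
    return encoded_image, decoded_secret
```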