tancik / StegaStamp

Invisible Hyperlinks in Physical Photographs
http://www.matthewtancik.com/stegastamp
MIT License

how to improve str_acc #12

Closed wuyang-dl closed 4 years ago

wuyang-dl commented 4 years ago

hi, can you give me some advice on the following?

1. How can I improve str_acc? I use Win10 + TensorFlow on CPU to train a model with the same image dataset you provided. [screenshot] Perhaps training has not yet reached the maximum number of iterations (140k; currently it is at about 19k).

2. Are the parameters in parser.add_argument (in train.py) correct? I need some details about these parameters to make sure I am training the model the right way.

3. Can you explain the ideas behind the TensorFlow encoder/decoder/discriminator models? Why do we need these 10 or 5 CNN layers, and why do these models achieve good performance?

4. Can you explain the BCH code? In the code you use BCH_POLYNOMIAL = 137 and BCH_BITS = 5. This BCH code needs 127 bits (92 of which are data bits available for hiding information), but in the paper you use 100 bits (96 bits + 4 bits), which recover 56 bits of data.

many thanks!

tancik commented 4 years ago
  1. I have not tested with win10 + tf cpu. The model is designed to be trained on a GPU; using a CPU may take a very long time. To match the performance in the paper, the model needs to be trained for 140k iterations.

  2. The parameters listed here match the paper.

  3. To learn more about the design decisions, take a look at the corresponding research paper here.

  4. In the provided code we use 100 bits to hide 56 bits. The related code is here. It is possible to encode longer messages with these BCH parameters. Given that our StegaStamps were trained to hide 100 bits, these were the best BCH parameters we could find. In retrospect we should have trained the algorithm to hide 127 bits (as you mention above) for better efficiency.
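As a concrete illustration of point 4, here is a rough sketch of how 56 bits of payload plus BCH parity can be packed into a 100-bit secret. It assumes the bchlib package with the older BCH(polynomial, t) constructor (newer bchlib releases changed the signature), and the 7-byte data / 5-byte parity split is inferred from the 96 + 4 bit figure above rather than quoted from the repository:

```python
import bchlib

BCH_POLYNOMIAL = 137   # primitive polynomial over GF(2^7), so codeword length n = 127
BCH_BITS = 5           # t = 5: up to 5 bit errors per message can be corrected

# Older bchlib releases take (polynomial, t); newer ones changed the constructor.
bch = bchlib.BCH(BCH_POLYNOMIAL, BCH_BITS)

secret = 'Hello'                                      # up to 7 ASCII characters = 56 bits
data = bytearray(secret + ' ' * (7 - len(secret)), 'utf-8')
ecc = bch.encode(data)                                # 5 parity bytes for t = 5, m = 7

packet = data + ecc                                   # 7 + 5 bytes = 96 bits
bits = [int(b) for byte in packet for b in format(byte, '08b')]
bits.extend([0, 0, 0, 0])                             # pad to the 100-bit secret length
assert len(bits) == 100
```

At decode time the same BCH code can repair up to 5 flipped bits before the 7 payload bytes are read back out.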

wuyang-dl commented 4 years ago

Thanks! I had an older version of your paper; the newer version you linked is really helpful. I will take a deeper look at it.

P.S. Do you have any ideas about how to encode document images? I find it really difficult to hide information in them (I think document images are too plain, so the information is not easy to encode or decode). The accuracy of decoding document images is really low.

Thanks again!

tancik commented 4 years ago

When you say "doc images" are you referring to text documents? If so I would not expect this method to perform well in this situation. The method works best with natural images where the data can be hidden in the textures.

wuyang-dl commented 4 years ago

ok, thanks

zimenglan-sysu-512 commented 3 years ago

hi @tancik, I trained a model for 140k iterations and find that the bit_acc can reach 80%~90%, but the str_acc only gets to 30%~40%. Is that normal? [screenshot]

tancik commented 3 years ago

That seems reasonable. Keep in mind that the string accuracy reported in the tensorboard does not use error correcting codes.
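To make the distinction concrete, here is a small illustrative sketch (hypothetical variable names, not the repository's TensorBoard code) of how raw bit accuracy and exact-match string accuracy relate:

```python
import numpy as np

def raw_accuracies(pred_bits, true_bits):
    """pred_bits, true_bits: arrays of shape (batch, 100) with entries in {0, 1}."""
    correct = (pred_bits == true_bits)
    bit_acc = correct.mean()              # fraction of individual bits recovered
    str_acc = correct.all(axis=1).mean()  # fraction of messages with every bit correct
    return bit_acc, str_acc
```

With ~90% per-bit accuracy, an exact 100-bit match would be vanishingly rare if the errors were independent (0.9^100 is essentially zero), so a raw str_acc of 30%~40% just means the errors are concentrated in a minority of hard images. At inference time the BCH code (t = 5) additionally repairs up to 5 flipped bits per message, so the effective string recovery rate is higher than this raw number.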

zimenglan-sysu-512 commented 3 years ago

thanks @tancik. btw, if I train on a larger dataset, like MIRFLICKR (1M images), how should I change the training settings, e.g., batch size, learning rate, number of steps, and each ramp?

tancik commented 3 years ago

I haven't tried a larger dataset, but my guess is that the same parameters would work fine.
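For reference, a "ramp" here is a linear warm-up on a loss weight. A minimal sketch (the names below are illustrative, not necessarily the exact arguments in train.py):

```python
def ramped_weight(step, final_scale, ramp_steps):
    """Grow a loss weight linearly from 0 to final_scale over the first ramp_steps steps."""
    return min(final_scale * step / ramp_steps, final_scale)

# If num_steps is scaled from 140k to 1.4M for a ~100x larger dataset, scaling the
# ramp lengths by the same factor keeps the schedule's shape relative to total
# training unchanged (an assumption, not something verified here).
```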

zimenglan-sysu-512 commented 3 years ago

ok, thanks.

zimenglan-sysu-512 commented 3 years ago

hi @tancik, I trained on MIRFLICKR (1M images) with the same settings as the paper, except for changing the number of steps from 140k to 1.4M. The result is below: [screenshot] It seems that training a model with a larger dataset and longer iterations is better.

tancik commented 3 years ago

Very interesting! Good to know.