reinderien / mimic

[ab]using Unicode to create tragedy
MIT License

Added basic steganography support #41

Open grdaneault opened 8 years ago

grdaneault commented 8 years ago

This implements the steganography feature described in #28. It works as follows:

  1. The file to encode (specified by --encode) is read and converted to a bit stream
  2. During the mimicking process, each mimicked character represents one or more bits, depending on the number of replacement options.
    • If there are 2 replacement options, one bit of information can be encoded in the character.
      • 0 is represented by the first option
      • 1 is represented by the second
    • If there are 3, still only one bit can be FULLY encoded.
    • With 4, 2 bits of data can be encoded, and so on.
      • 00 is represented by the first
      • 10 is represented by the third
      • ...
    • The number of bits that can be represented is int(log(len(options), 2))
    • There must be at least two options, otherwise no bits can be encoded.
      • In this case the original character is passed through
  3. Each bit from the encode file is put into the output using this method
  4. The end of the data is marked by a character that is outside the normal encoding range.
    • If there are 3 replacements, then the 3rd would be used as it could not be used to otherwise represent a bit. The first two options are used to represent a 0 and a 1, but the third option cannot be used to encode data.
    • For 6 replacements, either the 5th or the 6th could be used since either would be otherwise unused
    • If the number of replacements is an exact power of two (2, 4, 8, ...), every option already encodes data, so the original character is passed through and the next mimic attempt will include the stop character
  5. After all the input data has been encoded and a stop character has been inserted, replacements go back to being made by random chance
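
As a rough illustration of steps 2 to 4, here is a minimal sketch in Python. It is not the code from this pull request: the OPTIONS table, the function names, and the zero-padding of a short final chunk are all assumptions made for the example. It also assumes the original character is not one of its own replacement options, and that each replacement glyph belongs to exactly one original character, so a decoder can tell them apart. After the stop character it simply leaves the remaining text untouched instead of resuming random replacement.

```python
# Hypothetical homoglyph table for illustration only (not mimic's real data).
OPTIONS = {
    "a": ["\u0430", "\u0251"],                      # 2 options: 1 bit, no spare
    "o": ["\u043e", "\u03bf", "\u0585"],            # 3 options: 1 bit + 1 spare
    "c": ["\u0441", "\u03f2", "\u217d", "\uff43"],  # 4 options: 2 bits, no spare
}


def bits_per_char(n_options):
    """Whole bits per character; same value as int(log(n, 2)) without floats."""
    return n_options.bit_length() - 1 if n_options >= 2 else 0


def encode(text, options, bits):
    """Hide the bit string `bits` in `text`, then emit one stop character."""
    out, pos, stopped = [], 0, False
    for ch in text:
        opts = options.get(ch, [])
        k = bits_per_char(len(opts))
        if stopped or k == 0:
            out.append(ch)                       # nothing to encode here
        elif pos < len(bits):
            # Assumption: a short final chunk is padded with zeros.
            chunk = bits[pos:pos + k].ljust(k, "0")
            out.append(opts[int(chunk, 2)])      # option index = chunk value
            pos += k
        elif len(opts) > 2 ** k:
            out.append(opts[2 ** k])             # first unused option = stop
            stopped = True
        else:
            out.append(ch)                       # power of two: no spare option
    if not stopped:
        raise ValueError("cover text too short for payload plus stop marker")
    return "".join(out)


def decode(stego, options):
    """Recover the bit string up to (not including) the stop character."""
    reverse = {}
    for opts in options.values():
        k = bits_per_char(len(opts))
        if k:
            for index, glyph in enumerate(opts):
                reverse[glyph] = (index, k)
    bits = []
    for ch in stego:
        if ch not in reverse:
            continue                             # passed-through character
        index, k = reverse[ch]
        if index >= 2 ** k:
            break                                # stop character reached
        bits.append(format(index, "0{}b".format(k)))
    return "".join(bits)


hidden = encode("cocoa cocoa", OPTIONS, "10110")
recovered = decode(hidden, OPTIONS)
```

Here the first "c" carries 10, the first "o" carries 1, the second "c" carries 10, and the second "o" (three options) becomes the stop character. Because of the padding assumption, a payload whose length does not line up with the per-character bit counts comes back with trailing zero bits, which a real decoder would need to trim (for example, to a whole number of bytes).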

This method is compatible with the --me-harder option (which is, in fact, likely necessary in order to hide information of any substantial size).

In addition, this change adds support for mimicking files passed in with the --source option rather than on stdin. The tests have also been updated to use nose, so they can be run using python setup.py test

reinderien commented 8 years ago

I have some concerns about this pull request, but I have to read more into it. The proper way to translate the input bit stream to the mimic 'options' is using a range encoder, which is not what's being done here. I suspect that the bit method here will not be as efficient as a range encoder.
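
To put a rough number on the efficiency concern, the sketch below compares the two upper bounds. A character with n equally likely options can in principle carry log2(n) bits, which is the limit a range coder approaches, while the whole-bit method carries int(log(n, 2)) bits. The function names and the option counts are made up for illustration, and the cost of the stop marker is ignored.

```python
import math


def whole_bit_capacity(option_counts):
    """Bits carried when each character holds a whole number of bits."""
    return sum(n.bit_length() - 1 for n in option_counts if n >= 2)


def ideal_capacity(option_counts):
    """Upper bound in bits when every option of every character is usable."""
    return sum(math.log2(n) for n in option_counts if n >= 2)


# Hypothetical option counts for six consecutive replaceable characters.
counts = [3, 3, 3, 5, 6, 7]
whole = whole_bit_capacity(counts)   # 1 + 1 + 1 + 2 + 2 + 2
ideal = ideal_capacity(counts)       # log2(3 * 3 * 3 * 5 * 6 * 7)
```

For these counts the whole-bit method carries 9 bits, while the ideal bound is a little over 12 bits. The gap disappears only when every option count is an exact power of two.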

fionafibration commented 5 years ago

.gitignore should probably be included in another PR