nccgroup / featherduster

An automated, modular cryptanalysis tool; i.e., a Weapon of Math Destruction
BSD 3-Clause "New" or "Revised" License

Vigenere module is broken #60

Closed faneca closed 7 years ago

faneca commented 7 years ago

I tried cloning the repo and running a simple autopwn on some samples. Every time, I get this:

(...)
Running module: vigenere
Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/ishell/console.py", line 102, in loop
    self.walk_and_run(input_)
  File "build/bdist.linux-x86_64/egg/ishell/console.py", line 69, in walk_and_run
    self.walk(command, 0, run=True, full_line=command)
  File "build/bdist.linux-x86_64/egg/ishell/console.py", line 54, in walk
    return cmd.complete(line_commands[1:], buf, state, run, full_line)
  File "build/bdist.linux-x86_64/egg/ishell/command.py", line 69, in complete
    return self.run(full_line.rstrip())
  File "featherduster.py", line 236, in run
    print feathermodules.module_list[attack]['attack_function'](feathermodules.samples)
  File "/home/roman/bin/packages/re/featherduster/feathermodules/classical/vigenere.py", line 16, in break_vigenere
    coefficient_word_count=float(options['coefficient_word_count']))
  File "/home/roman/bin/packages/re/featherduster/cryptanalib/classical.py", line 325, in break_vigenere
    key_lengths = evaluate_vigenere_key_length(ciphertext, scan_range)[:num_key_lengths]
  File "/home/roman/bin/packages/re/featherduster/cryptanalib/classical.py", line 237, in evaluate_vigenere_key_length
    key_length_best_guesses = map(list, zip(*ioc_best_guesses))[0]
IndexError: list index out of range

unicornsasfuel commented 7 years ago

@faneca Can you please attach the samples and show your workflow to reach this error? I've reproduced this error with a short ciphertext and I have a fix, but I'd like to make sure that your particular issue is addressed by this fix.

I intend to put checks in the vigenere module that prevent these out-of-range indices from being used, as well as requiring the ciphertext to be of a certain length, because auto-solving vigenere ciphertexts smaller than a certain length becomes statistically VERY unlikely.
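To illustrate the kind of check described above, here is a minimal Python 3 sketch of the failing expression from the traceback and a guarded alternative. It assumes `ioc_best_guesses` is a list of `(key_length, score)` pairs, which the traceback suggests but does not confirm; `best_key_lengths` and `unguarded` are hypothetical names, not FeatherDuster functions.

```python
def unguarded(ioc_best_guesses):
    # Python 3 spelling of the expression shown in the traceback.
    # With an empty input, zip(*[]) yields nothing, so [0] raises IndexError.
    return list(map(list, zip(*ioc_best_guesses)))[0]


def best_key_lengths(ioc_best_guesses):
    """Return the key lengths from (key_length, score) pairs, or [] if none."""
    if not ioc_best_guesses:
        # Nothing to transpose, so report "no candidates" instead of indexing.
        return []
    key_lengths, _scores = zip(*ioc_best_guesses)
    return list(key_lengths)


print(best_key_lengths([(4, 0.066), (8, 0.061)]))
print(best_key_lengths([]))
```

Returning an empty list keeps the later `[:num_key_lengths]` slice valid, since slicing an empty list is harmless where indexing is not.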

unicornsasfuel commented 7 years ago

@faneca A proposed fix is pushed. If you do not want to reveal the samples you used (if they're sensitive in nature), you can pull the fix and let me know if it addresses your issue.

If I don't hear back from you in a week or two, I'll assume that this fixes your issue and close this.

Thanks for reporting bugs! :)

faneca commented 7 years ago

Yep, I'd rather not reveal the samples to avoid any repercussions, but they were pretty short (23 and 64 bytes were the most common lengths) so it's very likely that was exactly the issue.

For some reason, when I try to reproduce the bug again (to be clear, I haven't updated my local repository yet), I'm getting a different result: Vigenère is no longer among the modules selected by autopwn O_o; My guess is I'm picking different samples than the ones used the other day (but that's weird anyway, IIRC that shouldn't be the case). I'll try to successfully replicate the issue and then fetch the proposed fix to see if it makes any difference, but if you don't hear from me in the next few days, please go ahead and consider this closed. Thanks for your time!

unicornsasfuel commented 7 years ago

Can you post the workflow you used to import the samples? I'm afraid that my tests only reproduced this issue when the ciphertext was <= 10 chars long, so I think there may be something else at play here.

faneca commented 7 years ago

Ok, using the right sample, it breaks as described (I was indeed trying to reproduce the error with the wrong samples b/c of a dumb mistake).

The problem is that the analyze phase interprets the sample as a base64-encoded one, so a 24-character printable string gets decoded into an 18-byte binary blob (which is then handed to the vigenere module at some point).

I managed to create a reproducible case with a non-sensitive sample: it was actually created ad-hoc just by monkey-typing and then making some easy adjustments (a simpler, more trivial sample -say, 24 A's- wouldn't trigger the base64 decoding): 0ajsdfoqnfsp0aa8j3sdkjaA
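The effect of the decoding step on this sample can be checked outside FeatherDuster with the standard library. This is only a sketch (Python 3): the filter for ASCII letters and digits is an assumption about what a classical solver could make use of, not FeatherDuster's own code.

```python
import base64
import string

sample = "0ajsdfoqnfsp0aa8j3sdkjaA"
raw = base64.b64decode(sample)

# Every 4 Base64 characters carry 3 bytes, so 24 characters become 18 bytes.
print(len(sample), len(raw))  # prints: 24 18

# Keep only ASCII letters and digits to see how little text-like material remains.
ascii_alnum = set((string.ascii_letters + string.digits).encode("ascii"))
survivors = bytes(b for b in raw if b in ascii_alnum).decode("ascii")
print(repr(survivors))  # prints: 'u6'
```

Most of the decoded bytes fall outside the printable ASCII range, so almost nothing is left for a letter-based attack.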

Just importing it and launching autopwn would be enough to trigger the processing sequence described (base64 -> vigenere). This would be the one-liner to reproduce the bug:

tee >(python featherduster.py <(echo -n '0ajsdfoqnfsp0aa8j3sdkjaA')) <<<'autopwn'

But to avoid wasting some CPU cycles you can replace the autopwn command with analyze, use vigenere and run. Here you have the complete session:

$ python featherduster.py <(echo -n '0ajsdfoqnfsp0aa8j3sdkjaA')

Welcome to FeatherDuster!

To get started, use 'import' to load samples.
Then, use 'analyze' to analyze/decode samples and get attack recommendations.
Next, run the 'use' command to select an attack module.
Finally, use 'run' to run the attack and see its output.

For a command reference, press Enter on a blank line.

FeatherDuster> analyze
[+] Analyzing samples...
[+] Messages appear to be Base64 encoded, Base64 decoding and analyzing again.
[+] Messages may be encrypted with a stream cipher or simple XOR.
[!] Individual messages have failed statistical tests for randomness.
[!] This suggests weak crypto is in use.
[!] Consider running single-byte or multi-byte XOR solvers.

[+] Suggested modules:
   alpha_shift          - A brute force attack against an alphabetic shift cipher. 
   base_n_solver        - A solver for silly base-N encoding obfuscation.          
   single_byte_xor      - A brute force attack against single-byte XOR encrypted ciphertext.
   multi_byte_xor       - A brute force attack against multi-byte XOR encrypted ciphertext.
   many_time_pad        - A statistical attack against keystream reuse in various stream ciphers.
   vigenere             - A module to break vigenere ciphers using index of coincidence for key length detection and frequency analysis.

FeatherDuster> use vigenere

FeatherDuster> run
Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/ishell/console.py", line 102, in loop
    self.walk_and_run(input_)
  File "build/bdist.linux-x86_64/egg/ishell/console.py", line 69, in walk_and_run
    self.walk(command, 0, run=True, full_line=command)
  File "build/bdist.linux-x86_64/egg/ishell/console.py", line 54, in walk
    return cmd.complete(line_commands[1:], buf, state, run, full_line)
  File "build/bdist.linux-x86_64/egg/ishell/command.py", line 69, in complete
    return self.run(full_line.rstrip())
  File "featherduster.py", line 288, in run
    feathermodules.results = feathermodules.selected_attack['attack_function'](feathermodules.samples)
  File "/home/roman/bin/packages/re/featherduster/feathermodules/classical/vigenere.py", line 16, in break_vigenere
    coefficient_word_count=float(options['coefficient_word_count']))
  File "/home/roman/bin/packages/re/featherduster/cryptanalib/classical.py", line 325, in break_vigenere
    key_lengths = evaluate_vigenere_key_length(ciphertext, scan_range)[:num_key_lengths]
  File "/home/roman/bin/packages/re/featherduster/cryptanalib/classical.py", line 231, in evaluate_vigenere_key_length
    ioc_median = ioc_median[len(ioc_list)/2]
IndexError: list index out of range

$ 

I'm now going to fetch the new code and see what happens. I'll keep you informed.

faneca commented 7 years ago

Ok, now I'm getting this:

...
Running module: vigenere
[*] Skipping sample, too short to solve statistically
[*] Module execution failed, please report this issue at https://github.com/nccgroup/featherduster/issues
...

unicornsasfuel commented 7 years ago

Hmm, yeah. I added a try/except to attempt to stop module errors from crashing FD, but I don't like that it suppresses the traceback entirely... I'll have to address that. The vigenere module seems to fail to handle short ciphertexts correctly. It strips non-alphanumeric characters before processing, and after base64 decoding and stripping, the ciphertext 0ajsdfoqnfsp0aa8j3sdkjaA you provided seems to boil down to u6. The patch seems to partially address this, noting that the sample is too short to be properly processed, but then seems to cause some exception regardless, resulting in the "module execution failed" message, which I am now cursing for being so generic.
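One way to keep the console alive without losing the diagnostic is to print the formatted traceback inside the handler. The sketch below is Python 3 and uses hypothetical names (`run_module`, `broken_module`); it is not the FeatherDuster code, only an illustration of the standard-library `traceback.format_exc()` call.

```python
import traceback


def run_module(attack_function, samples):
    """Run a module and report failures without hiding the traceback."""
    try:
        return attack_function(samples)
    except Exception:
        # format_exc() returns the traceback of the exception being handled.
        print("[*] Module execution failed:")
        print(traceback.format_exc())
        return None


def broken_module(samples):
    # Stand-in for a module bug: indexing an empty list.
    return [][0]


result = run_module(broken_module, ["u6"])
print(result)
```

The caller still gets a neutral `None`, but the user sees which line raised, which makes a generic failure message much easier to act on.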

global4g commented 7 years ago

I just cloned the latest version and ran it with the same ciphertext. I was able to see the key and did not get any error.

$ python featherduster.py <(echo -n '0ajsdfoqnfsp0aa8j3sdkjaA')

Welcome to FeatherDuster!

To get started, use 'import' to load samples.
Then, use 'analyze' to analyze/decode samples and get attack recommendations.
Next, run the 'use' command to select an attack module.
Finally, use 'run' to run the attack and see its output.

For a command reference, press Enter on a blank line.

FeatherDuster> use vigenere

FeatherDuster> run
Key found for sample 1: "SWEA". Decrypts to: 0inodnsmnnwl0ai8n3odsnwA

Secondly, I ran it with another ciphertext and got a result; however, it doesn't seem to be correct.

$ python featherduster.py <(echo -n 'p xasc. a zdmik qtng. yiy uist. easc os iye iq trmkbumk. gwv wolnrg kaqcs vi rlr.')

Welcome to FeatherDuster!

To get started, use 'import' to load samples.
Then, use 'analyze' to analyze/decode samples and get attack recommendations.
Next, run the 'use' command to select an attack module.
Finally, use 'run' to run the attack and see its output.

For a command reference, press Enter on a blank line.

FeatherDuster> use vigenere

FeatherDuster> run
Key found for sample 1: "TGAKJGK". Decrypts to: w rait. u pkgia hndn. sio lcia. yait ii pse yh nhtebkde. wdp wechhn eagtm lp llh.

When I use the same ciphertext on this website https://www.guballa.de/vigenere-solve, it shows the plaintext as

"a game. a movie star. his wife. name of the cs textbook. the winner takes it all."

Let me know if I'm missing something and/or if I could be of any help.

faneca commented 7 years ago

You forgot to run analyze before use vigenere (thus triggering the base64 decoding; by using autopwn, this happens automatically).

global4g commented 7 years ago

OK, after running "analyze" I got the error trap: [*] Skipping sample, too short to solve statistically

My other question still remains.

unicornsasfuel commented 7 years ago

Okay, looks like the mistake was a very silly one: the return value of False was causing issues because the vigenere feathermodule tried to iterate through False, which of course is not going to work. This should address the short-ciphertext failure condition of the vigenere module.
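The failure mode described here is easy to reproduce in isolation. The following Python 3 sketch uses hypothetical stand-ins (`solve`, `report`, and the `min_length` threshold are assumptions, not the actual cryptanalib or feathermodule code) to show why iterating over a `False` return value fails and how a simple check avoids it.

```python
def solve(ciphertext, min_length=10):
    """Hypothetical solver: returns False when the sample is too short."""
    if len(ciphertext) < min_length:
        return False
    return [("KEY", ciphertext.lower())]


def report(ciphertext):
    results = solve(ciphertext)
    if not results:
        # Without this check, `for ... in False` raises
        # TypeError: 'bool' object is not iterable.
        return ["[*] Skipping sample, too short to solve statistically"]
    return ['Key found: "%s". Decrypts to: %s' % (key, text) for key, text in results]


print(report("u6"))
print(report("ABCDEFGHIJKL"))
```

An alternative design is for the solver to return an empty list rather than `False`, so callers can iterate unconditionally.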

Not sure why the module is failing to correctly solve your vigenere problem @global4g, I will defer to @stocyr on this as he was the author of this module and I haven't dug too much into his solver to understand exactly how he approached the problem. Is guballa.de's solver open source? We may consider running some tests to see if the method is more frequently successful and adopt it if so.

global4g commented 7 years ago

Not sure if that site is open source, but I saw this link, which is interesting: https://www.guballa.de/implementierung-eines-vigenere-solvers (it's in German, though)

unicornsasfuel commented 7 years ago

@global4g It's worth mentioning that solvers like this one are based on frequency analysis, and it is normal for them to fail for samples where the letter frequencies deviate too much from what is expected of the English language, or of whatever other source material one is working with. The only functional difference between their solver and the one in FeatherDuster may be the frequency data they're using to find the best key.
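To make the dependence on frequency data concrete, here is a generic Python 3 sketch of chi-squared scoring against an English letter-frequency table, applied to a single alphabetic shift (each column of a Vigenère ciphertext at a fixed key position is such a shift). It is not the method used by FeatherDuster or guballa.de; the table values are approximate percentages, and all function names are hypothetical.

```python
import string

# Approximate English letter frequencies in percent, a through z.
ENGLISH_FREQ = dict(zip(string.ascii_lowercase, [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03,
    2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15,
    1.97, 0.07]))


def shift_text(text, shift):
    """Shift lowercase letters by `shift` positions, leaving other characters alone."""
    return "".join(
        chr((ord(ch) - 97 + shift) % 26 + 97) if ch in string.ascii_lowercase else ch
        for ch in text)


def chi_squared(text):
    """Lower scores mean the letter counts look more like the reference table."""
    letters = [ch for ch in text if ch in string.ascii_lowercase]
    if not letters:
        return float("inf")
    score = 0.0
    for letter, percent in ENGLISH_FREQ.items():
        expected = len(letters) * percent / 100.0
        score += (letters.count(letter) - expected) ** 2 / expected
    return score


def best_shift(ciphertext):
    """Return the shift whose reversal looks most like English."""
    return min(range(26), key=lambda s: chi_squared(shift_text(ciphertext, -s)))


plaintext = ("it is a truth universally acknowledged that a single man in "
             "possession of a good fortune must be in want of a wife")
print(best_shift(shift_text(plaintext, 7)))
```

With a different reference table, or with text whose letter counts are unusual, the minimum can land on a different shift, which is the kind of divergence between solvers described above.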

global4g commented 7 years ago

Yes, that makes sense!

unicornsasfuel commented 7 years ago

Amusingly, the ciphertext mentioned in the writeup of the guballa.de vigenere solver:

VVRQI EREOY LDPTT MWNFL ECKAV MZPWE EHRZK UHXHI KCISC BGBZH LHEPK DSERK AEESJ KOLIF 
 ZJKHB SXSZK SALUA ZPGVX EOKIX OCEIQ VHBHF HWFJI MITSP XHCZS JTYWH VTRSW KVMSG QTKSY 
 WYMOF XQPSH IGSOH GMVXC ITPKW YZXAH JVRSK ZWGXT RMTXW AGFDV IQGTK SVXEM OMFWN OFOR 

is not solved correctly by their Vigenere solver ;)

unicornsasfuel commented 7 years ago

...nor by ours ;_;

jensguballa commented 7 years ago

To solve that cipher on guballa.de you should probably increase the key length (e.g. 3-100). :-P
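The scan range matters because a solver can only ever propose key lengths it actually tries. As a rough illustration (Python 3), the sketch below ranks candidate key lengths by average per-column index of coincidence, the approach named in FeatherDuster's module description; the implementation itself and the name `rank_key_lengths` are assumptions, and nothing here describes how guballa.de works internally.

```python
from collections import Counter


def index_of_coincidence(text):
    """Probability that two symbols drawn without replacement from text are equal."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def rank_key_lengths(ciphertext, scan_range):
    """Rank the key lengths in scan_range by average column IoC, best first."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    scored = []
    for key_length in scan_range:
        # Letters at the same key position share one alphabetic shift.
        columns = [letters[i::key_length] for i in range(key_length)]
        average = sum(index_of_coincidence(col) for col in columns) / key_length
        scored.append((key_length, average))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(rank_key_lengths("ABABABABAB", range(1, 4)))
```

If the true key length lies outside `scan_range`, it never appears in the ranking, so widening the range is a prerequisite for solving long-key ciphertexts, at the cost of shorter columns and noisier statistics.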

unicornsasfuel commented 7 years ago

Since the original bug is fixed, I'm going to close this issue.

I don't think the vigenere module is outright broken, since it does solve some vigenere problems (it passes the unit test, for instance). I could see improving its success rate as an enhancement, though. If anyone has specific criticisms of how the module works, I would be open to another issue being opened with guidance on how to improve its accuracy.