evanrichter / cipher-project-1

Applied Cryptography Project 1

ability to spell check a very close plaintext to perfectly plausible plaintext #21

Closed evanrichter closed 3 years ago

evanrichter commented 3 years ago

since we know the exact wordlist used, we should be able to take a very close plaintext and use "spell checking" to snap the few words that don't quite match onto exact words from the wordlist.
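A minimal sketch of the first step, finding which words need correcting at all. It assumes words are separated by single spaces and that the wordlist fits in a `HashSet`; the name `unknown_words` is hypothetical and not from the repo.

```rust
use std::collections::HashSet;

/// Return (byte offset, word) for every space-separated word that is
/// not an exact member of the wordlist. Hypothetical helper.
fn unknown_words<'a>(text: &'a str, wordlist: &HashSet<&str>) -> Vec<(usize, &'a str)> {
    let mut out = Vec::new();
    let mut offset = 0;
    for word in text.split(' ') {
        if !word.is_empty() && !wordlist.contains(word) {
            out.push((offset, word));
        }
        // Advance past this word and the single space that follows it.
        offset += word.len() + 1;
    }
    out
}

fn main() {
    let wordlist: HashSet<&str> = ["hello", "world"].iter().copied().collect();
    for (offset, word) in unknown_words("hfllo world hfllo", &wordlist) {
        println!("offset {offset}: {word}");
    }
}
```

Keeping the offset of each unknown word matters later, because the position in the message is what ties a bad word back to a key index.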

even better would be to do a few "spell checks", note which key index each correction falls on, and try to apply that pattern to the rest of the text, seeing if that makes more words match automatically. this technique needs the assumed key length that was guessed in previous steps.
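One way this could look, as a sketch under several assumptions: the alphabet is space plus a-z (27 symbols), a correction replaces a word with one of the same length, and the schedule is a plain repeating key so that text position `i` maps to key index `i % key_len`. The names `learn_offsets` and `apply_offsets` are hypothetical.

```rust
use std::collections::HashMap;

// Assumed alphabet; the project's real one may differ.
const ALPHABET: &[u8] = b" abcdefghijklmnopqrstuvwxyz";

fn symbol_index(b: u8) -> Option<i32> {
    ALPHABET.iter().position(|&c| c == b).map(|p| p as i32)
}

/// From a few (start offset, corrected word) pairs, record the shift
/// needed at each key index where the near-plaintext was wrong.
/// A later correction overwrites an earlier one for the same index.
fn learn_offsets(near: &str, corrections: &[(usize, &str)], key_len: usize) -> HashMap<usize, i32> {
    let near = near.as_bytes();
    let n = ALPHABET.len() as i32;
    let mut offsets = HashMap::new();
    for &(start, word) in corrections {
        for (j, fixed) in word.bytes().enumerate() {
            let i = start + j;
            if let Some(&seen) = near.get(i) {
                if let (Some(a), Some(b)) = (symbol_index(seen), symbol_index(fixed)) {
                    let delta = (b - a).rem_euclid(n);
                    if delta != 0 {
                        offsets.insert(i % key_len, delta);
                    }
                }
            }
        }
    }
    offsets
}

/// Apply the learned per-key-index shifts to the whole text.
fn apply_offsets(near: &str, offsets: &HashMap<usize, i32>, key_len: usize) -> String {
    let n = ALPHABET.len() as i32;
    near.bytes()
        .enumerate()
        .map(|(i, b)| match (symbol_index(b), offsets.get(&(i % key_len))) {
            (Some(p), Some(&d)) => ALPHABET[(p + d).rem_euclid(n) as usize] as char,
            _ => b as char, // no evidence for this key index, leave as-is
        })
        .collect()
}

fn main() {
    let near = "hfllo wprld hfllo";
    // One manual correction: the word at offset 0 should be "hello".
    let offsets = learn_offsets(near, &[(0, "hello")], 6);
    println!("{}", apply_offsets(near, &offsets, 6));
}
```

A sturdier version would probably tally votes per key index instead of overwriting, and only apply a shift once several corrections agree on it.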

this function would be applied after #20, but can be developed and tested in parallel. For testing, use keys that are mostly zeros, for example: [ 0, 0, 0, 0, 0, 0, -2, 0, 0, 1, 0, 0 ] with a simple RepeatingKey schedule, to simulate a "close" plaintext. Also throw in a light PeriodicRand in some tests.
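A sketch of how such a test input might be generated. It does not use the repo's `RepeatingKey` type; instead it assumes a shift cipher over a 27-symbol alphabet (space plus a-z) and a hypothetical helper `shift_with_repeating_key`.

```rust
// Assumed alphabet; the project's real one may differ.
const ALPHABET: &[u8] = b" abcdefghijklmnopqrstuvwxyz";

/// Shift the character at position `i` by `key[i % key.len()]`,
/// wrapping around the alphabet. Panics on characters outside it.
fn shift_with_repeating_key(text: &str, key: &[i8]) -> String {
    let n = ALPHABET.len() as i32;
    text.bytes()
        .enumerate()
        .map(|(i, b)| {
            let pos = ALPHABET
                .iter()
                .position(|&c| c == b)
                .expect("character outside the assumed alphabet") as i32;
            let shifted = (pos + key[i % key.len()] as i32).rem_euclid(n);
            ALPHABET[shifted as usize] as char
        })
        .collect()
}

fn main() {
    // The mostly-zero key from the issue: only indices 6 and 9 disturb the text.
    let key: [i8; 12] = [0, 0, 0, 0, 0, 0, -2, 0, 0, 1, 0, 0];
    let plain = "attack at dawn tomorrow";
    let near = shift_with_repeating_key(plain, &key);
    let wrong = plain.bytes().zip(near.bytes()).filter(|(a, b)| a != b).count();
    println!("{near} ({wrong} characters differ)");
}
```

Note that under this assumed alphabet a nonzero key entry can land on a space and merge two words, so the spell checker's tests should include that case as well.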

evanrichter commented 3 years ago

ready to go for @amp813! See the spell-check branch

evanrichter commented 3 years ago

a helpful library for various string similarity functions: strsim

I made a single word spell checker using Levenshtein "edit" distance in this dictionary method:

https://github.com/evanrichter/cipher-project-1/blob/b495829db7f61bacd67ae9de7bc881679285a29f/src/dict.rs#L49-L64
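For readers without the repo open, here is a rough sketch of what a single-word spell checker based on edit distance can look like. It is not the linked `dict.rs` code: the distance function is hand-rolled so the example has no dependencies (`strsim::levenshtein` could take its place), and `best_match` is a hypothetical name.

```rust
/// Two-row dynamic-programming Levenshtein distance over bytes.
fn levenshtein(a: &str, b: &str) -> usize {
    let (a, b) = (a.as_bytes(), b.as_bytes());
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, &ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, &cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            curr[j + 1] = (prev[j] + cost) // substitute (or keep)
                .min(prev[j + 1] + 1)      // delete from `a`
                .min(curr[j] + 1);         // insert into `a`
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Return the wordlist entry with the smallest edit distance to `word`.
/// On ties the earliest entry in the list wins.
fn best_match<'a>(word: &str, wordlist: &[&'a str]) -> Option<&'a str> {
    wordlist.iter().copied().min_by_key(|w| levenshtein(word, w))
}

fn main() {
    let wordlist = ["hello", "world", "word"];
    println!("{:?}", best_match("wprld", &wordlist));
}
```

Plain Levenshtein treats every substitution as equally likely. Since the errors here come from a wrong key entry, a distance that favors same-length substitutions might fit better, but that is a guess.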

evanrichter commented 3 years ago

@amp813 have you taken a look at the file words/plaintext_dictionary_test1.txt? It looks like we know 5 possible plaintexts in their entirety, not just the wordlist.

If test 1 must encrypt one of these five plaintexts, then you should add a special case: if the near-plaintext is close to one of them, return that plaintext exactly.
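A sketch of that special case, assuming the candidates are already loaded as strings and that "close" means a low fraction of mismatched positions. The names `mismatch` and `snap_to_candidate`, and the threshold value, are hypothetical.

```rust
/// Count positions where the two strings differ, plus the length gap.
fn mismatch(a: &str, b: &str) -> usize {
    let differing = a.bytes().zip(b.bytes()).filter(|(x, y)| x != y).count();
    differing + (a.len().max(b.len()) - a.len().min(b.len()))
}

/// Return the closest candidate if its mismatch rate is at or below
/// `max_error_rate`, otherwise `None` so the caller can fall back to
/// word-by-word spell checking.
fn snap_to_candidate<'a>(near: &str, candidates: &[&'a str], max_error_rate: f64) -> Option<&'a str> {
    let best = candidates.iter().copied().min_by_key(|c| mismatch(near, c))?;
    let rate = mismatch(near, best) as f64 / near.len().max(1) as f64;
    if rate <= max_error_rate {
        Some(best)
    } else {
        None
    }
}

fn main() {
    let candidates = ["hello world hello", "other words entirely"];
    println!("{:?}", snap_to_candidate("hfllo wprld hfllo", &candidates, 0.25));
}
```

The position-wise comparison only makes sense if the near-plaintext is aligned with the candidate. If the cipher can insert or drop characters, an edit distance would be the safer measure.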

I can make helper functions to parse out the sentences from this file, and make them available.