CameronLonsdale / lantern

Cryptanalysis library for breaking classical ciphers
MIT License

Substitution Cipher speed up #3

Closed by CameronLonsdale 7 years ago

CameronLonsdale commented 7 years ago

The current substitution cipher implementation is quite slow. I would like it sped up as much as is possible in Python.

CameronLonsdale commented 7 years ago
Fri Apr  7 13:59:35 2017    profiling_results

         174396648 function calls (174396336 primitive calls) in 148.498 seconds

   Ordered by: internal time
   List reduced from 300 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 38520856   52.282    0.000   55.925    0.000 /home/cameron/Documents/cckrypto/venv/lib/python3.5/site-packages/pycipher/base.py:16(a2i)
    90002   38.616    0.000   99.993    0.001 /home/cameron/Documents/cckrypto/venv/lib/python3.5/site-packages/pycipher/simplesubstitution.py:45(decipher)
    90000   30.695    0.000   41.569    0.000 /home/cameron/Documents/cckrypto/cckrypto/score_functions/ngram.py:28(__call__)
 38610000    5.765    0.000    5.765    0.000 /home/cameron/Documents/cckrypto/cckrypto/util.py:8(<genexpr>)
    90379    4.490    0.000   10.256    0.000 {method 'join' of 'str' objects}
 41340283    3.988    0.000    3.988    0.000 {method 'upper' of 'str' objects}
 47521056    3.845    0.000    3.845    0.000 {method 'isalpha' of 'str' objects}
        1    1.339    1.339  147.619  147.619 /home/cameron/Documents/cckrypto/cckrypto/modules/simplesubstitution.py:10(crack)
  2340052    1.051    0.000    1.051    0.000 {method 'index' of 'list' objects}
    90000    0.871    0.000    2.388    0.000 /home/cameron/Documents/cckrypto/venv/lib/python3.5/random.py:280(sample)

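Profiling results indicate that the slowest functions are pycipher's decryption (a2i and decipher, at 52.3 s and 38.6 s of internal time) and the ngram scoring function (__call__, at 30.7 s), out of 148.5 s in total. These are the focus areas.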
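For illustration, one common way to avoid per-character function calls such as a2i in pure Python is to build a translation table once per key and let str.translate do the work in C. The sketch below is an assumption about how this could look, not necessarily the change made in this repository; the names make_decipher_table and decipher, and the key layout (key[i] is the ciphertext letter for the i-th plaintext letter), are hypothetical.

```python
import string


def make_decipher_table(key: str) -> dict:
    """Build a table mapping ciphertext letters back to plaintext letters.

    Assumption: `key` is a 26-letter substitution alphabet where key[i] is
    the ciphertext letter that replaces the i-th letter of the alphabet.
    """
    key = key.upper()
    alphabet = string.ascii_uppercase
    # Cover both cases; characters not in the table pass through unchanged.
    return str.maketrans(key + key.lower(), alphabet + alphabet.lower())


def decipher(ciphertext: str, table: dict) -> str:
    """Decipher the whole string with a single call into C code."""
    return ciphertext.translate(table)
```

The table is built once per candidate key, so the cost per decryption is a single translate call rather than one Python-level lookup per character.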

CameronLonsdale commented 7 years ago

Faster as of 9f98b6f821e3310c353ac92a74c3a613166321f0. The bottleneck is now the ngram scoring function, which is already as optimised as possible for Python. For faster speeds, we might consider writing the ngram scorer in a faster language like D and then using bindings to call it.
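For context, a typical pure-Python ngram scorer sums precomputed log probabilities over a sliding window, which leaves little to optimise beyond the per-window slice and dictionary lookup. The following is a minimal sketch under that assumption and is not the scorer in this repository; build_log_probs, score, and floor_count are hypothetical names.

```python
import math


def build_log_probs(counts, floor_count=0.01):
    """Turn raw ngram counts into log10 probabilities.

    Returns the lookup table and a floor value used for unseen ngrams.
    `floor_count` is an assumed pseudo-count, not taken from the source.
    """
    total = sum(counts.values())
    log_probs = {gram: math.log10(c / total) for gram, c in counts.items()}
    floor = math.log10(floor_count / total)
    return log_probs, floor


def score(text, log_probs, floor, n):
    """Sum the log probability of every length-n window in `text`."""
    get = log_probs.get  # bind the method once to avoid repeated attribute lookups
    return sum(get(text[i:i + n], floor) for i in range(len(text) - n + 1))
```

Higher (less negative) scores indicate text that looks more like the language the counts were taken from. The loop body is already minimal, which is why moving it to a compiled language is the natural next step.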