toncenter / tvm_valuetypes

Collection of utils for handling The Open Network Virtual Machine value types
Other
9 stars 10 forks source link

Refactor Cell.hash function to avoid aggressive recursion #12

Open shuva10v opened 1 year ago

shuva10v commented 1 year ago

Current implementation of Cell.hash() implies aggressive recursion processing refs. For some branchy cells it leads to serious performance issues. For example for such case 'te6ccsEBFwEAYAAEBQQFBQQFBAQEBAQEBAQEBAQFBAQEAwECyQEBBMsXAgECEgMBBMsXBAEEywcF\nAQIUBgEEywcHAQIBCAECJQkBAsgKAQKhCwECAQwBAiENAQIiDgECqA8BAgEQAQIhEQECARIBBIAD\nEwECqBQBAgEVAQIhFgACIX1S8I0=':

{'data': {'b64': 'yQ==', 'len': 8},
 'refs': [{'data': {'b64': 'yxc=', 'len': 16},
   'refs': [{'data': {'b64': 'Eg==', 'len': 8},
     'refs': [{'data': {'b64': 'yxc=', 'len': 16},
       'refs': [{'data': {'b64': 'ywc=', 'len': 16},
         'refs': [{'data': {'b64': 'FA==', 'len': 8},
           'refs': [{'data': {'b64': 'ywc=', 'len': 16},
             'refs': [{'data': {'b64': 'AQ==', 'len': 8},
               'refs': [{'data': {'b64': 'JQ==', 'len': 8},
                 'refs': [{'data': {'b64': 'yA==', 'len': 8},
                   'refs': [{'data': {'b64': 'oQ==', 'len': 8},
                     'refs': [{'data': {'b64': 'AQ==', 'len': 8},
                       'refs': [{'data': {'b64': 'IQ==', 'len': 8},
                         'refs': [{'data': {'b64': 'Ig==', 'len': 8},
                           'refs': [{'data': {'b64': 'qA==', 'len': 8},
                             'refs': [{'data': {'b64': 'AQ==', 'len': 8},
                               'refs': [{'data': {'b64': 'IQ==', 'len': 8},
                                 'refs': [{'data': {'b64': 'AQ==', 'len': 8},
                                   'refs': [{'data': {'b64': 'gAM=',
                                      'len': 16},
                                     'refs': [{'data': {'b64': 'qA==',
                                        'len': 8},
                                       'refs': [{'data': {'b64': 'AQ==',
                                          'len': 8},
                                         'refs': [{'data': {'b64': 'IQ==',
                                            'len': 8},
                                           'refs': [{'data': {'b64': 'IQ==',
                                              'len': 8},
                                             'refs': [],
                                             'special': False}],
                                           'special': False}],
                                         'special': False}],
                                       'special': False}],
                                     'special': False}],
                                   'special': False}],
                                 'special': False}],
                               'special': False}],
                             'special': False}],
                           'special': False}],
                         'special': False}],
                       'special': False}],
                     'special': False}],
                   'special': False}],
                 'special': False}],
               'special': False}],
             'special': False}],
           'special': False}],
         'special': False}],
       'special': False}],
     'special': False}],
   'special': False}],
 'special': False}

Current hash implementation leads to 25167033 function calls:

   25167033 function calls (12583868 primitive calls) in 4.252 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    4.252    4.252 {built-in method builtins.exec}
        1    0.000    0.000    4.252    4.252 <string>:1(<module>)
     23/1    0.000    0.000    4.252    4.252 cell.py:174(hash)
     23/1    0.000    0.000    4.252    4.252 cell.py:166(repr)
       22    0.000    0.000    4.251    0.193 cell.py:147(encoded_depth)
12582866/44    3.793    0.000    4.251    0.097 cell.py:138(depth)
12583006/12582960    0.459    0.000    0.459    0.000 {built-in method builtins.len}
...

After this patch it costs 702 calls:


         702 function calls (613 primitive calls) in 0.001 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.001    0.001 {built-in method builtins.exec}
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
        1    0.000    0.000    0.001    0.001 cell.py:212(hash2)
     23/1    0.000    0.000    0.001    0.001 cell.py:183(__depth_level_hash)
     22/1    0.000    0.000    0.001    0.001 cell.py:192(<listcomp>)
       23    0.000    0.000    0.000    0.000 cell.py:88(top_upped_bytes)
       23    0.000    0.000    0.000    0.000 cell.py:159(bits_descriptor)
  162/116    0.000    0.000    0.000    0.000 {built-in method builtins.len}