kstenerud / safe-encoding

Binary-to-text encoding that is safe to pass through modern text processors
Other
32 stars 5 forks source link

Shortcuts for division modulo 85 #4

Open DonaldTsang opened 5 years ago

DonaldTsang commented 5 years ago

Since 85*3 = 255, why not divmod 255 first? My idea would be

def div255(x):
  q, r = 0, x
  while r > 255:
    q, r = q + (r << 8), (r & 255) + (r << 8)
  return q, r

def fast255(x):
  # first round, r will be smaller than or equal to 4 * 255
  q = (((x >> 24) & 255) << 16) + (((x >> 24) & 255) << 8) + \
  ((x >> 24) & 255) + (((x >> 16) & 255) << 8) + ((x >> 16) & 255) + ((x >> 8) & 255)
  r = ((x >> 24) & 255) + ((x >> 16) & 255) + ((x >> 8) & 255) + (x & 255)
  # second round, r will be smaller than or equal to 255
  q += (((r >> 16) & 255) << 8) + ((r >> 16) & 255) + ((r >> 8) & 255)
  r = (r >> 8) + (r & 255)
  return q, r

def div85(x):
  qx, rx = div255(x)
  q, r = qx * 3 , rx
  while r > 85: # r will never be larger than 255
    q, r = q + (r << 7), (r & 127) + 43 * (r << 7)
  return q, r

def fast85(x):
  # to be written

Would this make the calculation faster (if implemented in C)?

kstenerud commented 5 years ago

Hmm those look interesting!

TBH I don't know which would be faster. In traditional architectures, this would be faster, but with the instruction scheduling and specialized math units these days, there'd be no way to tell without actually profiling...

DonaldTsang commented 4 years ago

Take a look at https://github.com/ridiculousfish/libdivide