Discuss what CUDA does to avoid branching for max/min.
Maybe give a tiny refresher on IEEE 754 (double precision):
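import struct

def f(val):
    # 64-character bit string of the big-endian IEEE 754 double `val`.
    return ''.join(bin(b)[2:].zfill(8) for b in struct.pack('>d', val))

def g(bits):
    # Inverse of f: turn a 64-character bit string back into a float.
    z = [int(bits[i:i + 8], 2) for i in range(0, 64, 8)]
    return struct.unpack('>d', bytes(z))[0]

def h(s, e, m):
    # Build a double from sign (1 bit), exponent (11 bits), mantissa (52 bits).
    return g(s + e + m)

# Exponent all ones with a nonzero mantissa: a (quiet) NaN.
h('0', '1' * 11, '1' + '0' * 51)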
https://gist.github.com/dhermes/c79846c6074b938b2e10
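For the refresher it might also help to go the other way and split a value into its three fields. A minimal sketch, assuming Python 3; the names `fields` and `from_fields` are made up for these notes and are not from the gist:

```python
import struct

def fields(val):
    """Split a double into (sign, exponent, mantissa) bit strings."""
    # Reinterpret the 8 big-endian bytes of the double as a 64-bit unsigned int.
    (as_int,) = struct.unpack('>Q', struct.pack('>d', val))
    bits = format(as_int, '064b')
    # 1 sign bit, 11 exponent bits, 52 mantissa bits.
    return bits[0], bits[1:12], bits[12:]

def from_fields(sign, exponent, mantissa):
    """Rebuild a double from its three bit-string fields."""
    as_int = int(sign + exponent + mantissa, 2)
    (val,) = struct.unpack('>d', struct.pack('>Q', as_int))
    return val

for x in (1.0, -2.0, float('inf')):
    sign, exponent, mantissa = fields(x)
    print(x, sign, exponent, mantissa)
```

For 1.0 the exponent field is the bias 1023 (a zero followed by ten ones) and the mantissa is all zeros; infinity has an all-ones exponent and a zero mantissa, so the NaN built earlier differs from it only in the mantissa.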