mcvine / acc

Accelerated mcvine engine

Add type annotations to CUDA functions in straight guide and drop to 32-bit precision. #44

Closed by mtbc 2 years ago

mtbc commented 2 years ago

Adds explicit rather than inferred type signatures to make it easier to experiment with 32-bit imprecision. With this in, all one need do for such an experiment is:

  1. Change all the float64 to float32.
  2. Before taking t2 and after taking t3, use neutron_array.astype(np.float32) and neutron_array.astype(np.float64) respectively to convert the array of neutrons.
  3. Check the types using something like,
    with open("/tmp/types", "w") as out:
        process_kernel.inspect_types(file=out)
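
For illustration, step 2 might look roughly like the sketch below. It uses NumPy only, so it says nothing about the kernel itself. The array shape, the random data and the helper names to_single and to_double are hypothetical stand-ins, not taken from this PR.

```python
import numpy as np


def to_single(neutrons):
    """Return a float32 copy of the neutron array (hypothetical helper)."""
    return neutrons.astype(np.float32)


def to_double(neutrons):
    """Return a float64 copy of the neutron array (hypothetical helper)."""
    return neutrons.astype(np.float64)


# Hypothetical stand-in data: 4 neutrons with 10 properties each, in [-1, 1).
rng = np.random.default_rng(0)
neutron_array = rng.uniform(-1.0, 1.0, size=(4, 10))

# Convert down before the timed region, and back up after it.
single = to_single(neutron_array)
restored = to_double(single)

# The round trip alone already costs some precision.
max_roundtrip_error = np.max(np.abs(restored - neutron_array))
print(single.dtype, restored.dtype, max_roundtrip_error)
```

Since astype returns a new array, the original float64 data is left untouched and can still be used as a reference when comparing results.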
mtbc commented 2 years ago

With the 32-bit version of the straight guide, for the tests' np.allclose I needed to loosen the default atol by a couple of orders of magnitude. Presumably this gets even worse for a more numerically complex path.
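
As a rough sketch of why the tolerance matters: np.allclose defaults to rtol=1e-05 and atol=1e-08, so an expected value at or near zero gets almost no slack. The values below and the 1e-7 discrepancy are hypothetical, chosen only to be about the size of single-precision rounding error. They are not numbers from the actual tests.

```python
import numpy as np

# Hypothetical expected values, including one that is exactly zero.
expected = np.array([0.0, 1.0, 2.5])

# Hypothetical discrepancy of about 1e-7, roughly single-precision sized.
actual = expected + 1e-7

# Default tolerances: rtol=1e-05, atol=1e-08. The zero entry fails,
# because the relative term contributes nothing there.
default_ok = np.allclose(actual, expected)

# atol loosened by two orders of magnitude.
loosened_ok = np.allclose(actual, expected, atol=1e-6)

print(default_ok, loosened_ok)
```

Only the entry with an expected value of zero trips the default check here. The nonzero entries are already covered by rtol.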

mtbc commented 2 years ago

As one might expect, without the described changes, this PR changes neither execution speed nor register count by much.

mtbc commented 2 years ago

In experiments this morning, I'm finding that the described steps to drop to 32-bit slightly increase register count while improving speed by more than a third, so I'll add a commit here to do that and we'll see if the CI tests still pass.