NVlabs / nvdiffrast

Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering

Reproducibility #168

Closed shirleykei closed 6 months ago

shirleykei commented 7 months ago

Hi, nvdiffrast optimization results seem not to be reproducible. I tried adding torch.manual_seed(0) to the code, but this didn't solve the issue. Any idea how I can make nvdiffrast results reproducible? Thanks!

s-laine commented 7 months ago

Many of the operations, especially the gradient passes, use CUDA atomics to accumulate gradients. These are inherently nondeterministic, and there is no simple way to change that. The forward passes of rasterization, interpolation, and texturing should be deterministic, because they use no atomics.

Most of the time, though, the differences caused by atomic execution order should be quite small. If you're seeing large deviations between runs, that is probably caused by something else.
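One way to check how much of the variation comes from accumulation order is to run the same backward pass twice with identical inputs and compare the gradients. This is a minimal sketch using plain PyTorch's `index_add` (not nvdiffrast itself) as a stand-in for scatter-style gradient accumulation; on CPU it is deterministic, while on CUDA the atomic accumulation may produce small run-to-run differences:

```python
import torch

def grad_of_scatter_sum(n=1000, m=16, device="cpu"):
    # Fixed seed so the inputs are bitwise identical across runs; any
    # remaining run-to-run difference comes from the accumulation itself.
    torch.manual_seed(0)
    x = torch.randn(n, device=device, requires_grad=True)
    idx = torch.randint(0, m, (n,), device=device)
    # Scatter-style accumulation, similar in spirit to how gradient
    # passes accumulate many contributions per output with atomics.
    out = torch.zeros(m, device=device).index_add(0, idx, x * x)
    out.sum().backward()
    return x.grad.clone()

g1 = grad_of_scatter_sum()
g2 = grad_of_scatter_sum()
max_diff = (g1 - g2).abs().max().item()
print(max_diff)  # 0.0 on CPU; may be small but nonzero on CUDA
```

If the deviation you measure this way is on the order of floating-point rounding, atomics are the likely cause; anything much larger points to another source of nondeterminism in the pipeline.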

shirleykei commented 7 months ago

Thanks @s-laine. Is it possible to set a fixed seed for the CUDA computations?

s-laine commented 7 months ago

The problem is not about random seeds. The issue is that when many CUDA threads run concurrently, the order in which they proceed is not deterministic. Thus, the atomic operations are executed in an order that cannot be predicted or controlled.

For some types of atomic operations, the final result is unaffected by the order in which they are executed (for example, integer addition). However, floating-point addition, which is used heavily in the gradient passes, does not necessarily produce the same result when executed in a different order. For example, (a + b) + c may produce a slightly different result than (a + c) + b due to rounding, and there is no way to avoid this.
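The rounding effect is easy to demonstrate in plain Python with IEEE-754 double precision; the values below are just a deliberately extreme illustration of the non-associativity described above:

```python
# Floating-point addition is not associative. With a = 1e16, the spacing
# between adjacent doubles is 2.0, so adding 1.0 to it is rounded away.
a, b, c = 1e16, 1.0, -1e16

left = (a + b) + c   # b is absorbed into a, then a cancels -> 0.0
right = (a + c) + b  # a and c cancel first, so b survives -> 1.0

print(left, right)   # 0.0 1.0
```

In a real gradient pass the per-term differences are tiny rather than total like this, but with millions of atomic adds per output the accumulated result can still differ slightly between runs.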

shirleykei commented 7 months ago

Thank you for the detailed explanation