Open kiya00 opened 6 months ago
triage review — we should review this as part of a larger reproducible randomness discussion
Trying to produce the same random numbers as PyTorch is probably a non-goal. Trying to produce the same random numbers regardless of executor might be a goal.
Both of these results are with nvFuser; the issue is that segmentation and runtime ordering can depend on what is or is not an output. So the question seems to be: if the only delta between two nvFuser graphs is the marked outputs, should we guarantee that the same RNG values are generated per tensor?
🐛 Bug
The same function outputs different values when the input tensor is the same but `requires_grad` is True/False.
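The original reproducer isn't shown here, but a hypothetical sketch of the shape described (the function name `func` is from the issue; the RNG op, tensor sizes, and intermediate names `d` and `f` are assumptions) would look like this. In eager PyTorch the seeded RNG sequence does not depend on `requires_grad`, so the two outputs match; the report is that under thunder's nvFuser executor they differ:

```python
import torch

def func(x):
    # Hypothetical body: an RNG op followed by a compute op.
    # `d` and `f` are assumed names; only `func` appears in the issue.
    d = torch.dropout(x, p=0.5, train=True)
    f = d * 2
    return f  # the issue notes that returning `f + d` instead hides the bug

torch.manual_seed(0)
a = func(torch.ones(4, requires_grad=False))
torch.manual_seed(0)
b = func(torch.ones(4, requires_grad=True))

# Eager PyTorch: identical RNG sequence, so the outputs are equal.
# Under the nvFuser executor the values reportedly differ.
print(torch.equal(a, b.detach()))  # → True in eager mode
```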
Note: if the last line in `func` is changed to `return f+d`, the outputs are the same as expected. torchex doesn't have the problem.

Traces:
cc @apaz-cli