After reviewing some of the open issues, I noticed the following comment in #96:
> A note on something that Kate and I found: our PS texture metamer synthesis is broken by the changes in torch 1.10 (that is, 1.8 and 1.9 give the same result, but 1.10 gives a different one). After looking into it for a while with Nikhil, we're pretty sure it's because they changed the sub-gradient for `torch.min` and `torch.max` (nothing else seems possibly relevant). The forward pass of `PortillaSimoncelli` is unaffected, so it has to be something to do with autograd. The synthesis refactor (#136) also changed the outputs of metamer synthesis. And these two changes appeared to interact in some way -- the differences get larger when using the refactored `Metamer` with torch 1.10 than with either change alone. The arguments required in order to get good-looking synthesis, as used in our test, changed as a result of this, but they still appear to be good enough for us.
Originally posted by @billbrod in https://github.com/LabForComputationalVision/plenoptic/issues/96#issuecomment-973318291
So we've had this happen before.
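For reference, the subgradient change described above can be probed directly. This is a minimal sketch (not from the original issue) that just inspects the gradient that `torch.min` and `torch.max` assign at tied values; running it under different torch versions shows how the gradient gets distributed across the tie.

```python
# Probe the subgradient of torch.min / torch.max at tied values -- the
# autograd behavior that reportedly changed in torch 1.10.
import torch

x = torch.tensor([1.0, 1.0, 2.0], requires_grad=True)
torch.min(x).backward()
print(torch.__version__, x.grad)  # how the gradient is split across the tied minima

y = torch.tensor([3.0, 3.0], requires_grad=True)
torch.max(y).backward()
print(torch.__version__, y.grad)  # same question for the tied maxima
```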
Strangely enough, it doesn't look like the Jenkins tests are failing (and they are using torch 1.12.1), so I'm going to double-check what happens on the GPU.
Turns out that the GPU metamers are identical between torch 1.12 and 1.11, so that's fun.
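The check here amounts to loading the stored GPU metamers from the two versions and comparing them directly; a rough sketch, with hypothetical file names:

```python
# Compare the metamers synthesized on the GPU under torch 1.11 and 1.12
# (file names are hypothetical, not the actual test artifacts).
import torch

a = torch.load("ps_metamer_gpu_torch1.11.pt")
b = torch.load("ps_metamer_gpu_torch1.12.pt")
print(torch.equal(a, b))       # exactly identical?
print((a - b).abs().max())     # largest difference if not
```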
Some change between pytorch versions `1.11` and `1.12.0` changed the outputs of our PS texture metamer test. (We don't support `1.12.0` due to #162, but this problem persists in `1.12.1`.) Thus, our tests for this fail. It only happens for python `3.7` and `3.8`, because `3.6` doesn't support anything past pytorch `1.10` (see #156; we'll drop support for `3.6` as well).

The results still look good (top is `<=1.11`, bottom is `1.12.0`):

[synthesized metamer comparison images]

This is with the same seed and with `torch.use_deterministic_algorithms(True)` set, but pytorch doesn't guarantee reproducibility across versions or systems: "Completely reproducible results are not guaranteed across PyTorch releases, individual commits, or different platforms." (source)

So I think we should probably upload a new batch of files to the OSF and choose which version to test against based on the pytorch version. That's inelegant, but it means we know when a version change has affected things. The other way to do it would be to pick a loss value we think is "good enough" and just check that our synthesis loss is under that, but then we're not guaranteeing reproducibility and we're less likely to notice if a change affects things significantly.
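A rough sketch of what the version-dependent ground truth could look like in a test (the file names, tolerance, and helper are hypothetical, not the actual plenoptic test setup):

```python
# Pin the seed and deterministic algorithms, then pick which stored
# ground-truth metamer to compare against based on the installed torch version.
import torch
from packaging import version

torch.manual_seed(0)
torch.use_deterministic_algorithms(True)


def ground_truth_filename():
    # One set of OSF files synthesized with torch <= 1.11, another with >= 1.12
    # (hypothetical names).
    torch_version = version.parse(torch.__version__.split("+")[0])
    if torch_version >= version.parse("1.12"):
        return "ps_metamer_torch1.12.pt"
    return "ps_metamer_torch1.11.pt"


def check_ps_metamer(synthesized_metamer):
    # Compare the freshly synthesized metamer against the stored one for
    # this torch version.
    stored = torch.load(ground_truth_filename())
    assert torch.allclose(synthesized_metamer, stored, atol=1e-5)
```

The alternative mentioned above would swap the `allclose` comparison for an assertion that the final synthesis loss falls below a chosen threshold, at the cost of no longer checking reproducibility.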