bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License

[tests/test_models] The results of `transform_bounds` are inconsistent between CPU and GPU. #692

Closed enderdzz closed 2 years ago

enderdzz commented 2 years ago

Describe the bug

When I run the test suite with this command: pytest --pdb --cov=foolbox --cov-append --backend pytorch

the test_transform_bounds[pytorch_shufflenetv2-bounds1] case fails:

tests/test_models.py::test_bounds[pytorch_shufflenetv2] PASSED                                                                                                                                           [ 81%]
tests/test_models.py::test_forward_unwrapped[pytorch_shufflenetv2] PASSED                                                                                                                                [ 81%]
tests/test_models.py::test_forward_wrapped[pytorch_shufflenetv2] PASSED                                                                                                                                  [ 81%]
tests/test_models.py::test_transform_bounds[pytorch_shufflenetv2-bounds0] PASSED                                                                                                                         [ 81%]
tests/test_models.py::test_transform_bounds[pytorch_shufflenetv2-bounds1] FAILED                                                                                                                         [ 81%]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

fmodel_and_data = (<foolbox.models.pytorch.PyTorchModel object at 0x7f8500917c40>, PyTorchTensor(tensor([[[[0.7490, 0.7569, 0.8157,  ......                   device='cuda:0')), PyTorchTensor(tensor([243, 559, 438, 990, 949, 853, 609, 609], device='cuda:0')))
bounds = (-1.0, 1.0)

    @pytest.mark.parametrize("bounds", [(0, 1), (-1.0, 1.0), (0, 255), (-32768, 32767)])
    def test_transform_bounds(
        fmodel_and_data: ModelAndData, bounds: fbn.types.BoundsInput
    ) -> None:
        fmodel1, x, y = fmodel_and_data
        logits1 = fmodel1(x)
        min1, max1 = fmodel1.bounds

        fmodel2 = fmodel1.transform_bounds(bounds)
        min2, max2 = fmodel2.bounds
        x2 = (x - min1) / (max1 - min1) * (max2 - min2) + min2
        logits2 = fmodel2(x2)

>       np.testing.assert_allclose(logits1.numpy(), logits2.numpy(), rtol=1e-4, atol=1e-4)
E       AssertionError:
E       Not equal to tolerance rtol=0.0001, atol=0.0001
E
E       Mismatched elements: 23 / 8000 (0.287%)
E       Max absolute difference: 0.00045848
E       Max relative difference: 0.0591841
E        x: array([[ 0.95968 , -0.457812,  1.873025, ...,  0.705677,  1.764964,
E                3.499288],
E              [-4.896469, -1.459234,  2.982488, ..., -3.16199 , -0.103394,...
E        y: array([[ 0.959775, -0.457747,  1.873144, ...,  0.705679,  1.764929,
E                3.499305],
E              [-4.896469, -1.459234,  2.982488, ..., -3.16199 , -0.103394,...

tests/test_models.py:107: AssertionError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB post_mortem (IO-capturing turned off) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /home/xxx/foolbox/tests/test_models.py(107)test_transform_bounds()
-> np.testing.assert_allclose(logits1.numpy(), logits2.numpy(), rtol=1e-4, atol=1e-4)
(Pdb)

To Reproduce

Run on a GPU: export CUDA_VISIBLE_DEVICES="0"

Minimized script:

from typing import Tuple, Any
import eagerpy as ep
import numpy as np

import foolbox as fbn
ModelAndData = Tuple[fbn.Model, ep.Tensor, ep.Tensor]

def mock() -> ModelAndData:
    import torchvision.models as models

    model = models.shufflenet_v2_x0_5(pretrained=True).eval()
    preprocessing = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], axis=-3)
    fmodel = fbn.PyTorchModel(model, bounds=(0, 1), preprocessing=preprocessing)

    x, y = fbn.samples(fmodel, dataset="imagenet", batchsize=8)
    x = ep.astensor(x)
    y = ep.astensor(y)

    return fmodel, x, y

def test_transform_bounds(
    fmodel_and_data: ModelAndData, bounds: fbn.types.BoundsInput
) -> None:
    fmodel1, x, y = fmodel_and_data
    logits1 = fmodel1(x)
    min1, max1 = fmodel1.bounds

    fmodel2 = fmodel1.transform_bounds(bounds)
    min2, max2 = fmodel2.bounds
    x2 = (x - min1) / (max1 - min1) * (max2 - min2) + min2
    logits2 = fmodel2(x2)

    np.testing.assert_allclose(logits1.numpy(), logits2.numpy(), rtol=1e-4, atol=1e-4)

    # to make sure fmodel1 is not changed in-place
    logits1b = fmodel1(x)
    np.testing.assert_allclose(logits1.numpy(), logits1b.numpy(), rtol=2e-6)

    fmodel1c = fmodel2.transform_bounds(fmodel1.bounds)
    logits1c = fmodel1c(x)
    np.testing.assert_allclose(logits1.numpy(), logits1c.numpy(), rtol=1e-4, atol=1e-4)

fmodel_and_data = mock()

for bound in [(0, 1), (-1.0, 1.0), (0, 255), (-32768, 32767)]:
    test_transform_bounds(fmodel_and_data, bound)

Expected behavior

This test case should pass on CUDA.

Or should the test tolerances (currently rtol=1e-4, atol=1e-4) be relaxed a bit?
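
For reference, np.testing.assert_allclose accepts an element when |actual - desired| <= atol + rtol * |desired|, so logits close to zero are governed almost entirely by atol. A minimal sketch of that criterion follows. The helper name within_tolerance and the second value pair are made up for illustration; the first pair is the first logit shown in the traceback above.

```python
import numpy as np

def within_tolerance(actual, desired, rtol=1e-4, atol=1e-4):
    """Element-wise acceptance criterion of np.testing.assert_allclose."""
    actual = np.asarray(actual, dtype=np.float64)
    desired = np.asarray(desired, dtype=np.float64)
    return np.abs(actual - desired) <= atol + rtol * np.abs(desired)

# First pair: first logit from the traceback (x vs. y).
# Second pair: hypothetical near-zero logit with an absolute error of 5e-4.
actual = np.array([0.95968, 0.0105])
desired = np.array([0.959775, 0.0100])
mask = within_tolerance(actual, desired)
print(mask)
```

With these tolerances, an absolute error of a few 1e-4 on a small logit is enough to fail the assertion even when the large logits agree to four decimal places.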

Additional context

I also tried torchvision.models.shufflenet_v2_x1_0 and torchvision.models.mobilenet_v2; neither passed this test.

However, when I hide the GPU with export CUDA_VISIBLE_DEVICES="", this test case passes.

zimmerrol commented 2 years ago

Thanks for reporting this! That's indeed unexpected and weird behavior. I'm a bit confused at the moment about why the resulting logits do not match, as it appears that the inputs that ultimately get passed to the actual model (i.e., after all the preprocessing is done) should match exactly.
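
One thing that can be checked independently of foolbox is whether rescaling to new bounds and back reproduces the input bit for bit. The sketch below uses only NumPy and synthetic data; the helper name rescale is hypothetical and simply mirrors the x2 formula from the test. It does not establish what causes the mismatch on CUDA, it only shows how large the round-trip rounding error is in float32 and in float64.

```python
import numpy as np

def rescale(x, src, dst):
    """Affine map from bounds `src` to bounds `dst` (same formula as the x2 line in the test)."""
    (min1, max1), (min2, max2) = src, dst
    return (x - min1) / (max1 - min1) * (max2 - min2) + min2

rng = np.random.default_rng(0)  # arbitrary seed; the data is synthetic
errors = {}
for dtype in (np.float32, np.float64):
    x = rng.random(10_000).astype(dtype)
    x2 = rescale(x, (0, 1), (-1.0, 1.0))       # (0, 1) -> (-1, 1)
    x_back = rescale(x2, (-1.0, 1.0), (0, 1))  # and back again
    # Compare in float64 so the comparison itself adds no rounding.
    diff = x_back.astype(np.float64) - x.astype(np.float64)
    errors[np.dtype(dtype).name] = float(np.max(np.abs(diff)))

print(errors)
```

In float32 the round trip is not exact, so the two models need not see identical inputs; how much a network amplifies such differences depends on the model and on the kernels used.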

enderdzz commented 2 years ago

Thanks for your reply, I've figured out the problem : )

When using a CUDA device, you need to make sure that the data type is torch.float64 (i.e. torch.cuda.DoubleTensor) so that the resulting precision error does not exceed the threshold. On the CPU, by contrast, both float32 and float64 pass this test.
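
A minimal sketch of what running in float64 looks like in plain PyTorch. The small Sequential network is a hypothetical stand-in for the torchvision model, and the sketch deliberately leaves out foolbox, so how PyTorchModel and its preprocessing behave with float64 inputs is not shown here.

```python
import torch

# Use the GPU when one is visible, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

torch.manual_seed(0)
# Hypothetical stand-in for the real network.
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(3 * 8 * 8, 10),
).eval()

# Cast the parameters and the inputs to float64.
model = model.double().to(device)
x = torch.rand(4, 3, 8, 8, dtype=torch.float64, device=device)

with torch.no_grad():
    logits1 = model(x)
    # Rescale (0, 1) -> (-1, 1) and back, then run the model again.
    x2 = x * 2.0 - 1.0
    logits2 = model((x2 + 1.0) / 2.0)

max_abs_diff = (logits1 - logits2).abs().max().item()
print(logits1.dtype, max_abs_diff)
```

Both the module and the input have to be cast; passing a float64 tensor to a float32 module raises a dtype mismatch error.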

Ref: https://pytorch.org/docs/stable/tensors.html

zimmerrol commented 2 years ago

Alright. Since this doesn't seem to indicate any issue with the functionality of the transform, I'll close this issue for now.