pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

test_del (jit.test_builtins.TestBuiltins) fails due to highlight assertions #72516

Open lexming opened 2 years ago

lexming commented 2 years ago

🐛 Describe the bug

test_del in jit.test_builtins.TestBuiltins runs as expected, but the highlight assertion on the result fails with the error message RuntimeError: Expected to find "a"highlighted but it is not. This happens even though a is correctly highlighted in the output.

======================================================================
ERROR: test_del (jit.test_builtins.TestBuiltins)
----------------------------------------------------------------------
RuntimeError: 
undefined value a:
  File "/theia/scratch/brussel/vo/000/bvo00005/vsc10122/easybuild/install/zen2/build/PyTorch/1.10.0/foss-2021a-CUDA-11.3.1/pytorch/test/jit/test_builtins.py", line 94
                a = x ** 2
                del a
                return a
                       ~ <--- HERE

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/theia/scratch/brussel/vo/000/bvo00005/vsc10122/easybuild/install/zen2/build/PyTorch/1.10.0/foss-2021a-CUDA-11.3.1/pytorch/test/jit/test_builtins.py", line 91, in test_del
    def fn(x):
  File "/user/brussel/101/vsc10122/easybuild/install/zen2/tmp/eb-Pu4WSO/tmpOhRMcx/lib/python3.9/site-packages/torch/testing/_internal/jit_utils.py", line 92, in __exit__
    FileCheck().check_source_highlighted(self.highlight).run(str(value))
RuntimeError: Expected to find "a"highlighted but it is not.

undefined value a:
  File "/theia/scratch/brussel/vo/000/bvo00005/vsc10122/easybuild/install/zen2/build/PyTorch/1.10.0/foss-2021a-CUDA-11.3.1/pytorch/test/jit/test_builtins.py", line 94
                a = x ** 2

----------------------------------------------------------------------

The test does pass if assertRaisesRegexWithHighlight() is replaced by a regular string match with assertRaisesRegex().
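
For illustration, a plain string match of this kind can be sketched with the standard unittest module. This is a minimal sketch and not the actual PyTorch test: PlainRegexExample and raise_undefined are hypothetical names, and the RuntimeError is raised by hand with a message modelled on the output above instead of coming from the TorchScript compiler.

```python
import unittest


class PlainRegexExample(unittest.TestCase):
    """Hypothetical stand-in for the real test; it does not import torch."""

    def test_message_only(self):
        def raise_undefined():
            # Assumption: a hand-written message shaped like the one in the
            # report, raised directly instead of by the TorchScript compiler.
            raise RuntimeError(
                "\nundefined value a:\n"
                + " " * 16 + "return a\n"
                + " " * 23 + "~ <--- HERE\n"
            )

        # assertRaisesRegex only searches the message text for the pattern;
        # it never inspects where the '~' marker points.
        with self.assertRaisesRegex(RuntimeError, "undefined value"):
            raise_undefined()
```

Because only the message text is matched, a check like this keeps passing even when the highlight is missing or points at the wrong token.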

Versions

Collecting environment information...
PyTorch version: 1.10.2
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: CentOS Linux release 7.9.2009 (Core) (x86_64)
GCC version: (GCC) 10.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.17

Python version: 3.9.5 (default, Sep 27 2021, 19:51:54)  [GCC 10.3.0] (64-bit runtime)
Python platform: Linux-3.10.0-1160.49.1.el7.x86_64-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 11.3.109
GPU models and configuration: 
GPU 0: NVIDIA A100-PCIE-40GB
GPU 1: NVIDIA A100-PCIE-40GB

Nvidia driver version: 470.82.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.3
[pip3] torch==1.10.2
[conda] Could not collect

davidberard98 commented 2 years ago

@lexming thanks for the report - unfortunately, I'm not able to reproduce this.

Could you try running with TORCH_SHOW_CPP_STACKTRACES=1 to see if we can get some info on exactly where this is failing?

e.g.

$ TORCH_SHOW_CPP_STACKTRACES=1 python3 test/test_jit.py -k test_del -v

Note: I think we do want to keep using assertRaisesRegexWithHighlight(), since in this case we're trying to check what's getting marked as "undefined value". But the test should be passing, since we can see below that a is getting highlighted:

                return a
                       ~ <--- HERE
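
To make the idea of a highlight check concrete, here is a rough approximation in plain Python. It is not the FileCheck implementation: highlighted_text is a hypothetical helper, and it assumes the marker line is a run of tildes followed by " <--- HERE", with the highlighted source sitting in the same columns on the line directly above.

```python
import re
from typing import Optional

# Hypothetical helper, not part of PyTorch. Assumes the marker line looks
# like "<spaces>~~~ <--- HERE", as in the error output quoted in this issue.
MARKER = re.compile(r"^(\s*)(~+) <--- HERE\s*$")


def highlighted_text(message: str) -> Optional[str]:
    """Return the source text directly above the '~' run, or None."""
    lines = message.splitlines()
    for i, line in enumerate(lines):
        match = MARKER.match(line)
        if match and i > 0:
            # Columns covered by the tildes select the highlighted token.
            start, end = match.start(2), match.end(2)
            return lines[i - 1][start:end]
    return None


# Message shaped like the compiler error in the report (indentation built
# programmatically so the columns line up exactly).
full_message = (
    "undefined value a:\n"
    + " " * 16 + "a = x ** 2\n"
    + " " * 16 + "del a\n"
    + " " * 16 + "return a\n"
    + " " * 23 + "~ <--- HERE\n"
)

# A message with no marker line at all.
no_marker = "undefined value a:\n" + " " * 16 + "a = x ** 2\n"

print(highlighted_text(full_message))
print(highlighted_text(no_marker))
```

With the full message the helper finds the token under the marker; with a message that has no marker line it returns None, which is the situation a highlight assertion would report as a failure.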

lexming commented 2 years ago

@davidberard98 I ran the test with TORCH_SHOW_CPP_STACKTRACES=1, but it did not generate any additional output:

$ TORCH_SHOW_CPP_STACKTRACES=1 python3 test/test_jit.py -k test_del -v
Fail to import hypothesis in common_utils, tests are not derandomized
CUDA not available, skipping tests
monkeytype is not installed. Skipping tests for Profile-Directed Typing
test_del (jit.test_builtins.TestBuiltins) ... ERROR
test_del_multiple_operands (jit.test_builtins.TestBuiltins) ... ok
test_del (jit.test_list_dict.TestDict) ... ok
test_del (jit.test_list_dict.TestList) ... ok
test_delitem (jit.test_list_dict.TestScriptDict)
Test deletion. ... ok
test_delitem (jit.test_list_dict.TestScriptList)
Test deletion. ... ok

======================================================================
ERROR: test_del (jit.test_builtins.TestBuiltins)
----------------------------------------------------------------------
RuntimeError: 
undefined value a:
  File "/theia/scratch/brussel/vo/000/bvo00005/vsc10122/easybuild/install/skylake/build/PyTorch/1.10.0/foss-2021a/pytorch/test/jit/test_builtins.py", line 94
                a = x ** 2
                del a
                return a
                       ~ <--- HERE

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/theia/scratch/brussel/vo/000/bvo00005/vsc10122/easybuild/install/skylake/build/PyTorch/1.10.0/foss-2021a/pytorch/test/jit/test_builtins.py", line 91, in test_del
    def fn(x):
  File "/user/brussel/101/vsc10122/easybuild/install/skylake/tmp/eb-sn96YS/tmpWlRsMd/lib/python3.9/site-packages/torch/testing/_internal/jit_utils.py", line 92, in __exit__
    FileCheck().check_source_highlighted(self.highlight).run(str(value))
RuntimeError: Expected to find "a"highlighted but it is not.

undefined value a:
  File "/theia/scratch/brussel/vo/000/bvo00005/vsc10122/easybuild/install/skylake/build/PyTorch/1.10.0/foss-2021a/pytorch/test/jit/test_builtins.py", line 94
                a = x ** 2

----------------------------------------------------------------------
Ran 6 tests in 0.477s

FAILED (errors=1)

My original report used a shared filesystem (GPFS), which I thought might have played a role due to its higher latency, but that is not the case. I also ran this test on a more traditional setup using a local filesystem, and the highlight detection still fails.

Let me know if there is anything else I can do to provide more information.