pytorch / benchmark

TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
BSD 3-Clause "New" or "Revised" License

Basic installation fails. #1971

Closed: jdgh000 closed 11 months ago

jdgh000 commented 1 year ago

Ubuntu 20.04, ROCm 5.x


git remote -v
origin  https://github.com/pytorch/benchmark.git (fetch)
origin  https://github.com/pytorch/benchmark.git (push)
git branch
* main

python3 install.py 
checking packages torch, torchvision, torchaudio are installed...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/BERT_pytorch...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/Background_Matting...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/DALLE2_pytorch...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/LearningToPaint...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/Super_SloMo...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/alexnet...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/basic_gnn_edgecnn...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/basic_gnn_gcn...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/basic_gnn_gin...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/basic_gnn_sage...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/cm3leon_generate...SKIP - No install.py is found
running setup for /root/gg/git/benchmark/torchbenchmark/models/dcgan...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/demucs...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/densenet121...OK
running setup for /root/gg/git/benchmark/torchbenchmark/models/detectron2_fasterrcnn_r_101_c4...FAIL
Error for /root/gg/git/benchmark/torchbenchmark/models/detectron2_fasterrcnn_r_101_c4:
---------------------------------------------------------------------------
Checking out https://ossci-datasets.s3.amazonaws.com/torchbench/data/coco128.tar.gz to /root/gg/git/benchmark/torchbenchmark/data/coco128.tar.gz
decompressing input tarball: /root/gg/git/benchmark/torchbenchmark/data/coco128.tar.gz...  WARNING: Did not find branch or tag '1a4df4d', assuming revision or ref.
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-q9cs5wr_/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-q9cs5wr_/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-fuzeg0kg
       cwd: /tmp/pip-req-build-q9cs5wr_/
  Complete output (1098 lines):
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated.h -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated.h [skipped, no changes]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated.h -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated.h [skipped, no changes]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/cocoeval/cocoeval.h -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/cocoeval/cocoeval.h [skipped, no changes]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/deformable/deform_conv.h -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/deformable/deform_conv.h [skipped, no changes]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/nms_rotated/nms_rotated.h -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/nms_rotated/nms_rotated.h [skipped, no changes]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/vision.cpp -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/vision.cpp [skipped, no changes]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated_cpu.cpp -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated_cpu.cpp [skipped, no changes]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_utils.h -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_utils_hip.h [skipped, already hipified]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cpu.cpp -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cpu_hip.cpp [skipped, already hipified]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/cocoeval/cocoeval.cpp -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/cocoeval/cocoeval.cpp [skipped, no changes]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/nms_rotated/nms_rotated_cpu.cpp -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/nms_rotated/nms_rotated_cpu_hip.cpp [skipped, already hipified]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated_cuda.cu -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/ROIAlignRotated/ROIAlignRotated_hip.hip [skipped, already hipified]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cuda.cu -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_hip.hip [skipped, already hipified]
  /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/deformable/deform_conv_cuda.cu -> /tmp/pip-req-build-q9cs5wr_/detectron2/layers/csrc/deformable/deform_conv_cuda.cu [skipped, no changes]
...
    In file included from /usr/local/lib/python3.8/dist-packages/torch/include/c10/core/ScalarType.h:5:
    In file included from /usr/local/lib/python3.8/dist-packages/torch/include/c10/util/Half.h:15:
    /usr/local/lib/python3.8/dist-packages/torch/include/c10/util/complex.h:8:10: fatal error: 'thrust/complex.h' file not found
    #include <thrust/complex.h>
             ^~~~~~~~~~~~~~~~~~
    26 warnings and 1 error generated when compiling for gfx1030.
    error: command '/opt/rocm-5.2.0/bin/hipcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-q9cs5wr_/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-q9cs5wr_/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-2pn80r5v/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.8/detectron2 Check the logs for full command output.
OK
Traceback (most recent call last):
  File "install.py", line 8, in <module>
    install_detectron2(MODEL_NAME, MODEL_DIR)
  File "/root/gg/git/benchmark/torchbenchmark/util/framework/detectron2/__init__.py", line 59, in install_detectron2
    pip_install_requirements()
  File "/root/gg/git/benchmark/torchbenchmark/util/framework/detectron2/__init__.py", line 41, in pip_install_requirements
    subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-q', '-r', requirements_file])
  File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-m', 'pip', 'install', '-q', '-r', '/root/gg/git/benchmark/torchbenchmark/util/framework/detectron2/requirements.txt']' returned non-zero exit status 1.

---------------------------------------------------------------------------

Traceback (most recent call last):
  File "install.py", line 65, in <module>
    raise RuntimeError("Failed to complete setup")
RuntimeError: Failed to complete setup

xuzhao9 commented 1 year ago

TorchBench only supports PyTorch nightly builds, not stable releases. Can you please try with a PyTorch nightly?
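
One way to tell which kind of build is installed is to look at the version string. The sketch below assumes that nightly wheels carry a `.devYYYYMMDD` segment in `torch.__version__` (for example `2.2.0.dev20231010+rocm5.6`) and that stable releases do not; the helper name `looks_like_nightly` is hypothetical and not part of TorchBench.

```python
import re


def looks_like_nightly(version: str) -> bool:
    """Return True if the version string contains a '.devYYYYMMDD' segment.

    Assumption: nightly wheels are versioned this way; stable releases are not.
    """
    return re.search(r"\.dev\d{8}", version) is not None


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not importable from this interpreter")
    else:
        kind = "nightly" if looks_like_nightly(torch.__version__) else "not a nightly"
        print(f"torch {torch.__version__}: {kind}")
```

A source build typically reports a different suffix, so treat a negative result as "not recognizably a nightly wheel" rather than proof of a stable release.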

NathanielMcVicar commented 11 months ago

I'm experiencing what looks like the same thing. The torch nightlies appear to be installed correctly; log below. I suspect it's related to this issue: https://github.com/facebookresearch/detectron2/issues/4472

running setup for /home/namcvica/benchmark/torchbenchmark/models/detectron2_fasterrcnn_r_101_c4...decompressing input tarball: /home/namcvica/benchmark/torchbenchmark/data/coco128.tar.gz...OK
  WARNING: Did not find branch or tag '1a4df4d', assuming revision or ref.
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [17 lines of output]
      Traceback (most recent call last):
        File "/home/namcvica/torchbench/venv/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/namcvica/torchbench/venv/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/namcvica/torchbench/venv/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-vmawylf4/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 355, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-vmawylf4/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 325, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-vmawylf4/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 507, in run_setup
          super(_BuildMetaLegacyBackend, self).run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-vmawylf4/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 341, in run_setup
          exec(code, locals())
        File "<string>", line 10, in <module>
      ModuleNotFoundError: No module named 'torch'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Traceback (most recent call last):
  File "/home/namcvica/benchmark/torchbenchmark/models/detectron2_fasterrcnn_r_101_c4/install.py", line 8, in <module>
    install_detectron2(MODEL_NAME, MODEL_DIR)
  File "/home/namcvica/benchmark/torchbenchmark/util/framework/detectron2/__init__.py", line 59, in install_detectron2
    pip_install_requirements()
  File "/home/namcvica/benchmark/torchbenchmark/util/framework/detectron2/__init__.py", line 41, in pip_install_requirements
    subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-q', '-r', requirements_file])
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/namcvica/torchbench/venv/bin/python3', '-m', 'pip', 'install', '-q', '-r', '/home/namcvica/benchmark/torchbenchmark/util/framework/detectron2/requirements.txt']' returned non-zero exit status 1.
FAIL

xuzhao9 commented 11 months ago

TorchBench requires PyTorch to be available at install time. Could you please check that the PyTorch nightly is installed in your environment by running /home/namcvica/torchbench/venv/bin/python3 -c 'import torch;'? @NathanielMcVicar
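
The same check can be scripted so that it also reports which interpreter is doing the looking, since a different `python3` than the one that runs install.py may give a different answer. This is a minimal sketch using only the standard library; `module_available` is a hypothetical helper name.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the running interpreter can locate a top-level module."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Run this with the same interpreter that runs install.py.
    print("interpreter:", sys.executable)
    print("torch importable:", module_available("torch"))
```

`find_spec` only locates the package without importing it, so a broken install can still fail at import time; the one-line `import torch` check remains the stronger test.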

NathanielMcVicar commented 11 months ago

Thanks for taking a look! That works. I don't think there are any environment issues. The other tests all install and run correctly, and I can mostly run run_benchmark.py dynamo as well. I believe this is a detectron2-specific issue.
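
One reading of the traceback above: the failing frames sit under /tmp/pip-build-env-vmawylf4/overlay, which is the isolated environment pip creates to build a package, and packages installed in the venv (torch included) are not visible there. That would explain why `import torch` works in the venv while the detectron2 build reports `ModuleNotFoundError`. The sketch below rebuilds the pip command shown in the traceback and optionally adds pip's `--no-build-isolation` flag. The helper name `pip_install_cmd` is hypothetical, and the thread does not confirm that this resolves the failure.

```python
import sys
from typing import List


def pip_install_cmd(requirements_file: str, build_isolation: bool = True) -> List[str]:
    """Build the pip command from the traceback, optionally without build isolation."""
    cmd = [sys.executable, "-m", "pip", "install", "-q", "-r", requirements_file]
    if not build_isolation:
        # Build against the current environment, where torch is already installed.
        cmd.append("--no-build-isolation")
    return cmd


if __name__ == "__main__":
    # Placeholder path; substitute the detectron2 requirements.txt from the traceback.
    print(" ".join(pip_install_cmd("requirements.txt", build_isolation=False)))
```

With isolation disabled, pip no longer installs build requirements automatically, so setuptools and wheel need to be present in the environment already.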

jdgh000 commented 11 months ago

It seems a later version of ROCm worked.
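
The original failure was hipcc under /opt/rocm-5.2.0 not finding thrust/complex.h. A quick way to compare ROCm installs is to check whether that header is present under each one. The sketch assumes the header would live at `<root>/include/thrust/complex.h` and that installs sit under /opt/rocm*; both are assumptions about the layout, and `find_header` is a hypothetical helper.

```python
from pathlib import Path
from typing import Iterable, List


def find_header(roots: Iterable[str], relative: str = "thrust/complex.h") -> List[Path]:
    """Return the paths <root>/include/<relative> that exist as files."""
    candidates = (Path(root) / "include" / relative for root in roots)
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    # Assumption: ROCm installs live under /opt/rocm* (e.g. /opt/rocm-5.2.0).
    roots = sorted(str(p) for p in Path("/opt").glob("rocm*"))
    hits = find_header(roots)
    if hits:
        for hit in hits:
            print("found:", hit)
    else:
        print("thrust/complex.h not found under:", roots or "/opt/rocm*")
```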