ROCm / HIPIFY

HIPIFY: Convert CUDA to Portable C++ Code
https://rocm.docs.amd.com/projects/HIPIFY/en/latest/
MIT License

[hipify_python] recursion error reached, xformers (hipifying nv_cutlass) #660

Closed: cornpo closed this issue 1 year ago

cornpo commented 1 year ago

hipify_python from ROCm runs into a maximum recursion error.

  1. Install rocm 5.3
  2. pip install git+https://github.com/facebookresearch/xformers or pip install xformers
  3. Wish you'd spent the extra $600
× python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [1508 lines of output]
      /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/fmha_utils.h -> /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/fmha_utils_hip.h [ok]
      /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/fmha.h -> /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/fmha_hip.h [ok]
      /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/fmha_api.cpp -> /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/fmha_api_hip.cpp [ok]
      /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/static_switch.h -> /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/static_switch.h [skipped, no changes]
      /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/fp16_switch.h -> /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/fp16_switch.h [skipped, no changes]
      /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/philox.cuh -> /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/philox.cuh [skipped, no changes]
      /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/fmha/utils.h -> /tmp/pip-req-build-r9do6iix/third_party/flash-attention/csrc/flash_attn/src/fmha/utils_hip.h [ok]
      /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/cutlass.h -> /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/cutlass.h [skipped, no changes]
      /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/platform/platform.h -> /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/platform/platform.h [skipped, no changes]
      /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/integer_subbyte.h -> /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/integer_subbyte.h [skipped, no changes]
      /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/half.h -> /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/half_hip.h [ok]
      /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/bfloat16.h -> /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/bfloat16.h [skipped, no changes]
      /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/tfloat32.h -> /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/tfloat32.h [skipped, no changes]
      /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/numeric_types.h -> /tmp/pip-req-build-r9do6iix/third_party/cutlass/include/cutlass/numeric_types_hip.h [ok]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-req-build-r9do6iix/setup.py", line 258, in <module>
          ext_modules=get_extensions(),
        File "/tmp/pip-req-build-r9do6iix/setup.py", line 214, in get_extensions
          ext_modules += get_flash_attention_extensions(
        File "/tmp/pip-req-build-r9do6iix/setup.py", line 108, in get_flash_attention_extensions
          CUDAExtension(
        File "/home/cornpop/conda/envs/shit/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1010, in CUDAExtension
          hipify_result = hipify_python.hipify(
        File "/home/cornpop/conda/envs/shit/lib/python3.9/site-packages/torch/utils/hipify/hipify_python.py", line 1042, in hipify
          preprocess_file_and_save_result(output_directory, filepath, all_files, header_include_dirs,
        File "/home/cornpop/conda/envs/shit/lib/python3.9/site-packages/torch/utils/hipify/hipify_python.py", line 176, in preprocess_file_and_save_result
          result = preprocessor(output_directory, filepath, all_files, header_include_dirs, stats,
        File "/home/cornpop/conda/envs/shit/lib/python3.9/site-packages/torch/utils/hipify/hipify_python.py", line 829, in preprocessor
          output_source = RE_QUOTE_HEADER.sub(mk_repl('#include "{0}"', True), output_source)
        File "/home/cornpop/conda/envs/shit/lib/python3.9/site-packages/torch/utils/hipify/hipify_python.py", line 819, in repl

Then the calls into hipify_python.py repeat until Python's recursion limit is exceeded.
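
A crude stopgap, assuming the include chain is merely very deep rather than truly circular, would be to raise the interpreter's recursion limit before the step that calls hipify; if the headers genuinely include each other in a cycle, this won't help:

    # Stopgap sketch only: give the recursive preprocessor more headroom.
    # The default limit is usually 1000; this is useless if the includes
    # actually form a cycle.
    import sys
    sys.setrecursionlimit(10000)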

Expected behavior

hipify_python.py hipifies cutlass and xformers without error.

Environment

PyTorch version: 1.12.1+rocm5.1.1
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 5.1.20531-cacfa990

OS: Ubuntu 22.04.1 LTS (x86_64)
GCC version: (Ubuntu 11.2.0-19ubuntu1) 11.2.0
Clang version: 14.0.0-1ubuntu1
CMake version: version 3.23.1
Libc version: glibc-2.35

Python version: 3.9.13 (main, Oct 13 2022, 21:15:33) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-50-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.5.119
GPU models and configuration: AMD Radeon RX 6800 XT
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 5.1.20531
MIOpen runtime version: 2.16.0
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] clip-anytorch==2.5.0
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.21.6
[pip3] pytorch-benchmark==0.3.4
[pip3] pytorch-lightning==1.7.7
[pip3] torch==1.12.1+rocm5.1.1
[pip3] torch-ort==1.12.0
[pip3] torchaudio==0.12.1+rocm5.1.1
[pip3] torchdiffeq==0.2.3
[pip3] torchmetrics==0.10.0
[pip3] torchtext==0.13.1
[pip3] torchvision==0.13.1+rocm5.1.1
[conda] clip-anytorch 2.5.0 pypi_0 pypi
[conda] numpy 1.21.6 pypi_0 pypi
[conda] pytorch-benchmark 0.3.4 pypi_0 pypi
[conda] pytorch-lightning 1.7.7 pypi_0 pypi
[conda] torch 1.12.1+rocm5.1.1 pypi_0 pypi
[conda] torch-ort 1.12.0 pypi_0 pypi
[conda] torchaudio 0.12.1+rocm5.1.1 pypi_0 pypi
[conda] torchdiffeq 0.2.3 pypi_0 pypi
[conda] torchmetrics 0.10.0 pypi_0 pypi
[conda] torchtext 0.13.1 pypi_0 pypi
[conda] torchvision 0.13.1+rocm5.1.1 pypi_0 pypi

I'll put the link to the cutlass bug report here later, and vice versa. FYI, they're over here https://github.com/HazyResearch/flash-attention/issues/6 jamming cutlass into everything as fast as they can.

https://github.com/facebookresearch/xformers/issues/485

emankov commented 1 year ago

hipify_python.py is not part of the HIPIFY tools. It probably lives here: https://github.com/ROCmSoftwarePlatform/hipify_torch.

emankov commented 1 year ago

I can't transfer this issue to the hipify_torch repo, so please create a new one there. Closing this as unrelated.

cornpo commented 1 year ago

I don't have hipify_python.py in /opt/rocm, but it is in the site-packages of my conda environment; I suppose the torch+rocm Python package installs it. So I cloned the latest https://github.com/ROCmSoftwarePlatform/hipify_torch and ran diff hipify_python.py ~/ml/hipify_torch/hipify/hipify_python.py from inside site-packages, comparing what torch installs with what's in the ROCm GitHub repo.

Just leaving this as friendly breadcrumbs for the curious. You might assume, as I did, that things are hipified with tools from the ROCm directory; in fact the hipify module is supplied by PyTorch itself. It lives here in ROCm https://github.com/ROCmSoftwarePlatform/hipify_torch and here in PyTorch https://github.com/pytorch/pytorch/blob/ff5fe9e62284cb0a3ca6976c40978c9022c4503f/torch/utils/hipify/hipify_python.py
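
A quick sanity check (just a sketch) to confirm which copy of hipify_python your environment actually loads:

    # Print the path of the hipify_python module that gets imported; with a
    # pip-installed torch+rocm wheel it points into site-packages, not into
    # /opt/rocm.
    import torch.utils.hipify.hipify_python as hipify_python
    print(hipify_python.__file__)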

Thanks, Evgeny.

emankov commented 1 year ago

@cornpo,

hipify_python.py, being a (senseless) fork of the HIPIFY tools, is unrelated to this repository. Please file a ticket with its authors.

Thanks

cornpo commented 1 year ago

https://github.com/ROCmSoftwarePlatform/hipify_torch/issues/39

Do I work for AMD? Nope. You do though. So how about you put your bug report where it's supposed to go.

emankov commented 1 year ago

So how about you put your bug report where it's supposed to go.

  1. It is not my bug report; it is your bug report, which you initially filed against the wrong repository.
  2. It is up to you whether or not to file the report in the correct repository.
  3. This is an open-source project, but that does not mean that everything unrelated to it should be triaged and reassigned by the project collaborators.
  4. I informed you that I can't transfer your bug report to the correct repository, but that doesn't mean I have to file it for you.

Regards, Evgeny

cornpo commented 1 year ago

I am not responsible for AMD's product. One bug report to the people getting paid to make GPUs is enough. I'm not whining about a pull request to OpenOffice. You, AMD as it were, sell machines and make promises to go with them.

Anyway, you do you and I'll put my money elsewhere.

jithunnair-amd commented 1 year ago

This PR https://github.com/pytorch/pytorch/pull/104085 should help with the circular recursion issue observed when using PyTorch's hipify.
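
For context, the general pattern behind that kind of fix is to track which files are already being preprocessed so that headers that include each other terminate instead of recursing until the interpreter's limit is hit. A hypothetical sketch of the idea (not the actual code from that PR):

    import os
    import re

    # Hypothetical sketch of a cycle guard for a recursive include
    # preprocessor; NOT the code from pytorch/pytorch#104085, just the idea.
    INCLUDE_RE = re.compile(r'#include\s+"([^"]+)"')

    def preprocess(path, seen=None):
        seen = set() if seen is None else seen
        real = os.path.realpath(path)
        if real in seen:
            # Already visited (possibly via a circular include): stop here
            # instead of recursing until the interpreter's limit is hit.
            return
        seen.add(real)
        try:
            with open(real) as f:
                source = f.read()
        except OSError:
            return
        base = os.path.dirname(real)
        for included in INCLUDE_RE.findall(source):
            preprocess(os.path.join(base, included), seen)
        # ... a real tool would rewrite CUDA identifiers and save the result here ...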

foreverlms commented 1 month ago

Ridiculous. AMD wants to keep pace with Nvidia but keeps only one active maintainer for this project.