state-spaces / mamba

Mamba SSM architecture
Apache License 2.0

causal-conv1d installation error #55

Open emadkavousi opened 11 months ago

emadkavousi commented 11 months ago

Hi. I am trying to install and use Mamba, but I can't install causal-conv1d with pip. I then tried to build it from source, but I get the same error. Please help.

Building wheel for causal-conv1d (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [113 lines of output]

  torch.__version__  = 1.13.1+cu117

  running bdist_wheel
  Guessing wheel URL:  https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.0.2/causal_conv1d-1.0.2+cu118torch1.13cxx11abiFALSE-cp37-cp37-linux_x86_64.whl
  Precompiled wheel not found. Building from source...
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-cpython-37
  creating build/lib.linux-x86_64-cpython-37/causal_conv1d
  copying causal_conv1d/__init__.py -> build/lib.linux-x86_64-cpython-37/causal_conv1d
  copying causal_conv1d/causal_conv1d_interface.py -> build/lib.linux-x86_64-cpython-37/causal_conv1d
  running build_ext
  /root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py:387: UserWarning: The detected CUDA version (11.8) has a minor version mismatch with the version that was used to compile PyTorch (11.7). Most likely this shouldn't be a problem.
    warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
  /root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py:397: UserWarning: There are no g++ version bounds defined for CUDA version 11.8
    warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
  building 'causal_conv1d_cuda' extension
  creating /tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/build/temp.linux-x86_64-cpython-37
  creating /tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/build/temp.linux-x86_64-cpython-37/csrc
  Emitting ninja build file /tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/build/temp.linux-x86_64-cpython-37/build.ninja...
  Compiling objects...
  Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
  ninja: error: '/tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/csrc/causal_conv1d.cpp', needed by '/tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/build/temp.linux-x86_64-cpython-37/csrc/causal_conv1d.o', missing and no known rule to make it
  Traceback (most recent call last):
    File "/tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/setup.py", line 207, in run
      urllib.request.urlretrieve(wheel_url, wheel_filename)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/urllib/request.py", line 247, in urlretrieve
      with contextlib.closing(urlopen(url, data)) as fp:
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/urllib/request.py", line 222, in urlopen
      return opener.open(url, data, timeout)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/urllib/request.py", line 531, in open
      response = meth(req, response)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/urllib/request.py", line 641, in http_response
      'http', request, response, code, msg, hdrs)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/urllib/request.py", line 569, in error
      return self._call_chain(*args)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/urllib/request.py", line 503, in _call_chain
      result = func(*args)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/urllib/request.py", line 649, in http_error_default
      raise HTTPError(req.full_url, code, msg, hdrs, fp)
  urllib.error.HTTPError: HTTP Error 404: Not Found

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1906, in _run_ninja_build
      env=env)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/subprocess.py", line 512, in run
      output=stdout, stderr=stderr)
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 36, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/setup.py", line 264, in <module>
      "ninja",
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/__init__.py", line 87, in setup
      return distutils.core.setup(**attrs)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/core.py", line 185, in setup
      return run_commands(dist)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
      dist.run_commands()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/dist.py", line 1208, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/setup.py", line 224, in run
      super().run()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/wheel/bdist_wheel.py", line 325, in run
      self.run_command("build")
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/dist.py", line 1208, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build.py", line 132, in run
      self.run_command(cmd_name)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/dist.py", line 1208, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 84, in run
      _build_ext.run(self)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
      self.build_extensions()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
      build_ext.build_extensions(self)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 468, in build_extensions
      self._build_extensions_serial()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 494, in _build_extensions_serial
      self.build_extension(ext)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
      _build_ext.build_extension(self, ext)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 556, in build_extension
      depends=ext.depends,
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 668, in unix_wrap_ninja_compile
      with_cuda=with_cuda)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1578, in _write_ninja_file_and_compile_objects
      error_prefix='Error compiling objects for extension')
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for causal-conv1d
Running setup.py clean for causal-conv1d
Failed to build causal-conv1d
Installing collected packages: argparse, causal-conv1d
Running setup.py install for causal-conv1d ... error
error: subprocess-exited-with-error

× Running setup.py install for causal-conv1d did not run successfully.
│ exit code: 1
╰─> [92 lines of output]

  torch.__version__  = 1.13.1+cu117

  running install
  /root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
    setuptools.SetuptoolsDeprecationWarning,
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-cpython-37
  creating build/lib.linux-x86_64-cpython-37/causal_conv1d
  copying causal_conv1d/__init__.py -> build/lib.linux-x86_64-cpython-37/causal_conv1d
  copying causal_conv1d/causal_conv1d_interface.py -> build/lib.linux-x86_64-cpython-37/causal_conv1d
  running build_ext
  /root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py:387: UserWarning: The detected CUDA version (11.8) has a minor version mismatch with the version that was used to compile PyTorch (11.7). Most likely this shouldn't be a problem.
    warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
  /root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py:397: UserWarning: There are no g++ version bounds defined for CUDA version 11.8
    warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
  building 'causal_conv1d_cuda' extension
  creating /tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/build/temp.linux-x86_64-cpython-37
  creating /tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/build/temp.linux-x86_64-cpython-37/csrc
  Emitting ninja build file /tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/build/temp.linux-x86_64-cpython-37/build.ninja...
  Compiling objects...
  Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
  ninja: error: '/tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/csrc/causal_conv1d.cpp', needed by '/tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/build/temp.linux-x86_64-cpython-37/csrc/causal_conv1d.o', missing and no known rule to make it
  Traceback (most recent call last):
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1906, in _run_ninja_build
      env=env)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/subprocess.py", line 512, in run
      output=stdout, stderr=stderr)
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 36, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/tmp/pip-install-ite1402y/causal-conv1d_6898db0f5bcd4aa9b82b3dd0dca603f7/setup.py", line 264, in <module>
      "ninja",
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/__init__.py", line 87, in setup
      return distutils.core.setup(**attrs)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/core.py", line 185, in setup
      return run_commands(dist)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
      dist.run_commands()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/dist.py", line 1208, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/command/install.py", line 68, in run
      return orig.install.run(self)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/install.py", line 698, in run
      self.run_command('build')
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/dist.py", line 1208, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build.py", line 132, in run
      self.run_command(cmd_name)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/dist.py", line 1208, in run_command
      super().run_command(command)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 84, in run
      _build_ext.run(self)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
      self.build_extensions()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
      build_ext.build_extensions(self)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 468, in build_extensions
      self._build_extensions_serial()
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 494, in _build_extensions_serial
      self.build_extension(ext)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
      _build_ext.build_extension(self, ext)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 556, in build_extension
      depends=ext.depends,
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 668, in unix_wrap_ninja_compile
      with_cuda=with_cuda)
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1578, in _write_ninja_file_and_compile_objects
      error_prefix='Error compiling objects for extension')
    File "/root/anaconda3/envs/DiffGesture/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> causal-conv1d

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
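For anyone reading the log: the "Guessing wheel URL" line suggests the installer builds a wheel filename from the package version, CUDA tag, torch version, C++ ABI flag, Python tag, and platform, and falls back to a source build when that file is not found (the HTTP 404 in the traceback). Below is a minimal sketch that reproduces the URL from the log. The function name and parameters are hypothetical and chosen for illustration; this is not the package's actual setup.py code.

```python
def guess_wheel_url(version, cuda_tag, torch_tag, cxx11_abi, python_tag, platform_tag):
    """Assemble a release-wheel URL in the same shape as the one printed in the log.

    Hypothetical helper: the layout is inferred from the single URL in the log above.
    """
    abi = "TRUE" if cxx11_abi else "FALSE"
    filename = (
        f"causal_conv1d-{version}+{cuda_tag}torch{torch_tag}"
        f"cxx11abi{abi}-{python_tag}-{python_tag}-{platform_tag}.whl"
    )
    base = "https://github.com/Dao-AILab/causal-conv1d/releases/download"
    return f"{base}/v{version}/{filename}"


# Values taken from the log: v1.0.2, cu118, torch 1.13, ABI FALSE, cp37, linux_x86_64.
url = guess_wheel_url("1.0.2", "cu118", "1.13", False, "cp37", "linux_x86_64")
print(url)
```

If no release asset has exactly that name, the download fails and the build falls through to compiling from source, which is where the ninja error appears.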

evelynmitchell commented 11 months ago

You are using pip with an anaconda installation of python. You may want to attempt installing causal-conv1d with conda, but I don't know if it will work.

ZiQi-Jiang commented 11 months ago

I have this problem, too. :)

Gyu1291 commented 11 months ago

I also have this problem 😭

hrbigelow commented 11 months ago

I'm not familiar with ninja, but I was able to build causal_conv1d from source.

First, note that these two commands should produce matching CUDA versions:

python3 -c 'import torch; print(torch.version.cuda)'
nvcc --version

since nvcc will be used during the build of causal_conv1d. If they don't, you might need to do:

sudo update-alternatives --config cuda

and then set the CUDA alternatives version to the version reported by torch.version.cuda

Then:

git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d
git checkout v1.0.2  # this is the highest compatible version allowed by Mamba
CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .

More detail is in this other issue although that issue doesn't particularly deal with this one.

Please let me know if this works!
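The version comparison at the start of this comment can also be scripted. This is a rough sketch, assuming `nvcc --version` prints a line containing "release X.Y" (which is how current CUDA toolkits report it); the helper names are made up for illustration.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' release string from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def cuda_versions_match(torch_cuda, nvcc_release):
    """True only when both versions are known and share the same major.minor."""
    if not torch_cuda or not nvcc_release:
        return False
    return torch_cuda.split(".")[:2] == nvcc_release.split(".")[:2]


def check_environment():
    """Compare torch.version.cuda with the nvcc found on PATH and print both."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return False
    if shutil.which("nvcc") is None:
        print("nvcc was not found on PATH")
        return False
    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
    nvcc_release = parse_nvcc_release(out)
    print(f"torch CUDA: {torch.version.cuda}, nvcc CUDA: {nvcc_release}")
    return cuda_versions_match(torch.version.cuda, nvcc_release)


if __name__ == "__main__":
    check_environment()
```

With the setup from the original report (PyTorch built against 11.7, nvcc at 11.8), `cuda_versions_match("11.7", "11.8")` returns False, which is the mismatch the warning in the log points at.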

signalprime commented 10 months ago

I just realized it requires CUDA even building without

Marxist-Leninist commented 10 months ago

@hrbigelow Could you upload the module you compiled to cloud storage and link it here, to save everyone the hassle of doing all that?

ankhzet commented 10 months ago

IIRC, the csrc directory is either absent or referenced by the wrong path, so installing causal-conv1d fails with this error whenever there is no prebuilt wheel for your setup. I ran into the same issue when installing the mamba-chat repo locally on a Windows machine; I ended up building causal-conv1d manually and adding csrc from the repo in the correct location. Also, make sure nvcc and the compiler binaries are on your PATH before building.
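The two checks above (csrc present, tools on PATH) can be run before starting a build. A minimal sketch, assuming the source file path `csrc/causal_conv1d.cpp` named in the ninja error; the `preflight` helper and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def preflight(source_dir, required_tools=("nvcc",)):
    """Return a list of problems that would likely break a source build."""
    problems = []
    # The ninja error in this issue complains about exactly this file being missing.
    cpp = Path(source_dir) / "csrc" / "causal_conv1d.cpp"
    if not cpp.is_file():
        problems.append(f"missing source file: {cpp}")
    # shutil.which returns None when a tool cannot be found on PATH.
    for tool in required_tools:
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print(problem)
```

Run it from the root of the cloned causal-conv1d checkout; an empty result means neither of these two problems was detected, not that the build is guaranteed to succeed.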

Side note: I ultimately failed to run it, due to the absence of prebuilt triton bindings for my setup and a lack of free time, though %).

fivejjs commented 10 months ago

I found that it works once the native CUDA version is aligned with PyTorch's CUDA version.

takeraparterer commented 10 months ago

[ T=c10::AliasInfo ]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=c10::AliasInfo ]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/function_schema.h(28): note: see reference to class template instantiation 'c10::optional' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=c10::AliasInfo ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=std::vector<c10::SymInt,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=std::vector<c10::SymInt,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=std::vector<c10::SymInt,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=std::vector<c10::SymInt,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/ivalue.h(96): note: see reference to class template instantiation 'c10::optional<std::vector<T,std::allocator>>' being compiled
with [ T=c10::SymInt ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h(378): note: see reference to class template instantiation 'c10::OptionalArray' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h(388): note: see reference to class template instantiation 'c10::impl::ivalue_to_arg<c10::OptionalArray,AllowDeprecatedTypes>' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=std::vector<c10::SymInt,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=c10::either<c10::OperatorName,c10::FunctionSchema> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=c10::either<c10::OperatorName,c10::FunctionSchema> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=c10::either<c10::OperatorName,c10::FunctionSchema> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=c10::either<c10::OperatorName,c10::FunctionSchema> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/op_registration/op_registration.h(434): note: see reference to class template instantiation 'c10::optional<c10::either<c10::OperatorName,c10::FunctionSchema>>' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=c10::either<c10::OperatorName,c10::FunctionSchema> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=at::StepCallbacks ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=at::StepCallbacks ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=at::StepCallbacks ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=at::StepCallbacks ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/autograd/function.h(166): note: see reference to class template instantiation 'c10::optional' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=at::StepCallbacks ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=c10::DimVector ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=c10::DimVector ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=c10::DimVector ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=c10::DimVector ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/TensorIterator.h(918): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=c10::DimVector ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=c10::impl::AnnotatedSchema ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=c10::impl::AnnotatedSchema ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=c10::impl::AnnotatedSchema ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=c10::impl::AnnotatedSchema ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/dispatch/OperatorEntry.h(223): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=c10::impl::AnnotatedSchema ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=c10::impl::OperatorEntry::CppSignatureWithDebug ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=c10::impl::OperatorEntry::CppSignatureWithDebug ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=c10::impl::OperatorEntry::CppSignatureWithDebug ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=c10::impl::OperatorEntry::CppSignatureWithDebug ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/dispatch/OperatorEntry.h(286): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=c10::impl::OperatorEntry::CppSignatureWithDebug ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=std::tuple<std::string,size_t,size_t> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=std::tuple<std::string,size_t,size_t> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=std::tuple<std::string,size_t,size_t> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=std::tuple<std::string,size_t,size_t> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/frontend/source_range.h(357): note: see reference to class template instantiation 'c10::optional<std::tuple<std::string,size_t,size_t>>' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=std::tuple<std::string,size_t,size_t> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=torch::jit::SourceRange ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=torch::jit::SourceRange ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=torch::jit::SourceRange ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=torch::jit::SourceRange ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/frontend/source_range.h(380): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=torch::jit::SourceRange ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=torch::jit::InlinedCallStackPtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=torch::jit::InlinedCallStackPtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=torch::jit::InlinedCallStackPtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=torch::jit::InlinedCallStackPtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/ir/scope.h(127): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=torch::jit::InlinedCallStackPtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=torch::jit::ModuleInstanceInfo ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=torch::jit::ModuleInstanceInfo ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=torch::jit::ModuleInstanceInfo ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=torch::jit::ModuleInstanceInfo ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/ir/scope.h(140): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=torch::jit::ModuleInstanceInfo ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=torch::jit::ScopePtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=torch::jit::ScopePtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=torch::jit::ScopePtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=torch::jit::ScopePtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/ir/constants.h(29): note: see reference to class template instantiation 'c10::optional' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=torch::jit::ScopePtr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=at::ThreadLocalState ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=at::ThreadLocalState ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=at::ThreadLocalState ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=at::ThreadLocalState ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/runtime/interpreter.h(150): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=at::ThreadLocalState ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=std::shared_ptr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=std::shared_ptr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=std::shared_ptr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=std::shared_ptr ] C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.37.32822\include\array(577): note: see reference to class template instantiation 'c10::optional<std::shared_ptr>' being compiled
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/api/function_impl.h(164): note: see reference to class template instantiation 'std::array<c10::optional<std::shared_ptr>,4>' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=std::shared_ptr ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=torch::jit::GraphExecutor ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=torch::jit::GraphExecutor ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=torch::jit::GraphExecutor ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=torch::jit::GraphExecutor ] C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.37.32822\include\array(577): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/api/function_impl.h(178): note: see reference to class template instantiation 'std::array<c10::optional,4>' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=torch::jit::GraphExecutor ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=torch::jit::Method ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=torch::jit::Method ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=torch::jit::Method ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=torch::jit::Method ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/api/object.h(48): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=torch::jit::Method ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=std::vector<std::string,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=std::vector<std::string,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=std::vector<std::string,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=std::vector<std::string,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/jit/api/module.h(329): note: see reference to class template instantiation 'c10::optional<std::vector<std::string,std::allocator>>' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=std::vector<std::string,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=std::function<void (const torch::autograd::profiler::thread_event_lists &)> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=std::function<void (const torch::autograd::profiler::thread_event_lists &)> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=std::function<void (const torch::autograd::profiler::thread_event_lists &)> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=std::function<void (const torch::autograd::profiler::thread_event_lists &)> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch/csrc/autograd/profiler_legacy.h(411): note: see reference to class template instantiation 'c10::optional<std::function<void (const torch::autograd::profiler::thread_event_lists &)>>' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=std::function<void (const torch::autograd::profiler::thread_event_lists &)> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/nn/options/loss.h(453): note: see reference to class template instantiation 'c10::optional' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=torch::nn::TripletMarginWithDistanceLossOptions::distance_function_t ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=std::vector<double,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=std::vector<double,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=std::vector<double,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=std::vector<double,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/nn/options/upsampling.h(27): note: see reference to class template instantiation 'c10::optional<std::vector<T,std::allocator>>' being compiled with [ T=double ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=std::vector<double,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=std::tuple<at::Tensor,at::Tensor> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=std::tuple<at::Tensor,at::Tensor> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=std::tuple<at::Tensor,at::Tensor> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=std::tuple<at::Tensor,at::Tensor> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/nn/modules/rnn.h(162): note: see reference to class template instantiation 'c10::optional<std::tuple<at::Tensor,at::Tensor>>' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=std::tuple<at::Tensor,at::Tensor> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted with [ T=std::vector<at::Tensor,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled with [ T=std::vector<at::Tensor,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to class template instantiation 'c10::trivially_copyable_optimization_optional_base' being compiled with [ T=std::vector<at::Tensor,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(549): note: see reference to alias template instantiation 'c10::OptionalBase' being compiled with [ T=std::vector<at::Tensor,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/optim/lbfgs.h(49): note: see reference to class template instantiation 'c10::optional<std::vector<at::Tensor,std::allocator>>' being compiled C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/Optional.h(446): warning C4624: 'c10::trivially_copyable_optimization_optional_base': destructor was implicitly defined as deleted
with [ T=std::vector<at::Tensor,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(171): warning C4624: 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>>': destructor was implicitly defined as deleted
with [ K=std::string, V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(779): note: see reference to class template instantiation 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>>' being compiled with [ K=std::string, V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(775): note: while compiling class template member function 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>> *ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::empty_default_table(void)' with [ K=std::string, V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete>, H=std::hash, E=std::equal_to, A=std::allocator<std::pair<std::string,std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete>>> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(768): note: see the first reference to 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::empty_default_table' in 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::sherwood_v3_table' with [ K=std::string, V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete>, H=std::hash, E=std::equal_to, 
A=std::allocator<std::pair<std::string,std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete>>> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(1929): note: see reference to class template instantiation 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>' being compiled with [ K=std::string, V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete>, H=std::hash, E=std::equal_to, A=std::allocator<std::pair<std::string,std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete>>> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\torch\csrc\api\include\torch/optim/optimizer.h(174): note: see reference to class template instantiation 'ska::flat_hash_map<std::string,std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete>,std::hash,std::equal_to,std::allocator<std::pair<K,V>>>' being compiled with [ K=std::string, V=std::unique_ptr<torch::optim::OptimizerParamState,std::default_delete> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(171): warning C4624: 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>>': destructor was implicitly defined as deleted
with [ K=c10::DispatchKey, V=std::list<c10::impl::AnnotatedKernel,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(711): note: see reference to class template instantiation 'ska::detailv3::sherwood_v3_entry<std::pair<K,V>>' being compiled with [ K=c10::DispatchKey, V=std::list<c10::impl::AnnotatedKernel,std::allocator> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(708): note: while compiling class template member function 'void ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::clear(void)' with [ K=c10::DispatchKey, V=std::list<c10::impl::AnnotatedKernel,std::allocator>, H=std::hash, E=std::equal_to, A=std::allocator<std::pair<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocator>>> ] C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(431): note: see the first reference to 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::clear' in 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::~sherwood_v3_table' with [ K=c10::DispatchKey, V=std::list<c10::impl::AnnotatedKernel,std::allocator>, H=std::hash, E=std::equal_to, A=std::allocator<std::pair<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocator>>> ] 
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(2035): note: see the first reference to 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>::~sherwood_v3_table' in 'ska::flat_hash_map<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocator>,std::hash,std::equal_to,std::allocator<std::pair<K,V>>>::~flat_hash_map'
with [ K=c10::DispatchKey, V=std::list<c10::impl::AnnotatedKernel,std::allocator>, H=std::hash, E=std::equal_to, A=std::allocator<std::pair<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocator>>> ] and [ K=c10::DispatchKey, V=std::list<c10::impl::AnnotatedKernel,std::allocator> ]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\c10/util/flat_hash_map.h(1929): note: see reference to class template instantiation 'ska::detailv3::sherwood_v3_table<std::pair<K,V>,K,H,ska::detailv3::KeyOrValueHasher<K,std::pair<K,V>,H>,E,ska::detailv3::KeyOrValueEquality<K,std::pair<K,V>,E>,A,std::allocator<ska::detailv3::sherwood_v3_entry<std::pair<K,V>>>>' being compiled
with [ K=c10::DispatchKey, V=std::list<c10::impl::AnnotatedKernel,std::allocator>, H=std::hash, E=std::equal_to, A=std::allocator<std::pair<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocator>>> ]
C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\include\ATen/core/dispatch/OperatorEntry.h(270): note: see reference to class template instantiation 'ska::flat_hash_map<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocator>,std::hash,std::equal_to,std::allocator<std::pair<K,V>>>' being compiled
with [ K=c10::DispatchKey, V=std::list<c10::impl::AnnotatedKernel,std::allocator> ]
ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "C:\Users\xande\causal-conv1d\setup.py", line 207, in run
      urllib.request.urlretrieve(wheel_url, wheel_filename)
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 241, in urlretrieve
      with contextlib.closing(urlopen(url, data)) as fp:
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 216, in urlopen
      return opener.open(url, data, timeout)
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 525, in open
      response = meth(req, response)
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 634, in http_response
      response = self.parent.error(
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 563, in error
      return self._call_chain(*args)
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 496, in _call_chain
      result = func(*args)
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 643, in http_error_default
      raise HTTPError(req.full_url, code, msg, hdrs, fp)
  urllib.error.HTTPError: HTTP Error 404: Not Found

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 1893, in _run_ninja_build
      subprocess.run(
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 524, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "C:\Users\xande\causal-conv1d\setup.py", line 227, in <module>
      setup(
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\__init__.py", line 87, in setup 
      return distutils.core.setup(**attrs)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\core.py", line 185, in setup
      return run_commands(dist)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
      dist.run_commands()
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\dist.py", line 968, in run_commands
      self.run_command(cmd)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\dist.py", line 1217, in run_command
      super().run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
      cmd_obj.run()
    File "C:\Users\xande\causal-conv1d\setup.py", line 224, in run
      super().run()
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\wheel\bdist_wheel.py", line 321, in run    
      self.run_command("build")
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\cmd.py", line 319, in run_command
      self.distribution.run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\dist.py", line 1217, in run_command
      super().run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
      cmd_obj.run()
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\command\build.py", line 132, in run
      self.run_command(cmd_name)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\cmd.py", line 319, in run_command
      self.distribution.run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\dist.py", line 1217, in run_command
      super().run_command(command)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\dist.py", line 987, in run_command
      cmd_obj.run()
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\command\build_ext.py", line 84, in run
      _build_ext.run(self)
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\Cython\Distutils\old_build_ext.py", line 186, in run
      _build_ext.build_ext.run(self)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\command\build_ext.py", line 346, in run
      self.build_extensions()
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 843, in build_extensions
      build_ext.build_extensions(self)
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\Cython\Distutils\old_build_ext.py", line 195, in build_extensions
      _build_ext.build_ext.build_extensions(self)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\command\build_ext.py", line 466, in build_extensions
      self._build_extensions_serial()
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\command\build_ext.py", line 492, in _build_extensions_serial
      self.build_extension(ext)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\command\build_ext.py", line 246, in build_extension
      _build_ext.build_extension(self, ext)
    File "C:\Users\xande\AppData\Roaming\Python\Python310\site-packages\setuptools\_distutils\command\build_ext.py", line 547, in build_extension
      objects = self.compiler.compile(
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 815, in win_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 1574, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "C:\Users\xande\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\cpp_extension.py", line 1909, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for causal-conv1d
Running setup.py clean for causal-conv1d
Failed to build causal-conv1d
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects

sonsus commented 10 months ago

I'm running into a similar issue; my output is below. I hope to find a useful hint in this thread.

pip install mamba-ssm
Collecting mamba-ssm
  Using cached mamba_ssm-1.1.1.tar.gz (34 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [25 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-g51qu3z4/mamba-ssm_4988cde7cf824517a06f59b486aea78c/setup.py", line 101, in <module>
          _, bare_metal_version = get_cuda_bare_metal_version(CUDA_HOME)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-install-g51qu3z4/mamba-ssm_4988cde7cf824517a06f59b486aea78c/setup.py", line 63, in get_cuda_bare_metal_version
          raw_output = subprocess.check_output(
                       ^^^^^^^^^^^^^^^^^^^^^^^^
        File "/root/miniconda3/envs/openai/lib/python3.11/subprocess.py", line 466, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/root/miniconda3/envs/openai/lib/python3.11/subprocess.py", line 548, in run
          with Popen(*popenargs, **kwargs) as process:
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/root/miniconda3/envs/openai/lib/python3.11/subprocess.py", line 1026, in __init__
          self._execute_child(args, executable, preexec_fn, close_fds,
        File "/root/miniconda3/envs/openai/lib/python3.11/subprocess.py", line 1950, in _execute_child
          raise child_exception_type(errno_num, err_msg, err_filename)
      FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/cuda/bin/nvcc'

      torch.__version__  = 2.0.1+cu117

      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Driver/CUDA versions from nvidia-smi NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0
torch==2.0.1, A100 machine.

ankhzet commented 10 months ago

@sonsus try manually setting the CUDA_HOME environment variable to your local CUDA installation folder (for you it currently points to /usr/local/cuda, where nvcc was not found).
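
To find a value worth assigning to CUDA_HOME, one option is to look for a directory that actually contains bin/nvcc. The following is only a sketch under that assumption; the helper name find_cuda_home is hypothetical and is not part of mamba, causal-conv1d, or PyTorch.

```python
import os
import shutil
from pathlib import Path
from typing import Optional

NVCC_NAMES = ("nvcc", "nvcc.exe")


def find_cuda_home(env=None, which=shutil.which) -> Optional[str]:
    """Return a directory that contains bin/nvcc, or None if none is found."""
    env = os.environ if env is None else env
    candidates = []
    # 1. An explicitly configured CUDA_HOME is tried first.
    if env.get("CUDA_HOME"):
        candidates.append(Path(env["CUDA_HOME"]))
    # 2. Otherwise fall back to wherever nvcc sits on PATH: <root>/bin/nvcc.
    nvcc = which("nvcc")
    if nvcc:
        candidates.append(Path(nvcc).resolve().parent.parent)
    for root in candidates:
        if any((root / "bin" / name).exists() for name in NVCC_NAMES):
            return str(root)
    return None


if __name__ == "__main__":
    home = find_cuda_home()
    if home is None:
        print("nvcc not found; install the CUDA toolkit or set CUDA_HOME")
    else:
        print(f"export CUDA_HOME={home}")
```

If this prints nothing usable, the CUDA toolkit (not just the driver reported by nvidia-smi) is probably missing from the machine.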

duncanriach commented 9 months ago

Building on @hrbigelow's instructions above, in order to get the mamba-ssm package pip installed, I did the following inside an instance of container image nvcr.io/nvidia/pytorch:23.12-py3. I also confirmed that it worked using docker.io/pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel.

$ git clone https://github.com/Dao-AILab/causal-conv1d.git
$ cd causal-conv1d
$ git checkout v1.1.1 # current latest version tag
$ CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .
$ cd ..
$ git clone https://github.com/state-spaces/mamba.git
$ cd mamba
$ git checkout v1.1.1 # current latest version tag
$ MAMBA_FORCE_BUILD=TRUE pip install .

[ Setting the *_FORCE_BUILD=TRUE environment variables, as shown above, may avoid the need to carry out the following purging process. If you're accessing the cloned directories on a disk outside your container, you may need to clean/purge those directories before building inside different container versions. Otherwise, stale code and settings will cause the build to not work properly in your new container. This tends to show up as dynamic linking errors when importing the mamba_ssm package into Python. To clean/purge, run rm -rf *.egg-info build in both the causal-conv1d clone directory and the mamba clone directory. ]

Checking the installation:

$ pip show causal-conv1d
Name: causal-conv1d
Version: 1.1.1
Summary: Causal depthwise conv1d in CUDA, with a PyTorch interface
Home-page: https://github.com/Dao-AILab/causal-conv1d
Author: Tri Dao
Author-email: tri@tridao.me
License: UNKNOWN
Location: /usr/local/lib/python3.10/dist-packages
Requires: packaging, torch, buildtools, ninja
Required-by: mamba-ssm

$ pip show mamba-ssm
Name: mamba-ssm
Version: 1.1.1
Summary: Mamba state-space model
Home-page: https://github.com/state-spaces/mamba
Author: Tri Dao, Albert Gu
Author-email: tri@tridao.me, agu@cs.cmu.edu
License: UNKNOWN
Location: /usr/local/lib/python3.10/dist-packages
Requires: einops, causal-conv1d, transformers, torch, packaging, ninja, triton
Required-by:

$ python
>>> import torch
>>> from mamba_ssm import Mamba

# no errors
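
The pip show checks above can also be scripted. This is a minimal sketch using the standard-library importlib.metadata module; the helper name installed_versions is hypothetical, and it only reports what is installed without importing the CUDA extensions.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions(["causal-conv1d", "mamba-ssm"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

A version being reported does not prove the compiled extension loads, so the from mamba_ssm import Mamba check is still worth running afterwards.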
invokeG commented 9 months ago

@duncanriach Thank you! This solution is effective.

shigen-StoneRoot commented 9 months ago

It works. I don't know why the setting CAUSAL_CONV1D_FORCE_BUILD=TRUE is important, but in fact I had to run both commands:

$ CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .
$ MAMBA_FORCE_BUILD=TRUE pip install .

Without *_FORCE_BUILD=TRUE, the issue still occurs.

duncanriach commented 9 months ago

@shigen-StoneRoot, *_FORCE_BUILD=TRUE forces a fresh build locally, rather than trying to use the local cached results. I believe this makes my instructions about deleting the *.egg-info and build directories redundant. Thanks for pointing this out; I'll update my comment above.

Here are the places in the relevant setup.py files where *_FORCE_BUILD=TRUE is defined and documented: causal_conv1d, mamba.
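
For readers who cannot follow the links, here is a minimal sketch of how a setup.py could read such a flag. It assumes the variable is compared against the literal string TRUE, as in the commands in this thread; the helper name force_build_requested is hypothetical, and the real setup.py files may differ in detail. The build log at the top of this thread shows setup.py first guessing a precompiled wheel URL, and a flag like this would presumably skip that step.

```python
import os


def force_build_requested(var_name, env=None):
    """Return True only when the variable is set to the exact string "TRUE"."""
    env = os.environ if env is None else env
    return env.get(var_name, "FALSE") == "TRUE"


if __name__ == "__main__":
    for var in ("CAUSAL_CONV1D_FORCE_BUILD", "MAMBA_FORCE_BUILD"):
        state = "build from source" if force_build_requested(var) else "default behavior"
        print(f"{var}: {state}")
```

Under this assumption, values such as true or 1 would not enable the flag, which is one reason to copy the variable exactly as written.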

HelloWorldLTY commented 9 months ago

Hi, my suggestion for addressing this error is to install mamba first and then reinstall PyTorch from the default link; after that, everything works.

hannn0403 commented 8 months ago

I found a list of packages that need to be installed prior to installing causal-conv1d on the following page: https://github.com/havietisov/causal-conv1d/commit/84c68a2901136f3dedc467181978b065f7868234. After installing these packages ("torch", "packaging", "buildtools", "ninja") via pip, I executed the command pip install "causal-conv1d>=1.1.0" (quoted, so the shell does not treat >= as an output redirect) and confirmed that it installed successfully. If you're still having trouble with the installation, this method may be worth trying.
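
Before starting the build, it may help to confirm that those prerequisites are importable. This is a rough sketch using the standard-library importlib.util.find_spec; the helper name missing_modules is hypothetical, and the import names torch, packaging, and ninja are assumed to match the pip package names (buildtools is left out because its import name is not confirmed here).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["torch", "packaging", "ninja"])
    if missing:
        print("install these first: " + ", ".join(missing))
    else:
        print("build prerequisites found")
```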

Wave2689 commented 8 months ago

When I use the command CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install . I get the error 'CAUSAL_CONV1D_FORCE_BUILD' is not recognized as an internal or external command, operable program or batch file. How can I fix this error?

duncanriach commented 8 months ago

Are you not using bash as your shell? You might need to translate the commands into a format that is compatible with the shell that you're using.
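
One shell-independent option is to set the variable from Python and launch pip as a subprocess, so the same script behaves the same way under bash, cmd, or PowerShell. This is only a sketch; the helper names build_env and pip_install_here are hypothetical.

```python
import os
import subprocess
import sys


def build_env(overrides, base=None):
    """Copy the base environment (default: os.environ) and apply overrides."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def pip_install_here(overrides):
    """Run `pip install .` in the current directory with extra env variables."""
    cmd = [sys.executable, "-m", "pip", "install", "."]
    return subprocess.run(cmd, env=build_env(overrides), check=True)
```

Called from inside the clone directory, pip_install_here({"CAUSAL_CONV1D_FORCE_BUILD": "TRUE"}) would start the build with the flag set, whichever shell launched Python.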

ankhzet commented 8 months ago

@Wave2689

If you run the command in the Windows command prompt, the inline VAR=value prefix is not supported. Set the environment variable with the set keyword first, then run the command on its own line:

drive:path\to\project> set CAUSAL_CONV1D_FORCE_BUILD=TRUE
drive:path\to\project> pip install .

In a bash-like shell, the equivalent is to use the export keyword and then execute the commands one after another:

$ export CAUSAL_CONV1D_FORCE_BUILD=TRUE
$ pip install .

Wave2689 commented 8 months ago

Thanks a lot! I know it. @ankhzet @duncanriach

invokeG commented 8 months ago

I created a Docker container to address the installation errors. https://hub.docker.com/repository/docker/kom4cr0/cuda11.7-pytorch1.13-mamba1.1.1/general

steve-zeyu-zhang commented 8 months ago

@duncanriach Thank you! This solution is effective.

Sometimes this solution does not work at all: even with CAUSAL_CONV1D_FORCE_BUILD=TRUE, the error shown below still appears.

That's because, in this case, the failure has nothing to do with CAUSAL_CONV1D_FORCE_BUILD=TRUE.

If that happens, consider reloading your conda environment (run conda deactivate, then activate it again) and retrying with a plain pip install -e . That should resolve the problem.

          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 227 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 151 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 163 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 122 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 86 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 190 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 128 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 228 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 152 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 168 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 119 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 104 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 140 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 120 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 88 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 184 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 126 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 227 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 89 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 151 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 163 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 122 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 86 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 190 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 128 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 228 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 152 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 168 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 119 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 104 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 140 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 120 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 88 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 185 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 126 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 226 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 89 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 151 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 163 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 121 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 190 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 128 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 228 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 89 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 152 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 168 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 119 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 66 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 104 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 140 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 123 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 86 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 45 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 55 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 38 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 93 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 39 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 79 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 45 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 55 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 38 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 39 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 62 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 93 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 39 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      /gpfs/users/a1796450/default_studio/motion-latent-diffusion/causal-conv1d/csrc/causal_conv1d_bwd.cu(84): warning #2912-D: constexpr if statements are a C++17 feature

      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
          subprocess.run(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/subprocess.py", line 526, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/gpfs/users/a1796450/default_studio/motion-latent-diffusion/causal-conv1d/setup.py", line 223, in <module>
          setup(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/__init__.py", line 103, in setup
          return distutils.core.setup(**attrs)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/gpfs/users/a1796450/default_studio/motion-latent-diffusion/causal-conv1d/setup.py", line 198, in run
          return super().run()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 364, in run
          self.run_command("build")
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 131, in run
          self.run_command(cmd_name)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 88, in run
          _build_ext.run(self)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
          self.build_extensions()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
          build_ext.build_extensions(self)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
          self._build_extensions_serial()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
          self.build_extension(ext)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 249, in build_extension
          _build_ext.build_extension(self, ext)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
          objects = self.compiler.compile(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 658, in unix_wrap_ninja_compile
          _write_ninja_file_and_compile_objects(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1573, in _write_ninja_file_and_compile_objects
          _run_ninja_build(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
          raise RuntimeError(message) from e
      RuntimeError: Error compiling objects for extension
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for causal-conv1d
  Running setup.py clean for causal-conv1d
Failed to build causal-conv1d
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects
CYYJL commented 8 months ago

@duncanriach Hi, I used your approach to install causal-conv1d, but I ran into a new issue: `Building wheels for collected packages: causal-conv1d Building wheel for causal-conv1d (setup.py) ... error error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully. │ exit code: 1 ╰─> [60 lines of output]

  torch.__version__  = 1.13.1+cu117

  running bdist_wheel
  /home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
    warnings.warn(msg.format('we could not find ninja.'))
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.8
  creating build/lib.linux-x86_64-3.8/causal_conv1d
  copying causal_conv1d/causal_conv1d_interface.py -> build/lib.linux-x86_64-3.8/causal_conv1d
  copying causal_conv1d/__init__.py -> build/lib.linux-x86_64-3.8/causal_conv1d
  running build_ext
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/home/yjl/causal-conv1d/setup.py", line 227, in <module>
      setup(
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
      return distutils.core.setup(**attrs)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/dist.py", line 966, in run_commands
      self.run_command(cmd)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/yjl/causal-conv1d/setup.py", line 202, in run
      return super().run()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 364, in run
      self.run_command("build")
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/command/build.py", line 135, in run
      self.run_command(cmd_name)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
      _build_ext.run(self)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/distutils/command/build_ext.py", line 340, in run
      self.build_extensions()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 485, in build_extensions
      compiler_name, compiler_version = self._check_abi()
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 869, in _check_abi
      _, version = get_compiler_abi_compatibility_and_version(compiler)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 336, in get_compiler_abi_compatibility_and_version
      if not check_compiler_ok_for_platform(compiler):
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 290, in check_compiler_ok_for_platform
      which = subprocess.check_output(['which', compiler], stderr=subprocess.STDOUT)
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/subprocess.py", line 415, in check_output
      return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
    File "/home/yjl/anaconda3/envs/VMUnet/lib/python3.8/subprocess.py", line 516, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['which', 'g++']' returned non-zero exit status 1.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip. ERROR: Failed building wheel for causal-conv1d Running setup.py clean for causal-conv1d Failed to build causal-conv1d ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects`
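
A note on the traceback above: it ends with `which g++` returning a non-zero exit status, and an earlier warning says ninja could not be found, so the build appears to stop before compiling anything because no host C++ compiler is on PATH. Below is a minimal preflight sketch, assuming the tools of interest are g++, nvcc and ninja (ninja looks optional here, since the warning mentions a fallback to the slower distutils backend). The helper name `find_build_tools` is hypothetical and is not part of causal-conv1d or PyTorch.

```python
import shutil


def find_build_tools(tools=("g++", "nvcc", "ninja")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    The default tool list is an assumption based on the log above.
    """
    return {name: shutil.which(name) for name in tools}


if __name__ == "__main__":
    for name, path in find_build_tools().items():
        print(f"{name:6s} -> {path or 'NOT FOUND'}")
```

If g++ shows as NOT FOUND, installing a compiler toolchain into the environment and retrying the build would be the first thing to try.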

paaKways commented 8 months ago

In my experience (Windows 11 with an Nvidia GPU), I didn't have CUDA installed, so I had to install it from here and also get the CUDA build of PyTorch. That basically fixed it.

paaKways commented 8 months ago

You'd need this as well if you're on Windows https://github.com/openai/triton/issues/1057#issuecomment-1624240321

CYYJL commented 8 months ago

In my experience (Windows 11 with Nvidia GPU) I didn't have CUDA installed so had to install that from here as well as getting Pytorch CUDA versions and this basically fixed it

OK, thank you. I created a new environment on Ubuntu and installed the packages as follows:

conda install cudatoolkit==11.8 -c nvidia
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm

After that, causal-conv1d and mamba-ssm both installed successfully.
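
The recipe above pins the toolkit and the PyTorch build to the same CUDA release (11.8), whereas the earlier logs in this thread warn about a minor version mismatch between the detected CUDA toolkit and the one PyTorch was compiled with. Here is a minimal sketch of that comparison, assuming you paste in the text printed by `nvcc --version` and the value of `torch.version.cuda`. The function names are hypothetical, and the sample string is illustrative rather than taken from anyone's machine.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of nvcc --version output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def compare_cuda(nvcc_output, torch_cuda):
    """Return 'match', 'minor-mismatch', 'major-mismatch', or 'unknown'."""
    toolkit = parse_nvcc_release(nvcc_output)
    torch_match = re.match(r"(\d+)\.(\d+)", torch_cuda or "")
    if toolkit is None or torch_match is None:
        return "unknown"
    torch_ver = (int(torch_match.group(1)), int(torch_match.group(2)))
    if toolkit == torch_ver:
        return "match"
    if toolkit[0] == torch_ver[0]:
        return "minor-mismatch"
    return "major-mismatch"


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 11.8, V11.8.89"  # illustrative text
    print(compare_cuda(sample, "11.8"))  # match
    print(compare_cuda(sample, "11.7"))  # minor-mismatch
```

A minor mismatch is what the PyTorch warning describes as most likely harmless, so this check is mainly useful for spotting a major mismatch or a missing nvcc.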

CYYJL commented 8 months ago

You'd need this as well if you're on Windows openai/triton#1057 (comment)

Triton seems to need to be installed on Ubuntu, not Windows. I tried to install it on Windows, but it failed.

ajie220209 commented 8 months ago

CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .

Running CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install . fails with: ERROR: Could not build wheels for causal_conv1d, which is required to install pyproject.toml-based projects. What can I do to solve this problem?

ajie220209 commented 8 months ago

In my experience (Windows 11 with an Nvidia GPU), I didn't have CUDA installed, so I had to install it from here and also get the CUDA build of PyTorch. That basically fixed it.

OK, thank you. I created a new environment on Ubuntu and installed the packages as follows, after which causal-conv1d and mamba-ssm both installed successfully:

conda install cudatoolkit==11.8 -c nvidia
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm

I followed your steps and still get the same error when installing causal-conv1d: ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects

CYYJL commented 8 months ago

Could you show the details of the error?

ajie220209 commented 8 months ago

Building wheels for collected packages: mamba-ssm Building wheel for mamba-ssm (setup.py) ... error error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully. │ exit code: 1 ╰─> [132 lines of output]

  torch.__version__  = 2.1.1+cu118

  running bdist_wheel
  Guessing wheel URL:  https://github.com/state-spaces/mamba/releases/download/v1.2.0.post1/mamba_ssm-1.2.0.post1+cu118torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl
  Precompiled wheel not found. Building from source...
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-cpython-310
  creating build\lib.win-amd64-cpython-310\mamba_ssm
  copying mamba_ssm\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm
  creating build\lib.win-amd64-cpython-310\mamba_ssm\models
  copying mamba_ssm\models\config_mamba.py -> build\lib.win-amd64-cpython-310\mamba_ssm\models
  copying mamba_ssm\models\mixer_seq_simple.py -> build\lib.win-amd64-cpython-310\mamba_ssm\models
  copying mamba_ssm\models\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\models
  creating build\lib.win-amd64-cpython-310\mamba_ssm\modules
  copying mamba_ssm\modules\mamba_simple.py -> build\lib.win-amd64-cpython-310\mamba_ssm\modules
  copying mamba_ssm\modules\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\modules
  creating build\lib.win-amd64-cpython-310\mamba_ssm\ops
  copying mamba_ssm\ops\selective_scan_interface.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops
  copying mamba_ssm\ops\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops
  creating build\lib.win-amd64-cpython-310\mamba_ssm\utils
  copying mamba_ssm\utils\generation.py -> build\lib.win-amd64-cpython-310\mamba_ssm\utils
  copying mamba_ssm\utils\hf.py -> build\lib.win-amd64-cpython-310\mamba_ssm\utils
  copying mamba_ssm\utils\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\utils
  creating build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  copying mamba_ssm\ops\triton\layernorm.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  copying mamba_ssm\ops\triton\selective_state_update.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  copying mamba_ssm\ops\triton\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  running build_ext
  D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py:383: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
    warnings.warn(f'Error checking compiler version for {compiler}: {error}')
  D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py:414: UserWarning: The detected CUDA version (11.6) has a minor version mismatch with the version that was used to compile PyTorch (11.8). Most likely this shouldn't be a problem.
    warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
  building 'selective_scan_cuda' extension
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release\csrc
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release\csrc\selective_scan
  Emitting ninja build file C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release\build.ninja...
  Compiling objects...
  Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
  ninja: error: 'C:/Users/cwj/AppData/Local/Temp/pip-install-dzppbsyx/mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a/csrc/selective_scan/selective_scan.cpp', needed by 'C:/Users/cwj/AppData/Local/Temp/pip-install-dzppbsyx/mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a/build/temp.win-amd64-cpython-310/Release/csrc/selective_scan/selective_scan.obj', missing and no known rule to make it
  Traceback (most recent call last):
    File "C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\setup.py", line 214, in run
      urllib.request.urlretrieve(wheel_url, wheel_filename)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 241, in urlretrieve
      with contextlib.closing(urlopen(url, data)) as fp:
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 216, in urlopen
      return opener.open(url, data, timeout)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 525, in open
      response = meth(req, response)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 634, in http_response
      response = self.parent.error(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 563, in error
      return self._call_chain(*args)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 496, in _call_chain
      result = func(*args)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 643, in http_error_default
      raise HTTPError(req.full_url, code, msg, hdrs, fp)
  urllib.error.HTTPError: HTTP Error 404: Not Found

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 2100, in _run_ninja_build
      subprocess.run(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\subprocess.py", line 524, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\setup.py", line 234, in <module>
      setup(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\__init__.py", line 103, in setup
      return distutils.core.setup(**attrs)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
      return run_commands(dist)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
      dist.run_commands()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\dist.py", line 989, in run_command
      super().run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\setup.py", line 231, in run
      super().run()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\wheel\bdist_wheel.py", line 364, in run
      self.run_command("build")
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\dist.py", line 989, in run_command
      super().run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build.py", line 131, in run
      self.run_command(cmd_name)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\dist.py", line 989, in run_command
      super().run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\command\build_ext.py", line 88, in run
      _build_ext.run(self)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 345, in run
      self.build_extensions()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 873, in build_extensions
      build_ext.build_extensions(self)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 467, in build_extensions
      self._build_extensions_serial()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 493, in _build_extensions_serial
      self.build_extension(ext)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\command\build_ext.py", line 249, in build_extension
      _build_ext.build_extension(self, ext)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 548, in build_extension
      objects = self.compiler.compile(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 845, in win_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 2116, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip. ERROR: Failed building wheel for mamba-ssm Running setup.py clean for mamba-ssm Failed to build mamba-ssm ERROR: Could not build wheels for mamba-ssm, which is required to install pyproject.toml-based projects
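
A note on the log above: the guessed wheel URL returned HTTP 404, so setup.py fell back to building from source, which then failed (the warning about `cl` suggests the MSVC compiler may not be on PATH either). The wheel filename encodes the CUDA, PyTorch, C++ ABI, Python and platform tags that the installer was looking for. Below is a minimal sketch that splits those tags out, assuming the filename layout visible in the "Guessing wheel URL" line. The function name `parse_wheel_tags` and the regular expression are illustrative and are not taken from the mamba-ssm source.

```python
import re

# Pattern inferred from the filename in the log; an assumption, not an official spec.
WHEEL_RE = re.compile(
    r"(?P<name>[A-Za-z0-9_]+)-(?P<version>[^+]+)"
    r"\+cu(?P<cuda>\d+)torch(?P<torch>[\d.]+)cxx11abi(?P<cxx11abi>TRUE|FALSE)"
    r"-(?P<python>cp\d+)-(?P<abi>cp\d+)-(?P<platform>[A-Za-z0-9_]+)\.whl$"
)


def parse_wheel_tags(url_or_filename):
    """Return a dict of build tags from a wheel URL or filename, or None if it does not match."""
    filename = url_or_filename.rsplit("/", 1)[-1]
    match = WHEEL_RE.match(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    url = (
        "https://github.com/state-spaces/mamba/releases/download/v1.2.0.post1/"
        "mamba_ssm-1.2.0.post1+cu118torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl"
    )
    for key, value in parse_wheel_tags(url).items():
        print(f"{key:9s} {value}")
```

Comparing these tags with the asset names on the project's releases page shows whether a prebuilt wheel exists for a given combination. When none does, a working compiler and CUDA toolkit are needed for the source build.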

CYYJL commented 7 months ago

Building wheels for collected packages: mamba-ssm Building wheel for mamba-ssm (setup.py) ... error error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully. │ exit code: 1 ╰─> [132 lines of output]

  torch.__version__  = 2.1.1+cu118

  running bdist_wheel
  Guessing wheel URL:  https://github.com/state-spaces/mamba/releases/download/v1.2.0.post1/mamba_ssm-1.2.0.post1+cu118torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl
  Precompiled wheel not found. Building from source...
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-cpython-310
  creating build\lib.win-amd64-cpython-310\mamba_ssm
  copying mamba_ssm\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm
  creating build\lib.win-amd64-cpython-310\mamba_ssm\models
  copying mamba_ssm\models\config_mamba.py -> build\lib.win-amd64-cpython-310\mamba_ssm\models
  copying mamba_ssm\models\mixer_seq_simple.py -> build\lib.win-amd64-cpython-310\mamba_ssm\models
  copying mamba_ssm\models\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\models
  creating build\lib.win-amd64-cpython-310\mamba_ssm\modules
  copying mamba_ssm\modules\mamba_simple.py -> build\lib.win-amd64-cpython-310\mamba_ssm\modules
  copying mamba_ssm\modules\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\modules
  creating build\lib.win-amd64-cpython-310\mamba_ssm\ops
  copying mamba_ssm\ops\selective_scan_interface.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops
  copying mamba_ssm\ops\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops
  creating build\lib.win-amd64-cpython-310\mamba_ssm\utils
  copying mamba_ssm\utils\generation.py -> build\lib.win-amd64-cpython-310\mamba_ssm\utils
  copying mamba_ssm\utils\hf.py -> build\lib.win-amd64-cpython-310\mamba_ssm\utils
  copying mamba_ssm\utils\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\utils
  creating build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  copying mamba_ssm\ops\triton\layernorm.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  copying mamba_ssm\ops\triton\selective_state_update.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  copying mamba_ssm\ops\triton\__init__.py -> build\lib.win-amd64-cpython-310\mamba_ssm\ops\triton
  running build_ext
  D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py:383: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
    warnings.warn(f'Error checking compiler version for {compiler}: {error}')
  D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py:414: UserWarning: The detected CUDA version (11.6) has a minor version mismatch with the version that was used to compile PyTorch (11.8). Most likely this shouldn't be a problem.
    warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
  building 'selective_scan_cuda' extension
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release\csrc
  creating C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release\csrc\selective_scan
  Emitting ninja build file C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\build\temp.win-amd64-cpython-310\Release\build.ninja...
  Compiling objects...
  Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
  ninja: error: 'C:/Users/cwj/AppData/Local/Temp/pip-install-dzppbsyx/mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a/csrc/selective_scan/selective_scan.cpp', needed by 'C:/Users/cwj/AppData/Local/Temp/pip-install-dzppbsyx/mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a/build/temp.win-amd64-cpython-310/Release/csrc/selective_scan/selective_scan.obj', missing and no known rule to make it
  Traceback (most recent call last):
    File "C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\setup.py", line 214, in run
      urllib.request.urlretrieve(wheel_url, wheel_filename)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 241, in urlretrieve
      with contextlib.closing(urlopen(url, data)) as fp:
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 216, in urlopen
      return opener.open(url, data, timeout)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 525, in open
      response = meth(req, response)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 634, in http_response
      response = self.parent.error(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 563, in error
      return self._call_chain(*args)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 496, in _call_chain
      result = func(*args)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\urllib\request.py", line 643, in http_error_default
      raise HTTPError(req.full_url, code, msg, hdrs, fp)
  urllib.error.HTTPError: HTTP Error 404: Not Found

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 2100, in _run_ninja_build
      subprocess.run(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\subprocess.py", line 524, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\setup.py", line 234, in <module>
      setup(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\__init__.py", line 103, in setup
      return distutils.core.setup(**attrs)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
      return run_commands(dist)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
      dist.run_commands()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\dist.py", line 989, in run_command
      super().run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "C:\Users\cwj\AppData\Local\Temp\pip-install-dzppbsyx\mamba-ssm_f3bebb4974fb41c5a722b11f29e8e70a\setup.py", line 231, in run
      super().run()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\wheel\bdist_wheel.py", line 364, in run
      self.run_command("build")
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\dist.py", line 989, in run_command
      super().run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build.py", line 131, in run
      self.run_command(cmd_name)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\dist.py", line 989, in run_command
      super().run_command(command)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\command\build_ext.py", line 88, in run
      _build_ext.run(self)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 345, in run
      self.build_extensions()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 873, in build_extensions
      build_ext.build_extensions(self)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 467, in build_extensions
      self._build_extensions_serial()
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 493, in _build_extensions_serial
      self.build_extension(ext)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\command\build_ext.py", line 249, in build_extension
      _build_ext.build_extension(self, ext)
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 548, in build_extension
      objects = self.compiler.compile(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 845, in win_wrap_ninja_compile
      _write_ninja_file_and_compile_objects(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
      _run_ninja_build(
    File "D:\pythonVE\anaconda3\envs\dltf\lib\site-packages\torch\utils\cpp_extension.py", line 2116, in _run_ninja_build
      raise RuntimeError(message) from e
  RuntimeError: Error compiling objects for extension
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for mamba-ssm
Running setup.py clean for mamba-ssm
Failed to build mamba-ssm
ERROR: Could not build wheels for mamba-ssm, which is required to install pyproject.toml-based projects

Did you install mamba on Windows? It is better to install mamba on a Linux system. You can try setting up an Ubuntu virtual machine on Windows and installing the mamba environment inside it.

azxzxx commented 7 months ago

In my experience (Windows 11 with an Nvidia GPU), I didn't have CUDA installed, so I had to install that from here, as well as getting the CUDA build of PyTorch, and this basically fixed it.

OK, thank you. I tried creating a new env in Ubuntu and installed the packages as follows:

conda install cudatoolkit==11.8 -c nvidia
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm

After that, causal-conv1d and mamba-ssm installed successfully.

This perfectly solved my problem, thanks!

zhixuanli commented 7 months ago

Thank you so much for your commands!

Based on yours, I used the following, which succeeded with no errors reported:

pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install packaging
pip install causal-conv1d==1.1.1
pip install mamba-ssm
Lbaiall commented 6 months ago

@evelynmitchell Just try installing it in WSL. I encountered the same issue as you, but after switching to Linux it worked!

xiakexing-lmc commented 6 months ago

I also have this problem 😭 Have you solved the problem now?

Lbaiall commented 6 months ago

@xiakexing-lmc Yes, I did, but I think the issue only occurs on Windows. Try an Ubuntu system.

BNUWUU commented 6 months ago

Please help me solve this. I ran into the same problem and tried several things, but none of them worked.

"Building wheels for collected packages: causal-conv1d
Building wheel for causal-conv1d (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [8 lines of output]

  torch.__version__  = 2.1.1+cu121

  running bdist_wheel
  Guessing wheel URL:  https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.1.1/causal_conv1d-1.1.1+cu122torch2.1cxx11abiFALSE-cp38-cp38-linux_x86_64.whl
  error: <urlopen error [Errno 110] Connection timed out>
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for causal-conv1d
Running setup.py clean for causal-conv1d
Failed to build causal-conv1d
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects"

Here are the relevant versions in my environment:

python3 -c 'import torch; print(torch.version.cuda)' ---> 12.1
nvcc --version ---> 11.8
nvidia-smi ---> 12.0

Please give me some advice on how to solve this so I can run the code correctly. Thanks a lot!
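The three numbers listed above come from different places: `torch.version.cuda` is the toolkit PyTorch was built against, `nvcc --version` is the toolkit a source build would compile with, and `nvidia-smi` reports the highest CUDA version the driver supports. The log shown here fails on a download timeout, but if the install later falls back to compiling from source, a 12.1 versus 11.8 gap may become the next obstacle. Below is a minimal sketch for classifying such a gap. The function names are hypothetical, and the version strings are assumed to look like `12.1`.

```python
def parse_major_minor(version: str) -> tuple:
    """Turn a version string such as '12.1' or '11.8.0' into (major, minor)."""
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def compare_cuda_versions(torch_cuda: str, nvcc_cuda: str) -> str:
    """Classify the gap between PyTorch's CUDA version and the local nvcc.

    Returns 'match', 'minor mismatch', or 'major mismatch'.
    """
    torch_major, torch_minor = parse_major_minor(torch_cuda)
    nvcc_major, nvcc_minor = parse_major_minor(nvcc_cuda)
    if torch_major != nvcc_major:
        return "major mismatch"
    if torch_minor != nvcc_minor:
        return "minor mismatch"
    return "match"


# Values reported in this thread: torch 12.1 with nvcc 11.8,
# and the original post's torch 11.7 with nvcc 11.8.
print(compare_cuda_versions("12.1", "11.8"))
print(compare_cuda_versions("11.7", "11.8"))
```

A minor mismatch is what the PyTorch warning in the original post describes as most likely harmless. A major mismatch is the case more likely to need a matching PyTorch build or toolkit.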

CYYJL commented 6 months ago

Hi, your download is timing out. You can use the wheel URL to download the causal-conv1d package manually and then install it offline.
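The "Guessing wheel URL" line in the log already contains everything needed for the offline route: the file name encodes the package version, CUDA tag, PyTorch version, C++ ABI flag, Python tag, and platform. Here is a rough sketch that pulls those fields apart so they can be checked against the local environment before downloading by hand. The regular expression is inferred only from the URLs shown in this thread, and `parse_wheel_url` is a hypothetical helper, not part of causal-conv1d.

```python
import re
from urllib.parse import urlparse

# Pattern inferred from the wheel names printed in this thread (an assumption,
# not an official naming specification).
WHEEL_RE = re.compile(
    r"causal_conv1d-(?P<version>[\d.]+)"
    r"\+cu(?P<cuda>\d+)"
    r"torch(?P<torch>[\d.]+)"
    r"cxx11abi(?P<cxx11abi>TRUE|FALSE)"
    r"-(?P<python>cp\d+)-(?P<abi>cp\d+)-(?P<platform>\w+)\.whl$"
)


def parse_wheel_url(url: str) -> dict:
    """Split a guessed wheel URL into its version and platform fields."""
    filename = urlparse(url).path.rsplit("/", 1)[-1]
    match = WHEEL_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized wheel name: {filename}")
    return match.groupdict()


url = (
    "https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.1.1/"
    "causal_conv1d-1.1.1+cu122torch2.1cxx11abiFALSE-cp38-cp38-linux_x86_64.whl"
)
for field, value in parse_wheel_url(url).items():
    print(f"{field}: {value}")
```

Once the matching `.whl` file has been fetched on a machine with network access, it can be installed by passing its local path to `pip install`.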

wuliwuxin commented 6 months ago
CAUSAL_CONV1D_FORCE_BUILD=TRUE CAUSAL_CONV1D_SKIP_CUDA_BUILD=TRUE CAUSAL_CONV1D_FORCE_CXX11_ABI=TRUE pip install .

Success!
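The three variables in this command appear to be switches for the causal-conv1d build. Below is a hedged sketch of how such switches are commonly interpreted, assuming (not verified here) that the build script treats only the exact string `TRUE` as enabled. `read_build_flags` is a hypothetical helper for inspecting the current shell, not the project's own code.

```python
import os

FLAG_NAMES = (
    "CAUSAL_CONV1D_FORCE_BUILD",
    "CAUSAL_CONV1D_SKIP_CUDA_BUILD",
    "CAUSAL_CONV1D_FORCE_CXX11_ABI",
)


def read_build_flags(env=None) -> dict:
    """Report which build switches are enabled in the given environment.

    Assumption: a switch counts as enabled only when its value is exactly 'TRUE'.
    """
    env = os.environ if env is None else env
    return {name: env.get(name, "FALSE") == "TRUE" for name in FLAG_NAMES}


# Show the switches as seen by the current process.
for name, enabled in read_build_flags().items():
    print(f"{name} = {enabled}")
```

If that assumption holds, `CAUSAL_CONV1D_SKIP_CUDA_BUILD=TRUE` would mean the CUDA extension is not compiled at all, so an install that succeeds with this command may not include the CUDA kernel. Importing the package afterwards is a cheap way to check.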

jiaoaoshirenjinbu commented 4 months ago
it works!

ZhaiJiaKai commented 3 months ago

@duncanriach Thank you! This solution is effective.

Building on @hrbigelow's instructions above, in order to get the `mamba-ssm` package pip installed, I did the following inside an instance of container image `nvcr.io/nvidia/pytorch:23.12-py3`. I also confirmed that it worked using `docker.io/pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel`.

$ git clone https://github.com/Dao-AILab/causal-conv1d.git
$ cd causal-conv1d
$ git checkout v1.1.1 # current latest version tag
$ CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .
$ cd ..
$ git clone https://github.com/state-spaces/mamba.git
$ cd mamba
$ git checkout v1.1.1 # current latest version tag
$ pip install .

Note that if you're accessing the cloned directories on a disk outside your container, you will need to clean/purge those directories before building inside different container versions. Otherwise, stale code and settings will cause the build to not work properly in your new container. This tends to show up as dynamic linking errors when importing the `mamba_ssm` package into Python. To clean/purge, run `rm -rf *.egg-info build` in both the `causal-conv1d` clone directory and the `mamba` clone directory.

Checking the installation:

$ pip show causal-conv1d
Name: causal-conv1d
Version: 1.1.1
Summary: Causal depthwise conv1d in CUDA, with a PyTorch interface
Home-page: https://github.com/Dao-AILab/causal-conv1d
Author: Tri Dao
Author-email: tri@tridao.me
License: UNKNOWN
Location: /usr/local/lib/python3.10/dist-packages
Requires: packaging, torch, buildtools, ninja
Required-by: mamba-ssm

$ pip show mamba-ssm
Name: mamba-ssm
Version: 1.1.1
Summary: Mamba state-space model
Home-page: https://github.com/state-spaces/mamba
Author: Tri Dao, Albert Gu
Author-email: tri@tridao.me, agu@cs.cmu.edu
License: UNKNOWN
Location: /usr/local/lib/python3.10/dist-packages
Requires: einops, causal-conv1d, transformers, torch, packaging, ninja, triton
Required-by:

$ python
>>> import torch
>>> from mamba_ssm import Mamba

# no errors
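The clean/purge step described earlier in this comment (removing `build` and the egg-info directory from each clone before rebuilding in a different container) could also be scripted. Here is a minimal sketch, assuming setuptools' usual `<name>.egg-info` directory naming. `purge_build_artifacts` is a hypothetical helper, and the clone paths passed to it are whatever you used locally.

```python
import shutil
from pathlib import Path


def purge_build_artifacts(repo_dir) -> list:
    """Delete `build` and any `*.egg-info` directory directly under repo_dir.

    Returns the sorted names of the directories that were removed.
    """
    root = Path(repo_dir)
    targets = [root / "build", *root.glob("*.egg-info")]
    removed = []
    for path in targets:
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path.name)
    return sorted(removed)
```

Calling it once per clone directory, for example `purge_build_artifacts("causal-conv1d")` and `purge_build_artifacts("mamba")`, leaves the source tree untouched and only drops the stale build outputs.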

Sometimes this solution may not work at all: even with `CAUSAL_CONV1D_FORCE_BUILD=TRUE`, the error shown below still appears.

That's because the error has nothing to do with `CAUSAL_CONV1D_FORCE_BUILD=TRUE` in this case.

In that case, consider reloading your conda environment with `conda deactivate` and starting again with simply `pip install -e .`. That should solve the problem.

          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 227 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 151 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 163 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 122 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 86 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 190 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 128 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 228 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 152 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 168 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfENS1_8BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 119 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 104 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 140 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 120 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c108BFloat16EELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 88 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 184 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 126 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 227 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 89 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 151 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 163 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16ENS1_4HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 122 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 86 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 190 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 128 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 228 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 152 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 168 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfES2_ELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 119 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 104 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 140 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 120 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EfN3c104HalfEELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 88 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 185 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 126 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 226 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 89 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 151 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 163 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 92 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c108BFloat16EfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 121 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 190 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 128 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 228 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 89 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 152 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 168 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 96 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EN3c104HalfEfELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 119 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 66 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 104 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi4ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 140 registers, 19728 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi3ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 123 registers, 19296 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 82 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb0ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 70 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb0EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 77 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z36causal_conv1d_channellast_bwd_kernelI43Causal_conv1d_channellast_bwd_kernel_traitsILi128ELi2ELi64ELb1ELb1EffELb1EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 86 registers, 18864 bytes smem, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 45 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 55 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 38 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 93 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 39 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfENS1_8BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c108BFloat16EEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 79 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 45 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 55 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 38 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 87 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 39 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16ENS1_4HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 62 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfES2_EEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EfN3c104HalfEEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 80 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 94 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 71 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 93 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 39 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c108BFloat16EfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 64 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 72 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EN3c104HalfEfEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb0ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi4ELb1ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb0ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi3ELb1ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 48 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb0EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 56 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb0ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 32 registers, 528 bytes cmem[0]
      ptxas info    : Compiling entry function '_Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EffEEv13ConvParamsBwd' for 'sm_70'
      ptxas info    : Function properties for _Z24causal_conv1d_bwd_kernelI31Causal_conv1d_bwd_kernel_traitsILi128ELi2ELb1ELb1EffEEv13ConvParamsBwd
          0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
      ptxas info    : Used 40 registers, 528 bytes cmem[0]
      /gpfs/users/a1796450/default_studio/motion-latent-diffusion/causal-conv1d/csrc/causal_conv1d_bwd.cu(84): warning #2912-D: constexpr if statements are a C++17 feature

      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
          subprocess.run(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/subprocess.py", line 526, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/gpfs/users/a1796450/default_studio/motion-latent-diffusion/causal-conv1d/setup.py", line 223, in <module>
          setup(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/__init__.py", line 103, in setup
          return distutils.core.setup(**attrs)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/gpfs/users/a1796450/default_studio/motion-latent-diffusion/causal-conv1d/setup.py", line 198, in run
          return super().run()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 364, in run
          self.run_command("build")
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 131, in run
          self.run_command(cmd_name)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 88, in run
          _build_ext.run(self)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
          self.build_extensions()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
          build_ext.build_extensions(self)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
          self._build_extensions_serial()
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
          self.build_extension(ext)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 249, in build_extension
          _build_ext.build_extension(self, ext)
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
          objects = self.compiler.compile(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 658, in unix_wrap_ninja_compile
          _write_ninja_file_and_compile_objects(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1573, in _write_ninja_file_and_compile_objects
          _run_ninja_build(
        File "/gpfs/users/a1796450/anaconda3/envs/motionmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
          raise RuntimeError(message) from e
      RuntimeError: Error compiling objects for extension
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for causal-conv1d
  Running setup.py clean for causal-conv1d
Failed to build causal-conv1d
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects

I have installed causal_conv1d, but when I call it I get "No module named 'causal_conv1d_cuda'". Why is that?
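One way to narrow this down is to check whether Python can locate both the pure-Python package and the compiled extension. The build log names the extension `causal_conv1d_cuda`, so if `causal_conv1d` is found but `causal_conv1d_cuda` is not, the CUDA build step probably did not produce or install the extension. This is only a diagnostic sketch; the helper name `find_modules` is made up for illustration.

```python
import importlib.util


def find_modules(names):
    """Map each top-level module name to True if Python can locate it.

    Uses find_spec, so nothing is actually imported (and no CUDA code runs).
    """
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-initialised modules.
            found[name] = False
    return found


if __name__ == "__main__":
    # Module names taken from the build log above.
    for name, ok in find_modules(["causal_conv1d", "causal_conv1d_cuda"]).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Run it inside the same environment you installed into. If the extension is reported missing, the install most likely fell back to a state where only the Python files were copied.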

luispc commented 3 months ago

I am facing the same problem. I am trying to install it in a conda environment on Windows 11, and I am running the conda environment inside the "x64 Native tools..." command line. CUDA is installed with the same version as torch. Any ideas about how to solve this? Is causal-conv1d really necessary to run mamba?
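Since the first log in this thread shows a toolkit/PyTorch mismatch warning (11.8 vs 11.7), it may be worth confirming the two versions from inside the environment rather than by memory. The sketch below compares the release reported by `nvcc --version` with `torch.version.cuda`. The helper names are hypothetical, and it assumes the `nvcc` on `PATH` is the one the build uses, which may not hold if `CUDA_HOME` points elsewhere.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_cuda_version():
    """CUDA version of the nvcc found on PATH, or None if nvcc is absent."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """CUDA version PyTorch was built with, or None (no torch / CPU-only build)."""
    try:
        import torch
    except ImportError:
        return None
    version = torch.version.cuda  # None for CPU-only builds
    if version is None:
        return None
    major, minor = version.split(".")[:2]
    return (int(major), int(minor))


if __name__ == "__main__":
    print("nvcc toolkit CUDA:", toolkit_cuda_version())
    print("torch built with CUDA:", torch_cuda_version())
```

If either value prints as `None`, or the two tuples differ, that is a reasonable first thing to fix before retrying the build.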