salesforce / warp-drive

Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning Framework on a GPU (JMLR 2022)
BSD 3-Clause "New" or "Revised" License

Failure when running test script #79

Closed RongjianLiang closed 1 year ago

RongjianLiang commented 1 year ago

Hi! I encounter a failure when running the test script with the following command: python utils/unittests/run_unittests_pycuda.py. Note that I am running this inside the ~/warp-drive/warp_drive directory. The test results are attached at the bottom. As you can see, there is one failure, in the TestEnvironmentReset.test_reset_for_different_dim function. How does this error affect the package, and how should I troubleshoot it? I ask because I plan to use WarpDrive to develop a customized multi-agent RL environment that runs on a single GPU device.

I cloned and installed WarpDrive from GitHub inside a separate miniconda environment. The OS I am using is Ubuntu 22.04, and I have the NVIDIA driver and CUDA toolkit installed. I have tested torch.cuda.is_available() in this separate environment, and it returns True.

As for the other two test commands, they pass without errors, with only some deprecation warnings and skipped tests for multiple GPUs (which is fine, since I only have a single GPU).

This is my first time opening an issue, and many thanks for your assistance! I would be happy to provide more information on request. After the test results, I have also attached the list of packages installed in the same environment.

Running Unit tests: pytest /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests 
============================= test session starts ==============================
platform linux -- Python 3.7.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/hibiki
collected 8 items                                                              

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests/test_action_sampler.py . [ 12%]
..                                                                       [ 37%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests/test_data_manager.py . [ 50%]
..                                                                       [ 75%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests/test_env_reset.py . [ 87%]
                                                                         [ 87%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/warp_drive/pycuda_tests/test_function_manager.py . [100%]

============================== 8 passed in 5.07s ===============================
Running Unit tests: pytest /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests 
============================= test session starts ==============================
platform linux -- Python 3.7.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/hibiki
collected 5 items                                                              

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_continuous.py . [ 20%]
                                                                         [ 20%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_gridworld.py . [ 40%]
                                                                         [ 40%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_gridworld_step_cuda.py . [ 60%]
                                                                         [ 60%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_gridworld_step_python.py . [ 80%]
.                                                                        [100%]

=============================== warnings summary ===============================
miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_continuous.py: 24 warnings
  /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/gym/utils/seeding.py:42: DeprecationWarning: WARN: Function `rng.rand(*size)` is marked as deprecated and will be removed in the future. Please use `Generator.random(size)` instead.
    "Function `rng.rand(*size)` is marked as deprecated "

miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/example_envs/pycuda_tests/test_tag_continuous.py: 8000 warnings
  /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/gym/utils/seeding.py:64: DeprecationWarning: WARN: Function `rng.randint(low, [high, size, dtype])` is marked as deprecated and will be removed in the future. Please use `rng.integers(low, [high, size, dtype])` instead.
    "Function `rng.randint(low, [high, size, dtype])` is marked as deprecated "

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================= 5 passed, 8024 warnings in 6.67s =======================
/home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/warp_drive/cuda_includes/../../example_envs/tag_gridworld/tag_gridworld_step_pycuda.cu(151): warning #2361-D: invalid narrowing conversion from "unsigned int" to "int"
      int global_state_arr_shape[] = {gridDim.x, wkNumberAgents};
                                      ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

Running Unit tests: pytest /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests 
============================= test session starts ==============================
platform linux -- Python 3.7.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/hibiki
collected 5 items                                                              

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_action_sampler_multiblocks.py . [ 20%]
..                                                                       [ 60%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_env_reset_multiblocks.py F [ 80%]
                                                                         [ 80%]
../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_function_manager_multiblocks.py . [100%]

=================================== FAILURES ===================================
______________ TestEnvironmentReset.test_reset_for_different_dim _______________

self = <tests.multiblocks_per_env.warp_drive.pycuda_tests.test_env_reset_multiblocks.TestEnvironmentReset testMethod=test_reset_for_different_dim>

    def test_reset_for_different_dim(self):

        self.dm.data_on_device_via_torch("_done_")[:] = torch.from_numpy(
            np.array([1, 0])
        ).cuda()

        done = self.dm.pull_data_from_device("_done_")
        self.assertSequenceEqual(list(done), [1, 0])

        data_feed = DataFeed()
        data_feed.add_data(
            name="a", data=np.random.randn(2, 10, 3), save_copy_and_apply_at_reset=True
        )
        data_feed.add_data(
            name="b", data=np.random.randn(2, 10), save_copy_and_apply_at_reset=True
        )
        data_feed.add_data(
            name="c", data=np.random.randn(2), save_copy_and_apply_at_reset=True
        )
        data_feed.add_data(
            name="d",
            data=np.random.randint(10, size=(2, 10, 3), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        data_feed.add_data(
            name="e",
            data=np.random.randint(10, size=(2, 10), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        data_feed.add_data(
            name="f",
            data=np.random.randint(10, size=2, dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )

        self.dm.push_data_to_device(data_feed)

        torch_data_feed = DataFeed()
        torch_data_feed.add_data(
            name="at", data=np.random.randn(2, 10, 3), save_copy_and_apply_at_reset=True
        )
        torch_data_feed.add_data(
            name="bt", data=np.random.randn(2, 10), save_copy_and_apply_at_reset=True
        )
        torch_data_feed.add_data(
            name="ct", data=np.random.randn(2), save_copy_and_apply_at_reset=True
        )
        torch_data_feed.add_data(
            name="dt",
            data=np.random.randint(10, size=(2, 10, 3), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        torch_data_feed.add_data(
            name="et",
            data=np.random.randint(10, size=(2, 10), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        torch_data_feed.add_data(
            name="ft",
            data=np.random.randint(10, size=2, dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        self.dm.push_data_to_device(torch_data_feed, torch_accessible=True)

        a = self.dm.pull_data_from_device("a")
        b = self.dm.pull_data_from_device("b")
        c = self.dm.pull_data_from_device("c")
        d = self.dm.pull_data_from_device("d")
        e = self.dm.pull_data_from_device("e")
        f = self.dm.pull_data_from_device("f")
        at = self.dm.pull_data_from_device("at")
        bt = self.dm.pull_data_from_device("bt")
        ct = self.dm.pull_data_from_device("ct")
        dt = self.dm.pull_data_from_device("dt")
        et = self.dm.pull_data_from_device("et")
        ft = self.dm.pull_data_from_device("ft")

        # change the value in place
        self.dm.data_on_device_via_torch("at")[:] = torch.rand(2, 10, 3).cuda()
        self.dm.data_on_device_via_torch("bt")[:] = torch.rand(2, 10).cuda()
        self.dm.data_on_device_via_torch("ct")[:] = torch.rand(2).cuda()
        self.dm.data_on_device_via_torch("dt")[:] = torch.randint(
            10, size=(2, 10, 3)
        ).cuda()
        self.dm.data_on_device_via_torch("et")[:] = torch.randint(
            10, size=(2, 10)
        ).cuda()
        self.dm.data_on_device_via_torch("ft")[:] = torch.randint(10, size=(2,)).cuda()

        self.resetter.reset_when_done(self.dm)

        a_after_reset = self.dm.pull_data_from_device("a")
        b_after_reset = self.dm.pull_data_from_device("b")
        c_after_reset = self.dm.pull_data_from_device("c")
        d_after_reset = self.dm.pull_data_from_device("d")
        e_after_reset = self.dm.pull_data_from_device("e")
        f_after_reset = self.dm.pull_data_from_device("f")

        at_after_reset = self.dm.pull_data_from_device("at")
        bt_after_reset = self.dm.pull_data_from_device("bt")
        ct_after_reset = self.dm.pull_data_from_device("ct")
        dt_after_reset = self.dm.pull_data_from_device("dt")
        et_after_reset = self.dm.pull_data_from_device("et")
        ft_after_reset = self.dm.pull_data_from_device("ft")

        self.assertTrue(np.absolute((a - a_after_reset).mean()) < 1e-5)
        self.assertTrue(np.absolute((b - b_after_reset).mean()) < 1e-5)
        self.assertTrue(np.absolute((c - c_after_reset).mean()) < 1e-5)
        self.assertTrue(np.count_nonzero(d - d_after_reset) == 0)
        self.assertTrue(np.count_nonzero(e - e_after_reset) == 0)
        self.assertTrue(np.count_nonzero(f - f_after_reset) == 0)

        # so after the soft reset, only env_0 got reset because it has done flag on
        self.assertTrue(np.absolute((at - at_after_reset)[0].mean()) < 1e-5)
        self.assertTrue(np.absolute((bt - bt_after_reset)[0].mean()) < 1e-5)
        self.assertTrue(np.absolute((ct - ct_after_reset)[0].mean()) < 1e-5)
        self.assertTrue(np.absolute((at - at_after_reset)[1].mean()) > 1e-5)
        self.assertTrue(np.absolute((bt - bt_after_reset)[1].mean()) > 1e-5)
        self.assertTrue(np.absolute((ct - ct_after_reset)[1].mean()) > 1e-5)
        self.assertTrue(np.count_nonzero((dt - dt_after_reset)[0]) == 0)
        self.assertTrue(np.count_nonzero((et - et_after_reset)[0]) == 0)
        self.assertTrue(np.count_nonzero((ft - ft_after_reset)[0]) == 0)
        self.assertTrue(np.count_nonzero((dt - dt_after_reset)[1]) > 0)
        self.assertTrue(np.count_nonzero((et - et_after_reset)[1]) > 0)
        self.assertTrue(np.count_nonzero((ft - ft_after_reset)[1]) >= 0)

        done = self.dm.pull_data_from_device("_done_")
        self.assertSequenceEqual(list(done), [0, 0])

        # Now test if mode="force_reset" works

        torch_data_feed2 = DataFeed()
        torch_data_feed2.add_data(
            name="af", data=np.random.randn(2, 10, 3), save_copy_and_apply_at_reset=True
        )
        torch_data_feed2.add_data(
            name="bf", data=np.random.randn(2, 10), save_copy_and_apply_at_reset=True
        )
        torch_data_feed2.add_data(
            name="cf", data=np.random.randn(2), save_copy_and_apply_at_reset=True
        )
        torch_data_feed2.add_data(
            name="df",
            data=np.random.randint(10, size=(2, 10, 3), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        torch_data_feed2.add_data(
            name="ef",
            data=np.random.randint(10, size=(2, 10), dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        torch_data_feed2.add_data(
            name="ff",
            data=np.random.randint(10, size=2, dtype=np.int32),
            save_copy_and_apply_at_reset=True,
        )
        self.dm.push_data_to_device(torch_data_feed2, torch_accessible=True)

        af = self.dm.pull_data_from_device("af")
        bf = self.dm.pull_data_from_device("bf")
        cf = self.dm.pull_data_from_device("cf")
        df = self.dm.pull_data_from_device("df")
        ef = self.dm.pull_data_from_device("ef")
        ff = self.dm.pull_data_from_device("ff")

        # change the value in place
        self.dm.data_on_device_via_torch("af")[:] = torch.rand(2, 10, 3).cuda()
        self.dm.data_on_device_via_torch("bf")[:] = torch.rand(2, 10).cuda()
        self.dm.data_on_device_via_torch("cf")[:] = torch.rand(2).cuda()
        self.dm.data_on_device_via_torch("df")[:] = torch.randint(
            10, size=(2, 10, 3)
        ).cuda()
        self.dm.data_on_device_via_torch("ef")[:] = torch.randint(
            10, size=(2, 10)
        ).cuda()
        self.dm.data_on_device_via_torch("ff")[:] = torch.randint(10, size=(2,)).cuda()

        self.resetter.reset_when_done(self.dm)

        af_after_soft_reset = self.dm.pull_data_from_device("af")
        bf_after_soft_reset = self.dm.pull_data_from_device("bf")
        cf_after_soft_reset = self.dm.pull_data_from_device("cf")
        df_after_soft_reset = self.dm.pull_data_from_device("df")
        ef_after_soft_reset = self.dm.pull_data_from_device("ef")
        ff_after_soft_reset = self.dm.pull_data_from_device("ff")

        self.assertTrue(np.absolute((af - af_after_soft_reset).mean()) > 1e-5)
        self.assertTrue(np.absolute((bf - bf_after_soft_reset).mean()) > 1e-5)
        self.assertTrue(np.absolute((cf - cf_after_soft_reset).mean()) > 1e-5)
        self.assertTrue(np.count_nonzero(df - df_after_soft_reset) > 0)
        self.assertTrue(np.count_nonzero(ef - ef_after_soft_reset) > 0)
>       self.assertTrue(np.count_nonzero(ff - ff_after_soft_reset) > 0)
E       AssertionError: False is not true

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_env_reset_multiblocks.py:236: AssertionError
------------------------------ Captured log setup ------------------------------
WARNING  root:function_manager.py:57 
                `num_agents` cannot be divisible by `blocks_per_env`.
                Therefore, the running threads for the last block could
                possibly EXCEED the boundaries of the output arrays and
                incurs index our-of-range bugs.
                Consider to have a proper thread index boundary check,
                for example if you have already checked
                `if (kThisAgentId < NumAgents)`, please ignore this warning.
------------------------------ Captured log call -------------------------------
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'a' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'b' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'c' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'at' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'bt' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'ct' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'af' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'bf' from type float64 to float32
WARNING  root:data_manager.py:427 PyCUDADataManager casts the data 'cf' from type float64 to float32
=========================== short test summary info ============================
FAILED ../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/warp_drive/pycuda_tests/test_env_reset_multiblocks.py::TestEnvironmentReset::test_reset_for_different_dim - AssertionError: False is not true
========================= 1 failed, 4 passed in 3.85s ==========================
Running Unit tests: pytest /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/example_envs/pycuda_tests 
============================= test session starts ==============================
platform linux -- Python 3.7.16, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/hibiki
collected 1 item                                                               

../../miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/example_envs/pycuda_tests/test_tag_continuous_multiblocks.py . [100%]

=============================== warnings summary ===============================
miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/example_envs/pycuda_tests/test_tag_continuous_multiblocks.py: 24 warnings
  /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/gym/utils/seeding.py:42: DeprecationWarning: WARN: Function `rng.rand(*size)` is marked as deprecated and will be removed in the future. Please use `Generator.random(size)` instead.
    "Function `rng.rand(*size)` is marked as deprecated "

miniconda3/envs/warp_drive/lib/python3.7/site-packages/tests/multiblocks_per_env/example_envs/pycuda_tests/test_tag_continuous_multiblocks.py: 8000 warnings
  /home/hibiki/miniconda3/envs/warp_drive/lib/python3.7/site-packages/gym/utils/seeding.py:64: DeprecationWarning: WARN: Function `rng.randint(low, [high, size, dtype])` is marked as deprecated and will be removed in the future. Please use `rng.integers(low, [high, size, dtype])` instead.
    "Function `rng.randint(low, [high, size, dtype])` is marked as deprecated "

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================= 1 passed, 8024 warnings in 6.55s =======================

Running conda list with the environment activated returns the following:

# packages in environment at /home/hibiki/miniconda3/envs/warp_drive:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                  2_kmp_llvm    conda-forge
aiohttp                   3.8.4                    pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
async-timeout             4.0.2                    pypi_0    pypi
asynctest                 0.13.0                   pypi_0    pypi
attrs                     23.1.0                   pypi_0    pypi
boost-cpp                 1.78.0               h6582d0a_3    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2023.5.7             hbcca054_0    conda-forge
certifi                   2023.5.7           pyhd8ed1ab_0    conda-forge
charset-normalizer        3.1.0                    pypi_0    pypi
cloudpickle               2.2.1                    pypi_0    pypi
cudatoolkit               11.8.0              h37601d7_11    conda-forge
cycler                    0.11.0                   pypi_0    pypi
exceptiongroup            1.1.1                    pypi_0    pypi
fonttools                 4.38.0                   pypi_0    pypi
frozenlist                1.3.3                    pypi_0    pypi
fsspec                    2023.1.0                 pypi_0    pypi
gym                       0.25.2                   pypi_0    pypi
gym-notices               0.0.8                    pypi_0    pypi
icu                       72.1                 hcb278e6_0    conda-forge
idna                      3.4                      pypi_0    pypi
importlib-metadata        6.6.0                    pypi_0    pypi
iniconfig                 2.0.0                    pypi_0    pypi
kiwisolver                1.4.4                    pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1  
libblas                   3.9.0           16_linux64_openblas    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libffi                    3.4.4                h6a678d5_0  
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
lightning-utilities       0.8.0                    pypi_0    pypi
llvm-openmp               16.0.4               h4dfa4b3_0    conda-forge
llvmlite                  0.39.1                   pypi_0    pypi
mako                      1.2.0              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.1            py37h540881e_1    conda-forge
matplotlib                3.5.3                    pypi_0    pypi
multidict                 6.0.4                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0  
numba                     0.56.4                   pypi_0    pypi
numpy                     1.21.6           py37h976b520_0    conda-forge
openssl                   1.1.1t               h0b41bf4_0    conda-forge
packaging                 23.1                     pypi_0    pypi
pillow                    9.5.0                    pypi_0    pypi
pip                       22.3.1           py37h06a4308_0  
platformdirs              3.5.1              pyhd8ed1ab_0    conda-forge
pluggy                    1.0.0                    pypi_0    pypi
pycuda                    2022.1           py37h790c342_1    conda-forge
pyparsing                 3.0.9                    pypi_0    pypi
pytest                    7.3.1                    pypi_0    pypi
python                    3.7.16               h7a1cb2a_0  
python-dateutil           2.8.2                    pypi_0    pypi
python_abi                3.7                     2_cp37m    conda-forge
pytools                   2022.1.14          pyhd8ed1ab_0    conda-forge
pytorch-lightning         1.9.5                    pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
readline                  8.2                  h5eee18b_0  
requests                  2.31.0                   pypi_0    pypi
rl-warp-drive             2.3                      pypi_0    pypi
setuptools                65.6.3           py37h06a4308_0  
six                       1.16.0                   pypi_0    pypi
sqlite                    3.41.2               h5eee18b_0  
tk                        8.6.12               h1ccaba5_0  
tomli                     2.0.1                    pypi_0    pypi
torch                     1.10.2                   pypi_0    pypi
torchmetrics              0.11.4                   pypi_0    pypi
tqdm                      4.65.0                   pypi_0    pypi
typing-extensions         4.6.2                hd8ed1ab_0    conda-forge
typing_extensions         4.6.2              pyha770c72_0    conda-forge
urllib3                   2.0.2                    pypi_0    pypi
wheel                     0.38.4           py37h06a4308_0  
xz                        5.4.2                h5eee18b_0  
yarl                      1.9.2                    pypi_0    pypi
zipp                      3.15.0                   pypi_0    pypi
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h3eb15da_6    conda-forge

Emerald01 commented 1 year ago

I could not reproduce the error on my side; the test passes for me. Could you run it again to see whether the failure is reproducible? By the way, starting with version 2.0 we are slowly deprecating the "pycuda" kernel, because it is hard to maintain and difficult for average users to develop with (although it may be the fastest). Instead, we think the "Numba" kernel will be a better choice for most future usage.

RongjianLiang commented 1 year ago

Thanks for the reply! I reran the tests with that command and they all pass. I note the deprecation warnings from the pycuda kernel tests, since pycuda is being phased out. Thanks also for the additional info on the change in kernel support! I may be too junior right now to utilize the backend fully, but I hope to find it helpful in the future...
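
A failure that disappears on a rerun is consistent with a chance collision inside the test itself, though nothing in this thread confirms that as the cause. In the failing assertion, ff holds two integers drawn by np.random.randint(10, size=2), and the test then overwrites them on the device with two fresh draws from torch.randint(10, size=(2,)). Assuming no environment is reset at that point (the _done_ flags were just asserted to be [0, 0]), the check np.count_nonzero(ff - ff_after_soft_reset) > 0 fails whenever both new values happen to equal the old ones, which should occur in roughly 1 out of 100 runs. The sketch below estimates that rate with NumPy alone; the function names are illustrative and are not part of WarpDrive.

```python
import numpy as np


def collision_probability(num_values=10, size=2):
    """Chance that `size` independent uniform draws all match the originals."""
    return (1.0 / num_values) ** size


def simulate_collision_rate(trials=200_000, num_values=10, size=2, seed=0):
    """Monte Carlo estimate of how often the overwrite leaves the array unchanged."""
    rng = np.random.default_rng(seed)
    # Stand-in for the original data pushed to the device (like `ff`).
    original = rng.integers(num_values, size=(trials, size))
    # Stand-in for the in-place overwrite done with torch.randint in the test.
    overwritten = rng.integers(num_values, size=(trials, size))
    # The test asserts count_nonzero(original - overwritten) > 0;
    # it fails in exactly the trials where that count is zero.
    unchanged = np.count_nonzero(original - overwritten, axis=1) == 0
    return unchanged.mean()


if __name__ == "__main__":
    print("analytic :", collision_probability())
    print("simulated:", simulate_collision_rate())
```

The earlier assertion on ft in the same test uses >= 0 rather than > 0 for its single-element slice, which would tolerate exactly this kind of coincidence.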

Emerald01 commented 1 year ago

Sure, yes. Even if that test fails, I do not think it is a problem, as it only covers multiblocks testing, and there is a 99% chance that most users will never use that feature.
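
If an occasional random failure in a check of this kind is unwanted, one option is to overwrite each value with one that is guaranteed to differ from the original, rather than drawing an independent random value. The following is a minimal sketch of that idea using NumPy only. different_values is a hypothetical helper, not something in the WarpDrive test suite, and the maintainers have not proposed this change in the thread.

```python
import numpy as np


def different_values(original, num_values=10, rng=None):
    """Return an array of the same shape and dtype whose every entry differs.

    Assumes all entries of `original` are integers in [0, num_values).
    """
    rng = np.random.default_rng() if rng is None else rng
    # An offset in [1, num_values - 1] is never a multiple of num_values,
    # so (x + offset) % num_values can never equal x.
    offset = rng.integers(1, num_values, size=original.shape)
    return ((original + offset) % num_values).astype(original.dtype)


if __name__ == "__main__":
    original = np.random.randint(10, size=2, dtype=np.int32)
    replacement = different_values(original)
    # Every element differs, so a "> 0" check on the difference always holds.
    print(np.count_nonzero(original - replacement) > 0)
```

In a test like the one above, the replacement array could then be converted to a tensor and written to the device in place of the torch.randint draw, so the "values changed" assertions no longer depend on chance.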