evalcrafter / EvalCrafter

[CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
http://evalcrafter.github.io

error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device #10

Closed: fjscfy closed this issue 1 month ago

fjscfy commented 4 months ago

When I ran the color_score, count_score, and detection_score scripts in the Docker image, I got the error "error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device". What causes this?

Yaofang-Liu commented 4 months ago

Hi, fjscfy. The error "no kernel image is available for execution on the device" typically means that the CUDA kernels in the Docker image were not compiled for the architecture (compute capability) of your GPU, that is, a mismatch between the CUDA build in the image and your hardware. It can also indicate that the Docker container is not properly configured to access the GPU.

Here’s a summary of the possible solutions and steps to resolve the issue:

  1. Ensure Docker Access to GPU:

    • Make sure you're using the correct Docker runtime options to enable GPU access. For newer Docker installations, use --gpus all to ensure that all available GPUs on the host are accessible to the Docker container. For example:
      docker run --gpus all -it --shm-size "16G" <image_name>
    • If you are using an older version of Docker or NVIDIA Docker, you might need --runtime=nvidia instead.
  2. Check CUDA and GPU Compatibility:

    • Verify that the CUDA version in the Docker container (we used CUDA 11.7) supports your GPU’s compute capability. You can look up the compute capability for your GPU model on NVIDIA's website; running nvidia-smi on the host shows the model and the installed driver version.
  3. Recompile CUDA Code:

    • If the prebuilt kernels do not cover your GPU, consider recompiling the CUDA kernels within your Docker image to match your GPU’s compute capability. This involves adjusting the build scripts or Dockerfile to include the appropriate CUDA architecture flags (-gencode arch=compute_XX,code=sm_XX).
  4. Update NVIDIA Drivers:

    • Ensure that the NVIDIA drivers on your host are recent enough to meet the requirements of CUDA 11.7.
  5. Test CUDA Setup in Docker:

    • Run CUDA sample programs like deviceQuery inside the Docker container to check if the setup correctly recognizes and uses the GPU.
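
The architecture flags from step 3 can be sketched in shell as follows. This is a minimal sketch under stated assumptions, not EvalCrafter's actual build script: gencode_flag is a hypothetical helper that turns a compute capability such as 8.6 into the -gencode flag mentioned above, and the nvidia-smi query shown in the comment assumes a driver recent enough to expose the compute_cap field.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of EvalCrafter): build the nvcc -gencode
# flag for one compute capability, e.g. "8.6" -> compute_86 / sm_86.
gencode_flag() {
  local cap="$1"           # compute capability, e.g. "8.6"
  local arch="${cap//./}"  # drop the dot: "86"
  printf -- '-gencode arch=compute_%s,code=sm_%s\n' "$arch" "$arch"
}

# Example with a hard-coded capability (8.6 is what an RTX 3090 reports).
gencode_flag "8.6"
# prints: -gencode arch=compute_86,code=sm_86

# On a machine with a GPU, the capability could be queried instead
# (assumption: the driver supports the compute_cap query field):
#   gencode_flag "$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n1)"
```

If the kernels are built through PyTorch's extension tooling, setting the TORCH_CUDA_ARCH_LIST environment variable (for example TORCH_CUDA_ARCH_LIST="8.6") before rebuilding is the usual way to pass the same information. Whether the image builds ms_deformable_im2col_cuda that way is an assumption to verify against the Dockerfile.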

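The driver check in step 4 could look roughly like the sketch below. The helpers version_ge and check_driver are hypothetical names, the comparison relies on GNU sort -V, and the 515.43.04 threshold is an assumption based on the Linux driver listed for CUDA 11.7 GA in NVIDIA's release notes; confirm it against NVIDIA's compatibility table before relying on it.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when dotted version A >= version B.
# Relies on GNU sort -V (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Assumed minimum Linux driver for CUDA 11.7 GA; verify in NVIDIA's docs.
MIN_DRIVER="515.43.04"

# check_driver VERSION: report whether VERSION meets MIN_DRIVER.
check_driver() {
  local driver="$1"
  if version_ge "$driver" "$MIN_DRIVER"; then
    echo "ok: driver $driver >= $MIN_DRIVER"
  else
    echo "too old: driver $driver < $MIN_DRIVER"
  fi
}

# Examples with hard-coded versions so the sketch runs without a GPU.
check_driver "525.105.17"
check_driver "470.82.01"

# On the host, the installed version could be read with:
#   check_driver "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
```

A driver that passes this check does not by itself rule out the "no kernel image" error, which depends on the architectures the kernels were compiled for.
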
If you continue to face issues after following these steps, could you please provide the following additional information?

  • The model of your NVIDIA GPU and the compute capability associated with it.
  • The specific Docker command you are using to start your container.
  • Any custom modifications or settings in your Dockerfile related to CUDA.

This information will help us provide a more targeted and effective solution. Hope this helps!

fjscfy commented 4 months ago

Thanks, I have solved the problem!