ROCm / HIP-CPU

An implementation of HIP that works on CPUs, across OSes.
MIT License

inconsistent warpSize in host and "device" code #27

Open jakub-homola opened 3 years ago

jakub-homola commented 3 years ago

The value of warpSize read from a hipDeviceProp_t variable differs from the value of the kernel builtin variable warpSize, which is very unexpected.

Consider the following HIP program:

#include <cstdio>
#include <hip/hip_runtime.h>

__global__ void print_warp_size_kernel()
{
    printf("DEVICE warpSize = %d\n", warpSize);
}

int main()
{
    hipDeviceProp_t prop;
    hipGetDeviceProperties(&prop, 0);
    printf("HOST   warpSize = %d\n", prop.warpSize);

    hipLaunchKernelGGL(print_warp_size_kernel, 1, 1, 0, 0);

    hipDeviceSynchronize();

    return 0;
}

I compiled it with the following command:

g++ -std=c++17 -I/home/jakub/apps/HIP-CPU/include source.hip.cpp -o program.x -ltbb -pthread

When I run the program, this is the output:

HOST   warpSize = 4
DEVICE warpSize = 64

I don't mind whether the value is 4, 64, or any other power of 2, since we should be writing wave-aware (warpSize-independent) code anyway. Still, the host and "device" values disagreeing seems really wrong.
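To illustrate what warpSize-independent code means here, below is a minimal host-only sketch in plain C++. It makes no HIP calls. The helpers warps_per_block and warp_reduce_sum are hypothetical names invented for this example, not part of HIP or HIP-CPU. The point is only that the warp width is passed in as a parameter instead of being hard-coded, so the same logic gives correct results for a width of 4 or 64.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a HIP API): number of warps needed to cover
// block_size threads for a given warp width, rounding up.
inline std::size_t warps_per_block(std::size_t block_size, std::size_t warp_width)
{
    assert(warp_width > 0);
    return (block_size + warp_width - 1) / warp_width;
}

// Hypothetical host-side model of a tree reduction across one warp.
// lanes.size() plays the role of warpSize and must be a power of two.
inline int warp_reduce_sum(std::vector<int> lanes)
{
    const std::size_t width = lanes.size();
    assert(width > 0 && (width & (width - 1)) == 0);

    // Halve the active range each step. Only lanes below offset are updated,
    // which is enough for lane 0 to end up holding the full sum.
    for (std::size_t offset = width / 2; offset > 0; offset /= 2) {
        for (std::size_t i = 0; i < offset; ++i) {
            lanes[i] += lanes[i + offset];
        }
    }
    return lanes[0];
}
```

Code written this way does not depend on which of the two values is the "right" one, but it does depend on the host and the kernel agreeing on a single value, which is exactly what the output above shows is not the case.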

I am using the current master branch of the HIP-CPU library.