Open staskh opened 2 years ago
This may be caused by the limit on m_capacity. When the program uses WorkQueue, the consume function calculates the number of available queues via
(uint32) m_capacity / grid_threads
Although grid_threads is derived from CUDA and the GPU, m_capacity is limited. The integer division therefore yields zero queues whenever grid_threads is larger than m_capacity.
You can try this modification in nvbio-master\nvbio\basic\cuda\work_queue_multipass_inl.h, line 233:
const uint32 n_tile_grids = m_capacity / grid_threads;
To:
const uint32 n_tile_grids = nvbio::max(m_capacity, grid_threads) / grid_threads;
// treat m_capacity as at least grid_threads, so n_tile_grids is never zero.
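To illustrate the arithmetic behind this suggestion, here is a minimal standalone sketch. The function names and the sample values (1024, 4096, 8192) are made up for illustration, and std::max stands in for nvbio::max; this is not the NVBIO source, only the two expressions quoted above, and it assumes grid_threads is nonzero.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-ins for the two expressions discussed in this thread.
// Both assume grid_threads != 0.

// Original expression: unsigned integer division truncates toward zero.
constexpr std::uint32_t tile_grids_original(std::uint32_t capacity,
                                            std::uint32_t grid_threads)
{
    return capacity / grid_threads;
}

// Suggested expression: clamp the numerator so the quotient is at least 1.
constexpr std::uint32_t tile_grids_clamped(std::uint32_t capacity,
                                           std::uint32_t grid_threads)
{
    return std::max(capacity, grid_threads) / grid_threads;
}

// Illustrative values only: a capacity smaller than the grid gives zero tiles
// with the original expression and one tile with the clamped one.
static_assert(tile_grids_original(1024u, 4096u) == 0u, "truncates to zero");
static_assert(tile_grids_clamped(1024u, 4096u) == 1u, "clamped to one");
// When the capacity is already large enough, both expressions agree.
static_assert(tile_grids_original(8192u, 4096u) == 2u, "unchanged");
static_assert(tile_grids_clamped(8192u, 4096u) == 2u, "unchanged");
```

Note that the clamp only prevents a zero tile count; it does not enlarge the queue's actual storage, so other code that depends on m_capacity may still need checking.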
Hi there, I tried this without success and I am still getting the same error. Any other suggestions that you have would be greatly appreciated, thanks in advance!
OS: Centos7 NVCC: 10.0.130 CUDA Version: 10 GCC: 7.3.10 GPU: Tesla P100-PCIE-12GB
You could check the values of grid_threads and queue_capacity after line 234 in work_queue_multipass_inl.h and make sure they are larger than zero.
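One way to do that check is a small logging helper like the sketch below. The helper name is hypothetical and not part of NVBIO; it only prints the two values named above and reports whether both are nonzero, so where exactly it is called from is up to you.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical debugging helper, not part of NVBIO.
// Prints both values to stderr and returns true only if both are nonzero.
inline bool queue_sizes_are_valid(std::uint32_t grid_threads,
                                  std::uint32_t queue_capacity)
{
    std::fprintf(stderr, "grid_threads = %u, queue_capacity = %u\n",
                 static_cast<unsigned>(grid_threads),
                 static_cast<unsigned>(queue_capacity));
    return grid_threads > 0 && queue_capacity > 0;
}
```

If either value prints as zero, that narrows the problem to the size computation rather than the copy itself.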
info : work_queue test... started
info : testing multi-pass work-queue:
error : caught a std::runtime_error exception:
error : trivial_device_copy D->H failed: cudaErrorInvalidValue: invalid argument
OS: Amazon Linux 2 NVCC: V11.3.109 CUDA Version: 11.4 GCC: 7.3.1 GPU: Tesla T4