Open leofang opened 4 weeks ago
Can't we just say the pool is not available on Windows in TCC mode? I don't think we need to go above and beyond to support something the driver doesn't support.
It is not appropriate because CUDA does support Windows TCC mode, just not the mempool. Right now `cuda.core` is not functional at all only because I forgot (😞) mempools are not there, but we can easily provide a fallback path to make it work (by wrapping `cudaMalloc`/`cudaFree` as suggested in #208).
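To make the fallback idea concrete, here is a minimal sketch of what such a path could look like. It is not the `cuda.core` implementation: `SyncFallbackResource` and `pick_resource` are hypothetical names, and the backend calls are injected as plain callables (standing in for thin wrappers around `cudaMalloc`, `cudaFree`, and `cudaStreamSynchronize`) so the logic can be read and exercised without a GPU.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class SyncFallbackResource:
    """Hypothetical fallback that offers a stream-taking interface on top of
    synchronous allocation calls (illustration only, not a cuda.core class)."""

    malloc: Callable[[int], int]         # assumed wrapper around cudaMalloc
    free: Callable[[int], None]          # assumed wrapper around cudaFree
    sync_stream: Callable[[Any], None]   # assumed wrapper around cudaStreamSynchronize

    def allocate(self, size: int, stream: Optional[Any] = None) -> int:
        # A synchronous allocation is usable on any stream right away,
        # so the stream argument is accepted but not needed here.
        return self.malloc(size)

    def deallocate(self, ptr: int, size: int, stream: Optional[Any] = None) -> None:
        # Conservatively wait for work queued on the stream before freeing,
        # since that work may still be using the buffer.
        if stream is not None:
            self.sync_stream(stream)
        self.free(ptr)


def pick_resource(mempools_supported: bool,
                  make_async: Callable[[], Any],
                  make_sync: Callable[[], Any]) -> Any:
    """Use the stream-ordered resource when mempools exist, else the fallback."""
    return make_async() if mempools_supported else make_sync()
```

The trade-off in this sketch is that deallocation blocks on the stream, so the fallback gives up the asynchrony of the stream-ordered allocator in exchange for working where mempools are absent.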
@jrhemstad I suggest we take this seriously if we want CUDA Mode to succeed, as we have many Windows TCC users in the LLM space, and they all hit this issue (it only took me 1 min to google these):
Another reason TCC is important is that it's the default on the GHA Windows GPU runner, e.g.: https://github.com/cupy-ci-poc/cupy/actions/runs/12004661559/job/33459987560#step:5:15
This issue tracks an internal discussion with QA. This simple snippet shows why using `cuda.core` today on Windows might fail, depending on whether it's in TCC or WDDM mode:

`cuda.core` currently assumes the stream-ordered memory allocator is available. However, CUDA on Windows is a bit more complicated than on Linux, since there are two operation modes:

`cuda.core` development), things should work just fine.

We need some treatment to make it usable on TCC.
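For reference, the availability assumption can be checked at runtime before choosing an allocation path. The sketch below is an illustration rather than what `cuda.core` does: `mempools_supported` is a hypothetical helper, and it assumes the `cuda.bindings.driver` module layout of recent cuda-python releases. It queries the driver attribute `CU_DEVICE_ATTRIBUTE_MEMORY_POOLS_SUPPORTED` and returns `None` when the bindings or the driver cannot be used.

```python
from typing import Optional


def mempools_supported(device_id: int = 0) -> Optional[bool]:
    """Return True/False as reported by the driver, or None if the
    query cannot be made (no bindings, no driver, or no such device)."""
    try:
        # Assumption: cuda-python exposes the driver API under this path.
        from cuda.bindings import driver
    except ImportError:
        return None

    try:
        ok = driver.CUresult.CUDA_SUCCESS

        # Every binding call returns a tuple whose first item is the status.
        (err,) = driver.cuInit(0)
        if err != ok:
            return None

        err, dev = driver.cuDeviceGet(device_id)
        if err != ok:
            return None

        err, value = driver.cuDeviceGetAttribute(
            driver.CUdevice_attribute.CU_DEVICE_ATTRIBUTE_MEMORY_POOLS_SUPPORTED,
            dev,
        )
        if err != ok:
            return None
        return bool(value)
    except Exception:
        # For example, the driver library is missing on this machine.
        return None
```

A caller could treat `False` as the signal to take a synchronous `cudaMalloc`/`cudaFree` path, and treat `None` as "no usable CUDA device" rather than as a TCC indicator.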