Open fkendlessly opened 2 years ago
There is likely nothing I can do to make this any faster. In general, a process needs some time to initialize the GPU.
I think you can try running torch.cuda.init() and you will probably see that this operation takes ~4 seconds.
When I run torch.cuda.init(), it only takes 1e-5 seconds. Actually, I found that the slow initialization is related to estimator.py in the .../pycave/clustering/kmeans directory. I timed line 129 of estimator.py, self.trainer(max_epochs=num_epochs).fit(module, loader), which took about 4 seconds.
Can you also benchmark torch.empty(1).cuda()? I thought that torch.cuda.init() was the culprit, but I'm quite certain that the delay comes from the first interaction with the GPU (I just don't know for sure when it's happening).
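For anyone who wants to reproduce these measurements, here is a minimal sketch of how the individual calls could be timed. The helpers time_call and benchmark_gpu_startup are hypothetical names made up for this example (they are not part of PyTorch or PyCave), and the sketch assumes a PyTorch install with CUDA support. It synchronizes after the tensor transfers so that asynchronous GPU work is included in the measured time.

```python
import time


def time_call(fn, sync=None):
    """Return the wall-clock seconds taken by a single call to fn().

    If sync is given, it is called before the clock stops, so that
    asynchronous GPU work started by fn() is included in the timing.
    """
    start = time.perf_counter()
    fn()
    if sync is not None:
        sync()
    return time.perf_counter() - start


def benchmark_gpu_startup():
    """Time the calls discussed in this thread (hypothetical helper).

    Returns a dict of label -> seconds, or None if PyTorch or a CUDA
    device is not available.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    timings = {}
    timings["torch.cuda.init()"] = time_call(torch.cuda.init)
    # The first transfer is the first real interaction with the GPU.
    timings["first torch.empty(1).cuda()"] = time_call(
        lambda: torch.empty(1).cuda(), sync=torch.cuda.synchronize
    )
    # A second transfer shows the cost once the GPU is already set up.
    timings["second torch.empty(1).cuda()"] = time_call(
        lambda: torch.empty(1).cuda(), sync=torch.cuda.synchronize
    )
    return timings


if __name__ == "__main__":
    results = benchmark_gpu_startup()
    if results is None:
        print("PyTorch with CUDA is not available; nothing to benchmark.")
    else:
        for label, seconds in results.items():
            print(f"{label}: {seconds:.6f} s")
```

Running this in a fresh Python process matters, since any earlier GPU use in the same process would hide the one-time startup cost. The same time_call helper could also be wrapped around the fit call in estimator.py to compare it against these numbers.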
torch.empty(1).cuda() takes about 0.4 milliseconds.
Mh ok, interesting. I don't think it has anything to do with PyCave but I will check again. Unfortunately, I don't have direct access to a GPU at the moment.
Hi @borchero, I am using the GPU for clustering or GMM, and initialization takes a long time compared to the CPU. After executing the following code segment on an RTX 3090, GPU initialization takes about 4.1 seconds, whereas the CPU takes only about 0.17 seconds. Any suggestions for solving this problem?