CQCL / pytket-cutensornet

cuTensorNet Python API extensions for pytket quantum SDK
Apache License 2.0

[bugfix] CuTensorNetHandle failure on multiple GPUs after cuTensorNet 2.3.0 #92

Closed PabloAndresCQ closed 3 months ago

PabloAndresCQ commented 3 months ago

Description

When running any simulation method that uses CuTensorNetHandle with cuTensorNet>=2.3.0 installed, selecting a GPU device other than the default one (device=0) failed with a very obscure error.

The problem appears to be that newer versions of cuTensorNet use cupy internally, and cupy needs its device to be specified when not using the default one. We were already doing this, but the order of the commands was wrong: cutn.create(), which creates the cuTensorNet library handle, was called before updating the cupy device, causing a mismatch in the device being used.
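The ordering described above can be illustrated with a minimal sketch. This is not the code from this PR: the function name `create_handle_on_device` and its two injectable parameters are hypothetical, introduced only so the ordering can be checked without a GPU. When the parameters are omitted, the sketch assumes cupy and cuQuantum Python are installed and falls back to `cupy.cuda.Device(...).use()` and `cutn.create()`.

```python
from typing import Any, Callable, Optional


def create_handle_on_device(
    device_id: int,
    select_device: Optional[Callable[[int], None]] = None,
    create_library_handle: Optional[Callable[[], Any]] = None,
) -> Any:
    """Hypothetical helper: select the cupy device, then create the handle."""
    if select_device is None:
        import cupy as cp  # assumed to be installed

        def select_device(dev: int) -> None:
            # Make cupy use the requested device for the current thread.
            cp.cuda.Device(dev).use()

    if create_library_handle is None:
        from cuquantum import cutensornet as cutn  # assumed to be installed

        create_library_handle = cutn.create

    # Step 1: update the cupy device first.
    select_device(device_id)
    # Step 2: only now create the cuTensorNet library handle, so that it is
    # created while the intended device is the current one.
    return create_library_handle()
```

Swapping the two steps reproduces the ordering that this description identifies as the cause of the device mismatch.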

PabloAndresCQ commented 3 months ago

Automatic testing for this would be hard, since it would require a machine with multiple devices. I have tested this only in the sense that I needed the fix to run some parallel tasks on Perlmutter (a GPU cluster). I'd rather not add a test for this.