Closed PabloAndresCQ closed 3 months ago
Automatic testing for this would be hard, since it would require a machine with multiple devices. I have tested it manually, in the sense that I needed this fix in order to run some parallel tasks on Perlmutter (a GPU cluster). I'd rather not add a test for this.
Description
When running any simulation method that used the CuTensorNetHandle with cuTensorNet>=2.3.0 installed, trying to use a GPU device other than the default one (device=0) produced a very obscure error. The problem appears to be caused by newer versions of cuTensorNet making use of cupy internally, which needs its device to be specified when not using the default one. We were already doing this, but it turns out that the order of the commands was wrong: cutn.create(), which creates the cuTensorNet library handle, was called before updating the cupy device, causing a mismatch in the device being used.
Checklist
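For reference, here is a minimal sketch of the ordering described in the Description: select the cupy device first, then create the cuTensorNet handle. It is not the actual CuTensorNetHandle code. The names open_handle, _select_cupy_device and _create_cutn_handle are hypothetical, and the import path `from cuquantum import cutensornet` is an assumption that may differ between cuQuantum Python versions. The two steps are passed in as callables so the ordering can be checked on a machine without a GPU.

```python
from typing import Any, Callable


def _select_cupy_device(device_id: int) -> None:
    # Hypothetical helper. Makes `device_id` the current cupy device.
    # Imported lazily so this module can be loaded without a GPU stack.
    import cupy as cp

    cp.cuda.Device(device_id).use()


def _create_cutn_handle() -> Any:
    # Hypothetical helper. Creates the cuTensorNet library handle.
    # Assumption: this import path matches the installed cuQuantum Python.
    from cuquantum import cutensornet as cutn

    return cutn.create()


def open_handle(
    device_id: int,
    select_device: Callable[[int], None] = _select_cupy_device,
    create: Callable[[], Any] = _create_cutn_handle,
) -> Any:
    """Select the device, then create the library handle, in that order."""
    # Step 1: update the cupy device first.
    select_device(device_id)
    # Step 2: only now create the handle, so it is created with the
    # intended device already current.
    return create()
```

On a multi-GPU machine, `handle = open_handle(1)` would be the intended usage, with the handle released afterwards via `cutn.destroy(handle)`.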