Closed github-actions[bot] closed 4 months ago
@Jutho could you please push this through. Thanks
I think @lkdvos checked and noticed that we cannot simply do this without having to also update our implementation/package extension, due to breaking changes.
Sadly it's quite a bit of work since cuTENSOR has changed their interface. It's definitely somewhere on my to do list, but for now I think cuTENSOR v1 works just fine?
I believe cuTensor restricts me to GPUArrays 9 which has a memory double free issue when using multiple threads. I was hoping to update to 10 but I believe this compat is restricting me.
If its a big change dont worry my code still runs all be it with a bunch of errors printing out.
Apparently this issue can cause crashes. To be clear this only happens when using TensorOperations (and more specifically GPUArrays.jl) inside of multiple separate threads. In my case I have one thread per GPU.
I started some work on moving to the new interface. I think it should be working for plain CuArray
s, but I am still deciding on how to implement views/stridedviews, so for now that will have to wait.
If you try it out, do let me know if there are any obvious errors?
I also just noticed that cuTENSOR 2 requires julia 1.8, which I am not too happy about. I think this means we either need to keep two different versions of TensorOperations, for 1.6-1.7 with cuTENSOR 1 and for 1.8+ with cuTENSOR 2, or I would have to come up with a way of keeping the old code if julia is below 1.8. Maybe we can consider also restricting to julia 1.8, but I didn't see the need to do that here just yet
I'll test it out, thanks for making some changes!
For the record here is the issue on GPUArrays: https://github.com/JuliaGPU/GPUArrays.jl/issues/503
Awaiting the result of https://github.com/JuliaGPU/CUDA.jl/pull/2356 to simplify the implementation further.
I think this is ready to go in principle. All it requires is the tagged version of the changes in cuTENSOR, so let's wait for that and then get this merged. I think we can still do a minor release upgrade for this, but we should probably get the v5 thing started asap.
The CUDA updates have been tagged, so this is only waiting for cuTENSOR to tag its updates now.
Some comments:
The cuTENSOR updates got tagged, this is now all good to go. @Jutho , shall I merge this?
Is it ok if I try to review tonight or tomorrow morning (it's mostly a means for me to see what was changed, so that I can keep track)? If I didn't succeed by tomorrow lunch time; feel free to merge.
Is is safe to use this?
It should be; we hope to release a v5 of TensorOperations soon.
This pull request changes the compat entry for the
cuTENSOR
package from1
to1, 2
. This keeps the compat entries for earlier versions.Note: I have not tested your package with this new compat entry. It is your responsibility to make sure that your package tests pass before you merge this pull request.