Jutho / TensorOperations.jl

Julia package for tensor contractions and related operations
https://jutho.github.io/TensorOperations.jl/stable/
Other
452 stars 56 forks source link

Update to cuTENSOR 2.0 #160

Closed github-actions[bot] closed 4 months ago

github-actions[bot] commented 8 months ago

This pull request changes the compat entry for the cuTENSOR package from 1 to 1, 2. This keeps the compat entries for earlier versions.

Note: I have not tested your package with this new compat entry. It is your responsibility to make sure that your package tests pass before you merge this pull request.

ejmeitz commented 8 months ago

@Jutho could you please push this through. Thanks

Jutho commented 8 months ago

I think @lkdvos checked and noticed that we cannot simply do this without having to also update our implementation/package extension, due to breaking changes.

lkdvos commented 8 months ago

Sadly it's quite a bit of work since cuTENSOR has changed their interface. It's definitely somewhere on my to do list, but for now I think cuTENSOR v1 works just fine?

ejmeitz commented 8 months ago

I believe cuTensor restricts me to GPUArrays 9 which has a memory double free issue when using multiple threads. I was hoping to update to 10 but I believe this compat is restricting me.

If its a big change dont worry my code still runs all be it with a bunch of errors printing out.

ejmeitz commented 7 months ago

Apparently this issue can cause crashes. To be clear this only happens when using TensorOperations (and more specifically GPUArrays.jl) inside of multiple separate threads. In my case I have one thread per GPU.

image

lkdvos commented 7 months ago

I started some work on moving to the new interface. I think it should be working for plain CuArrays, but I am still deciding on how to implement views/stridedviews, so for now that will have to wait. If you try it out, do let me know if there are any obvious errors?

lkdvos commented 7 months ago

I also just noticed that cuTENSOR 2 requires julia 1.8, which I am not too happy about. I think this means we either need to keep two different versions of TensorOperations, for 1.6-1.7 with cuTENSOR 1 and for 1.8+ with cuTENSOR 2, or I would have to come up with a way of keeping the old code if julia is below 1.8. Maybe we can consider also restricting to julia 1.8, but I didn't see the need to do that here just yet

ejmeitz commented 7 months ago

I'll test it out, thanks for making some changes!

For the record here is the issue on GPUArrays: https://github.com/JuliaGPU/GPUArrays.jl/issues/503

lkdvos commented 5 months ago

Awaiting the result of https://github.com/JuliaGPU/CUDA.jl/pull/2356 to simplify the implementation further.

lkdvos commented 5 months ago

I think this is ready to go in principle. All it requires is the tagged version of the changes in cuTENSOR, so let's wait for that and then get this merged. I think we can still do a minor release upgrade for this, but we should probably get the v5 thing started asap.

lkdvos commented 4 months ago

The CUDA updates have been tagged, so this is only waiting for cuTENSOR to tag its updates now.

Some comments:

lkdvos commented 4 months ago

The cuTENSOR updates got tagged, this is now all good to go. @Jutho , shall I merge this?

Jutho commented 4 months ago

Is it ok if I try to review tonight or tomorrow morning (it's mostly a means for me to see what was changed, so that I can keep track)? If I didn't succeed by tomorrow lunch time; feel free to merge.

ejmeitz commented 3 months ago

Is is safe to use this?

Jutho commented 3 months ago

It should be; we hope to release a v5 of TensorOperations soon.