xformers for potential speedup, or torch 2.01 arguments

parkchamchi / DepthViewer

Unity program that creates 3D scenes for VR using MiDaS model

MIT License

65 stars 5 forks

xformers for potential speedup, or torch 2.01 arguments #17

Open 311-code opened 7 months ago

311-code commented 7 months ago

I read that using xformers (pip install xformers) could result in a large speedup in the Marigold and depth-anything realtime conversion. The issue is that I can't find any xformers wheel compatible with torch 2.0.1+cu117 (CUDA 11.7), and I'm not sure whether the Unity project requires that version of CUDA to work.

It seems that xformers version 22 is possibly compatible with torch 2.0.1 and CUDA 11.8.

If this doesn't work because it's too old: I read that with torch 2.0.1 you can get as much of a speedup as xformers gives by adding the --opt-sdp-attention or --opt-sdp-no-mem-attention arguments. These seem to be flags specific to automatic1111, though, so I am wondering whether the same sort of thing could be done here.

I still can't get the depth-anything model going to test it, though. Somehow threedeejay did, but he says it runs at 2 frames per second.

parkchamchi commented 7 months ago

Hmm, I wonder if dany is using or can use the half precision optimization?

A higher version of torch would work with the scripts. One thing to consider is whether the current Unity OnnxRuntime DLLs (v1.13.1) would work with other CUDA/cuDNN versions. The ORT docs list ORT v1.13.1 as requiring CUDA v11.6 and cuDNN v8.5.0.96. But given that I use it with CUDA v11.7 and cuDNN v8.2.4, and that the docs say:

Note: Because of CUDA Minor Version Compatibility, Onnx Runtime built with CUDA 11.4 should be compatible with any CUDA 11.x version. Please reference Nvidia CUDA Minor Version Compatibility.

I think it is safe to say you can upgrade your CUDA version.
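As a rough illustration of the reasoning above, here is a minimal sketch of the "same major version" rule that CUDA Minor Version Compatibility implies. The function names are hypothetical, and the check is an assumption-laden simplification: it does not cover cuDNN versions or driver requirements, which would still need to be verified separately.

```python
def parse_version(text):
    """Turn a string such as 'v11.7' or '11.6' into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("vV").split("."))


def same_cuda_major(built_with, installed):
    """Return True if both CUDA versions share a major version of 11 or later.

    Simplified reading of CUDA Minor Version Compatibility (hypothetical
    helper): a binary built against CUDA 11.x is expected to run on another
    CUDA 11.x install. cuDNN and driver constraints are NOT checked here.
    """
    built = parse_version(built_with)
    found = parse_version(installed)
    return built[0] == found[0] and built[0] >= 11


# ORT v1.13.1 is documented against CUDA 11.6; compare with other installs.
print(same_cuda_major("11.6", "11.7"))
print(same_cuda_major("11.6", "11.8"))
print(same_cuda_major("11.6", "12.1"))
```

Under this simplified rule, CUDA 11.7 and 11.8 pass against a build made for 11.6, while a CUDA 12.x install does not.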

--opt-sdp-attention or --opt-sdp-no-mem-attention

I do not know how those args would work.

311-code commented 7 months ago

I will look into this and try it out and report back.

ricardofeynman commented 7 months ago

--opt-sdp-attention or --opt-sdp-no-mem-attention

AFAIK these args are relevant only to Stable Diffusion, to speed up image generation. At least I've never encountered them in any other context.
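For context, in automatic1111 those flags select PyTorch 2.x's built-in scaled dot-product attention, so the closest equivalent outside that UI would be calling torch.nn.functional.scaled_dot_product_attention directly. The sketch below only shows the call on random tensors. The tensor sizes and the sdp_attention_demo name are made up for illustration, and whether the depth models used here would actually benefit is not established in this thread.

```python
try:
    import torch
    import torch.nn.functional as F
except ImportError:  # PyTorch is not installed; the demo then returns None
    torch = None
    F = None


def sdp_attention_demo(batch=1, heads=2, tokens=4, dim=8):
    """Run PyTorch's fused attention on random tensors (hypothetical demo).

    Returns None when PyTorch is missing or older than 2.0, where
    scaled_dot_product_attention does not exist.
    """
    if torch is None or not hasattr(F, "scaled_dot_product_attention"):
        return None
    # Query, key and value in the (batch, heads, tokens, head_dim) layout.
    q = torch.randn(batch, heads, tokens, dim)
    k = torch.randn(batch, heads, tokens, dim)
    v = torch.randn(batch, heads, tokens, dim)
    # PyTorch picks an available kernel for the device by itself.
    return F.scaled_dot_product_attention(q, k, v)


out = sdp_attention_demo()
print(None if out is None else tuple(out.shape))
```

The output has the same shape as the query tensor. A model only gains from this if its attention layers are written (or patched) to call this function.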

Migrated to Sentis From Barracuda v3.0. Now supports MiDaS v3+ and Depth-Anything models without ORT.

Where's that little mind-exploding emoji when you need it? This, good sir, is some very exciting news. Must test soon.

311-code commented 7 months ago

xformers

Threedeejay sent me this, and someone got a little further. I wonder if they just need to run pip install https://download.pytorch.org/whl/cu118/xformers-0.0.23.post1%2Bcu118-cp311-cp311-win_amd64.whl to get a compatible version?

I'm currently trying to get the Unity 2022.3.18f1 project going. With the v62 update, Meta is finally fully exposing hand tracking and other features to OpenXR for PCVR. (I have it early by opting into the public test channels on PC and in the mobile app.)

parkchamchi commented 7 months ago

If you have python 3.11 and cuda 11.8, I assume.
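The Python and CUDA requirements can be read off the wheel filename itself. Below is a small sketch that splits the filename from the URL quoted above into its standard wheel tags. parse_wheel_name is a hypothetical helper, not part of pip, and it handles only the common name-version-python-abi-platform layout (with an optional build tag).

```python
from urllib.parse import unquote


def parse_wheel_name(url_or_name):
    """Split a wheel filename (or URL ending in one) into its tags."""
    filename = unquote(url_or_name.rsplit("/", 1)[-1])  # %2B decodes to '+'
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 parts when a build tag is present
        raise ValueError("unexpected wheel name layout: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    # The part after '+' is the local version, used here for the CUDA build.
    public, _, local = parts[1].partition("+")
    return {
        "name": parts[0],
        "version": public,
        "local_version": local,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


url = ("https://download.pytorch.org/whl/cu118/"
       "xformers-0.0.23.post1%2Bcu118-cp311-cp311-win_amd64.whl")
info = parse_wheel_name(url)
print(info["python_tag"], info["local_version"], info["platform_tag"])
```

For this URL the tags are cp311 (CPython 3.11), cu118 (a CUDA 11.8 build), and win_amd64 (64-bit Windows), which matches the assumption above.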