Performance Improvement: Offloading IOAdapter Resize and FlowFormer++ Compute Weight to CUDA

coca-huang commented 6 months ago

Summary

This pull request offloads the resize operation in IOAdapter and the compute_weight operation in FlowFormer++ to CUDA whenever CUDA is available. The goal is to reduce CPU usage and use GPU acceleration to speed up processing.

Changes

Modified the resize method in IOAdapter to detect CUDA availability and execute on GPU when possible.
Updated FlowFormer++'s compute_weight function to perform its computations on CUDA instead of the CPU when CUDA is available.
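
For readers who want to see the general pattern behind these changes, here is a minimal sketch of a device-aware resize. It is not the actual IOAdapter code: `pick_device` and `resize_on_device` are hypothetical names, and the use of bilinear interpolation with `align_corners=True` is an assumption made only for illustration.

```python
import torch
import torch.nn.functional as F


def pick_device() -> torch.device:
    # Prefer CUDA when it is available, otherwise stay on the CPU.
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


def resize_on_device(images: torch.Tensor, size: tuple) -> torch.Tensor:
    # Hypothetical helper (not the ptlflow API): move the batch to the
    # selected device first so the interpolation itself runs there.
    images = images.to(pick_device())
    return F.interpolate(images, size=size, mode="bilinear", align_corners=True)


# A batch of 2 RGB frames, 48x64 pixels, resized to 24x32.
frames = torch.rand(2, 3, 48, 64)
resized = resize_on_device(frames, (24, 32))
print(resized.shape, resized.device.type)
```

On a machine without CUDA the same code runs unchanged on the CPU, which is the fallback behavior described above.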

hmorimitsu commented 6 months ago

Thank you for the PR. Could you also update the compute_weight function in the flowformer and matchflow models?

hmorimitsu commented 6 months ago

Can you also add the argument indexing="ij" in torch.meshgrid to avoid warnings?
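
To illustrate the request, here is a rough sketch of building a Gaussian patch weight directly on the target device with an explicit `indexing="ij"`. The function name `gaussian_patch_weight` and its exact formula are assumptions for illustration and may differ from the compute_weight code in the models. Passing `indexing="ij"` keeps the behavior torch.meshgrid already has by default and silences the warning about the argument becoming required in a future release.

```python
import math

import torch


def gaussian_patch_weight(patch_size, sigma=1.0, device=None):
    # Hypothetical helper, not the repository's compute_weight.
    if device is None:
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # Create the index ranges on the target device so no later copy is needed.
    rows = torch.arange(patch_size[0], device=device)
    cols = torch.arange(patch_size[1], device=device)
    # indexing="ij" gives (row, column) ordering and avoids the warning.
    h, w = torch.meshgrid(rows, cols, indexing="ij")
    # Normalize coordinates and center them on the middle of the patch.
    h = h / float(patch_size[0]) - 0.5
    w = w / float(patch_size[1]) - 0.5
    dist = (h ** 2 + w ** 2) ** 0.5 / sigma
    return torch.exp(-0.5 * dist ** 2) / (sigma * math.sqrt(2 * math.pi))


weight = gaussian_patch_weight((4, 6))
print(weight.shape, weight.device.type)
```

The resulting tensor has one weight per patch pixel, with the largest value at the patch center.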

hmorimitsu / ptlflow

Performance Improvement: Offloading IOAdapter Resize and FlowFormer++ Compute Weight to CUDA #58
