hmorimitsu / ptlflow

PyTorch Lightning Optical Flow models, scripts, and pretrained weights.
Apache License 2.0

Performance Improvement: Offloading IOAdapter Resize and FlowFormer++ Compute Weight to CUDA #58

Closed: coca-huang closed this 6 months ago

coca-huang commented 6 months ago

Summary

This pull request moves the resize operation in IOAdapter and the compute_weight operation in FlowFormer++ to CUDA whenever CUDA is available. The aim is to reduce CPU usage and to speed up processing by running both operations on the GPU.
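As a rough illustration of the idea, the sketch below picks CUDA when it is available and runs a bilinear resize on that device. The helper name resize_on_device, the tensor shapes, and the interpolation settings are assumptions made for this example; this is not the actual IOAdapter code from ptlflow.

```python
from typing import Tuple

import torch
import torch.nn.functional as F


def resize_on_device(images: torch.Tensor, size: Tuple[int, int]) -> torch.Tensor:
    """Hypothetical helper (not ptlflow's IOAdapter): resize a (B, C, H, W) batch.

    The batch is moved to CUDA when available, otherwise it stays on the CPU.
    """
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    images = images.to(device)
    # Bilinear interpolation executes on whichever device holds the tensor.
    return F.interpolate(images, size=size, mode="bilinear", align_corners=True)


# Example: upscale two random 48x64 RGB frames to 96x128.
frames = torch.rand(2, 3, 48, 64)
resized = resize_on_device(frames, (96, 128))
print(tuple(resized.shape), resized.device.type)
```

The output tensor stays on the device chosen inside the helper, so a caller that needs CPU data afterwards would have to move it back explicitly with .cpu().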

Changes

hmorimitsu commented 6 months ago

Thank you for the PR. Could you also update the compute_weight function in flowformer and matchflow models?

hmorimitsu commented 6 months ago

Can you also add the argument indexing="ij" in torch.meshgrid to avoid warnings?
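For reference, here is a minimal sketch of the requested call style. Recent PyTorch releases warn when torch.meshgrid is called without an explicit indexing argument, and passing indexing="ij" keeps the row-major behavior while silencing that warning. The helper name make_coords and the (x, y) channel order are assumptions for illustration, not the code in the FlowFormer++ model.

```python
import torch


def make_coords(height: int, width: int, device: str = "cpu") -> torch.Tensor:
    """Hypothetical helper: build a (2, H, W) grid of pixel coordinates."""
    ys = torch.arange(height, device=device, dtype=torch.float32)
    xs = torch.arange(width, device=device, dtype=torch.float32)
    # indexing="ij" makes the first output vary along rows (y) and the
    # second along columns (x); both outputs have shape (H, W).
    grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")
    # Channel 0 holds x coordinates, channel 1 holds y coordinates.
    return torch.stack([grid_x, grid_y], dim=0)


coords = make_coords(3, 4)
print(tuple(coords.shape))
```

Creating the aranges directly on the target device means the grid is built there as well, which fits the goal of keeping this work off the CPU when CUDA is in use.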