HKBU-HPML / FADNet


FADNet is extremely slow despite enabling GPU #22

Open Genozen opened 1 year ago

Genozen commented 1 year ago

Settings:
- Device: Jetson Xavier NX
- GPU: NVIDIA Volta
- Power mode: 20W, 6-core
- PyTorch: 1.13
- CUDA: 11.4

——————

We were able to run FADNet offline with the sample dataset, but it was extremely slow (4-5 frames per second) at an input resolution of 512x256.

We’d like to know what the bottleneck is that makes it run this slowly. FADNet claims to be fast and accurate, so we must have done something wrong…

I can provide the code if needed, but first I just want to get a general idea of how fast FADNet can actually run under ideal conditions.
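For reference, here is a minimal sketch of how we time the forward pass. The stand-in model and the 6-channel input layout are placeholders, not FADNet's actual API; on CUDA, `torch.cuda.synchronize()` is needed for meaningful numbers:

```python
import time
import torch
import torch.nn as nn

# Hypothetical stand-in for FADNet; substitute the real model and checkpoint.
model = nn.Sequential(
    nn.Conv2d(6, 32, 3, padding=1),
    nn.Conv2d(32, 1, 3, padding=1),
).eval().cuda()

# One 512x256 stereo pair, left/right images concatenated on the channel
# axis (assumed layout).
x = torch.randn(1, 6, 256, 512, device="cuda")

with torch.no_grad():
    for _ in range(10):           # warm-up iterations, excluded from timing
        model(x)
    torch.cuda.synchronize()      # drain queued GPU work before starting the clock
    start = time.time()
    n = 100
    for _ in range(n):
        model(x)
    torch.cuda.synchronize()      # ensure all kernels finished before stopping the clock
    fps = n / (time.time() - start)

print(f"{fps:.1f} frames per second")
```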

—————

blackjack2015 commented 1 year ago

Dear Genozen,

Thanks for your interest. Note that our FADNet only runs as fast as ~15 fps even on a desktop/server GPU. On mobile GPUs like Jetson devices, the code can be very slow without optimization. Tools such as TensorRT might help. Besides, you can also try our latest FADNet++, which has different variants for different computing devices.

https://github.com/HKBU-HPML/FADNet-PP

We successfully ran a tiny version of FADNet on a Jetson AGX, achieving an EPE of 1.19 at 12 fps.
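As a rough illustration of the TensorRT route, the usual path is to export the model to ONNX and build an engine with trtexec. The stand-in model, file names, and input shape below are illustrative assumptions, not FADNet's actual loading code:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in; replace with the real FADNet construction
# and checkpoint loading.
model = nn.Sequential(nn.Conv2d(6, 1, 3, padding=1)).eval().cuda()

# Assumed input: one 512x256 stereo pair with the left/right images
# concatenated along the channel axis.
dummy = torch.randn(1, 6, 256, 512, device="cuda")

# Export a fixed-shape ONNX graph that TensorRT can consume.
torch.onnx.export(
    model, dummy, "fadnet.onnx",
    input_names=["stereo_pair"], output_names=["disparity"],
    opset_version=12,
)

# On the Jetson, build and benchmark an FP16 engine with the TensorRT CLI:
#   trtexec --onnx=fadnet.onnx --saveEngine=fadnet.trt --fp16
```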

Best regards, Qiang Wang

Genozen commented 1 year ago

Hi, Qiang Wang. Thanks for your quick reply.

We actually tried FADNet++ as well, but the pre-trained model posted there was still the original FADNet. Any chance you could update the posted pre-trained model? We would really love to try it out.