DerryHub / BEVFormer_tensorrt

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).
Apache License 2.0

benchmark on orin #75

Open shuyuan-wang opened 1 year ago

shuyuan-wang commented 1 year ago

Has anyone successfully deployed this on Orin? What's the inference time like?

jongsik-moon commented 1 year ago

I've deployed this model on Xavier AGX. It runs at about 1 FPS for the FP16 small model.

superpigforever commented 1 year ago

> I've deployed this model on Xavier AGX. It runs at about 1 FPS for the FP16 small model.

Well, this looks slow. Have you tried the INT8 version?

jongsik-moon commented 1 year ago

I've tried the tiny model, but failed to convert it into a TensorRT engine. I'm waiting for JetPack 6 so I can use TensorRT 8.6. On an x86 machine, the tiny+INT8 model is about 5x faster than the small+FP16 model, so you can roughly guess the FPS.
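For context, here is a minimal sketch of what an INT8 engine build looks like with the TensorRT Python API (TensorRT >= 8.4). The plugin library path, ONNX file name, and calibrator class below are assumptions for illustration, not this repo's actual conversion script, which handles calibration for you:

```python
import ctypes
import tensorrt as trt

# Paths below are assumptions; adjust to your local plugin build and ONNX export.
ctypes.CDLL("./libtensorrt_ops.so")          # custom plugin library built from this repo
logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, "")

builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("bevformer_tiny.onnx", "rb") as f:  # hypothetical ONNX export
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 4 << 30)  # TRT >= 8.4
config.set_flag(trt.BuilderFlag.FP16)
config.set_flag(trt.BuilderFlag.INT8)
# Real INT8 accuracy needs a calibrator fed with representative frames:
# config.int8_calibrator = MyEntropyCalibrator(...)  # hypothetical class

engine_bytes = builder.build_serialized_network(network, config)
with open("bevformer_tiny_int8.engine", "wb") as f:
    f.write(engine_bytes)
```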

superpigforever commented 1 year ago

> I've tried the tiny model, but failed to convert it into a TensorRT engine. I'm waiting for JetPack 6 so I can use TensorRT 8.6. On an x86 machine, the tiny+INT8 model is about 5x faster than the small+FP16 model, so you can roughly guess the FPS.

Thanks

xiaohangimg commented 2 months ago

> Has anyone successfully deployed this on Orin? What's the inference time like?

I successfully deployed BEVFormer-tiny (INT8) on an Orin NX (100 TOPS). Based on my measurements, using the custom operators in this project, the model inference runs at approximately 10 Hz, but including the pre- and post-processing it drops to 5 Hz. Do you have any references or insights on why the inference speed is this slow? Could it be related to the transformer architecture? I look forward to your suggestions!
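One way to check whether the bottleneck is the engine or the CPU-side pre-/post-processing is to time the TensorRT execution context alone and compare it against the full-pipeline time. A minimal sketch, assuming a static-shape engine (the engine path is a placeholder):

```python
import time
import numpy as np
import tensorrt as trt
import pycuda.autoinit            # creates a CUDA context
import pycuda.driver as cuda

logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")

# Engine path is an assumption; use whatever engine you built on the Orin.
with open("bevformer_tiny_int8.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate one device buffer per binding, sized from the engine itself
# (assumes static shapes, i.e. no -1 dimensions).
bindings = []
for i in range(engine.num_bindings):
    shape = engine.get_binding_shape(i)
    dtype = np.dtype(trt.nptype(engine.get_binding_dtype(i)))
    bindings.append(int(cuda.mem_alloc(trt.volume(shape) * dtype.itemsize)))

# Warm up, then time the engine alone.
for _ in range(10):
    context.execute_v2(bindings)

n = 100
t0 = time.perf_counter()
for _ in range(n):
    context.execute_v2(bindings)  # synchronous call
t1 = time.perf_counter()
print(f"engine-only latency: {(t1 - t0) / n * 1000:.2f} ms/frame")
```

If the engine-only number is already close to 100 ms, the network itself is the limit; if it is much lower, the CPU-side pre-/post-processing is where the 10 Hz to 5 Hz drop comes from.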

leaves369 commented 6 days ago

> I've deployed this model on Xavier AGX. It runs at about 1 FPS for the FP16 small model.

Hi, how did you deploy on Xavier AGX? Did you convert to TensorRT on x86 and then test on the AGX successfully?