How to get the best performance when optimizing with torch_blade

alibaba / BladeDISC

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

Apache License 2.0


How to get the best performance when optimizing with torch_blade #1294

Open JackWeiw opened 2 months ago

JackWeiw commented 2 months ago

My script compares Torch with Disc.

However, according to profiling, Disc gives only a small speedup over Torch, and in some parts of the dynamic range it is even slower than Torch. Is there something I am missing to get optimal performance from Disc?

Attached profiling results: h5120_i13824_n40_2, h4096_i16384_n32_2, h4096_i11008_n32_2
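For readers who want to reproduce this kind of comparison, here is a minimal sketch of a timing harness that measures a baseline and an optimized callable over a range of input sizes. It is not the script from this issue. The names `benchmark`, `speedup`, `eager`, and `optimized` are hypothetical, and the stand-in workloads are plain Python so that the sketch runs without a GPU. It assumes that warm-up runs and an explicit synchronization hook are needed for a fair measurement, since a compiler for dynamic shapes may do extra work the first time it sees a new shape.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, Optional


def benchmark(fn: Callable[[int], object],
              sizes: Iterable[int],
              warmup: int = 3,
              iters: int = 10,
              sync: Optional[Callable[[], None]] = None) -> Dict[int, float]:
    """Return the median latency in milliseconds of fn(size) for each size."""
    results = {}
    for size in sizes:
        # Warm-up runs absorb one-time costs such as compilation for a new shape.
        for _ in range(warmup):
            fn(size)
        if sync is not None:
            sync()
        samples = []
        for _ in range(iters):
            start = time.perf_counter()
            fn(size)
            if sync is not None:
                sync()  # wait for asynchronous device work before stopping the clock
            samples.append((time.perf_counter() - start) * 1e3)
        results[size] = statistics.median(samples)
    return results


def speedup(baseline: Dict[int, float],
            candidate: Dict[int, float]) -> Dict[int, float]:
    """Per-size ratio baseline / candidate; above 1.0 means candidate is faster."""
    return {size: baseline[size] / candidate[size] for size in baseline}


if __name__ == "__main__":
    # Hypothetical stand-ins: replace with the eager and the optimized model calls.
    def eager(n: int) -> int:
        return sum(i * i for i in range(n))

    def optimized(n: int) -> int:
        return sum(i * i for i in range(n // 2))

    sizes = [1000, 4000, 16000]
    base = benchmark(eager, sizes)
    cand = benchmark(optimized, sizes)
    for size, ratio in speedup(base, cand).items():
        print(f"size={size}: speedup={ratio:.2f}x")
```

When timing PyTorch models on a GPU, one would typically pass `sync=torch.cuda.synchronize` so that asynchronous kernels are included in each sample. Reporting the ratio per size, rather than one average, makes it easier to see in which part of the dynamic range the optimized model falls behind.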

JackWeiw commented 2 months ago

I updated my script to follow the Disc Torch inference examples, and another problem occurred.

Your kind help is much appreciated! @Yancey1989 @eedalong

JackWeiw commented 1 month ago

I passed the half-precision model to blade_disc; however, the optimized model saved by blade_disc is fp32. How come?