google / gemmlowp

Low-precision matrix multiplication
Apache License 2.0

Add a stage truncating int32 to uint8. #173

Closed alexfru closed 5 years ago

alexfru commented 5 years ago

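This stage can save time when used in place of the OutputStageSaturatingCastToUint8 stage immediately after the OutputStageClamp stage. Provided the clamp bounds lie within [0, 255], the clamped values already fit in uint8, so the saturation is redundant and a plain truncating cast gives the same result.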
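To illustrate why the swap should be safe, here is a minimal scalar sketch. It does not use gemmlowp's actual output-pipeline code: `Clamp`, `SaturatingCastToUint8`, `TruncatingCastToUint8`, and `CastsAgreeAfterClamp` are hypothetical stand-ins written for this example, and the sampled input range is arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar stand-in for a clamp stage: bounds x to [lo, hi].
inline std::int32_t Clamp(std::int32_t x, std::int32_t lo, std::int32_t hi) {
  assert(lo <= hi);
  return std::min(std::max(x, lo), hi);
}

// Saturating cast: values below 0 become 0, values above 255 become 255.
inline std::uint8_t SaturatingCastToUint8(std::int32_t x) {
  const std::int32_t bounded =
      std::min<std::int32_t>(std::max<std::int32_t>(x, 0), 255);
  return static_cast<std::uint8_t>(bounded);
}

// Truncating cast: keeps only the low 8 bits, with no range check.
inline std::uint8_t TruncatingCastToUint8(std::int32_t x) {
  return static_cast<std::uint8_t>(x);
}

// Returns true if both casts give the same result for every sampled input
// once the input has been clamped to [lo, hi].
inline bool CastsAgreeAfterClamp(std::int32_t lo, std::int32_t hi) {
  for (std::int32_t x = -100000; x <= 100000; ++x) {
    const std::int32_t clamped = Clamp(x, lo, hi);
    if (SaturatingCastToUint8(clamped) != TruncatingCastToUint8(clamped)) {
      return false;
    }
  }
  return true;
}
```

The two casts agree only while the clamp bounds stay inside [0, 255]. With wider bounds, the truncating cast wraps around instead of saturating, so the saturating stage would still be needed in that case.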