ezyang opened 3 years ago
They have an updated PLDI'21 paper https://arxiv.org/pdf/2104.04043.pdf which extends their technique to work on float32. Anywhere in PyTorch where we are using libm for floating-point operations (I'm actually not sure exactly what we are using), we should use rlibm-32 instead. The paper also gives us a path to emulating posits on CPU (@wickedfoo you might like that!)
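For concreteness, here's a minimal sketch of what such a swap could look like at one call site. Note that `rlibm32_logf` and the `USE_RLIBM32` flag are placeholder names I made up for illustration, not symbols taken from either repo; the real exported prototypes would need to be checked against rlibm-32's headers.

```cpp
#include <cmath>

// Hypothetical rlibm-32 prototype (assumption, not the library's real API):
// a correctly rounded float32 natural log.
extern "C" float rlibm32_logf(float x);

float pytorch_log(float x) {
#ifdef USE_RLIBM32
  return rlibm32_logf(x);  // correctly rounded for all float32 inputs
#else
  return std::log(x);      // libm: typically close to 1 ulp, but not guaranteed
#endif
}
```

In practice the swap would presumably happen once, in whatever central math header the kernels dispatch through, rather than at each call site.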
There seem to be two libraries: https://github.com/rutgers-apl/rlibm and https://github.com/rutgers-apl/rlibm-32. The first one has some float16- and bfloat16-optimized functions (including sqrt), while the second iteration focuses on float32 and posits.
Also, although the implementations compare well to other precise implementations (like libm), they still do "slowish" operations internally (including conversion to double precision), so they may not compare as well to libraries that are OK with more ulps of precision loss.
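To illustrate where that cost comes from, here's a simplified sketch of the general pattern: evaluate a polynomial approximation in double precision, then round to float32 exactly once at the end. The coefficients below are just the Taylor expansion of 2^x (not rlibm's actual generated polynomial), and range reduction and special-case handling are omitted, so this is illustrative only and is not correctly rounded.

```cpp
float approx_exp2_f32(float x) {
  // Widen to double so intermediate rounding cannot disturb the single
  // float32 rounding decision at the end. This widening/narrowing is part
  // of the per-call cost being discussed above.
  double xd = static_cast<double>(x);

  // Degree-5 Taylor polynomial for 2^x, valid only on a reduced range
  // like [0, 1); placeholder coefficients, evaluated in Horner form.
  double r = 0.0013333558146428443;   // (ln 2)^5 / 120
  r = r * xd + 0.009618129107628477;  // (ln 2)^4 / 24
  r = r * xd + 0.05550410866482158;   // (ln 2)^3 / 6
  r = r * xd + 0.24022650695910072;   // (ln 2)^2 / 2
  r = r * xd + 0.6931471805599453;    // ln 2
  r = r * xd + 1.0;

  // Single narrowing round: double -> float.
  return static_cast<float>(r);
}
```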
It looks like they are faster and more accurate. The paper describing the approach is at https://arxiv.org/pdf/2007.05344.pdf