shibatch / sleef

SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
https://sleef.org
Boost Software License 1.0

[bug] Sleef_{sinhf, coshf}8_u10 doesn't match std::{sinh, cosh} #364

Open kshitij12345 opened 3 years ago

kshitij12345 commented 3 years ago
#include <x86intrin.h>
#include <cmath>
#include <iostream>
#include <string>   // std::string is used by print_vec
#include <sleef.h>

void print_vec(__m256 vec, std::string msg){
    std::cout << msg << ":";
    std::cout << vec[0] << ",";
    std::cout << vec[1] << ",";
    std::cout << vec[2] << ",";
    std::cout << vec[3] << ",";
    std::cout << vec[4] << ",";
    std::cout << vec[5] << ",";
    std::cout << vec[6] << ",";
    std::cout << vec[7] << "\n";
}

void print_vec(__m256d vec, std::string msg){
    std::cout << msg << ":";
    std::cout << vec[0] << ",";
    std::cout << vec[1] << ",";
    std::cout << vec[2] << ",";
    std::cout << vec[3] << "\n";
}

__m256 simd_sinh(__m256 values){
    return Sleef_sinhf8_u10(values);
}

int main(){
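    // Float literals, so the SIMD inputs are exactly the values passed to std::sinh below.
    __m256 vec = {88.f, 89.f, 89.1f, 89.4f, 89.5f, 90.f, 91.f, 92.f};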
    print_vec(simd_sinh(vec), "SIMD");
    std::cout << "NON-VECTORIZED " << std::sinh(88.f) << ",";
    std::cout << std::sinh(89.f)  << ",";
    std::cout << std::sinh(89.1f)  << ",";
    std::cout << std::sinh(89.4f)  << ",";
    std::cout << std::sinh(89.5f)  << ",";
    std::cout << std::sinh(90.f)  << ",";
    std::cout << std::sinh(91.f)  << ",";
    std::cout << std::sinh(92.f)  << "\n";
    return 0;
}

Output:

SIMD:8.25818e+37,inf,inf,inf,inf,inf,inf,inf
NON-VECTORIZED 8.25818e+37,2.24481e+38,2.48089e+38,3.34886e+38,inf,inf,inf,inf

kshitij12345 commented 3 years ago

Reference: https://github.com/pytorch/pytorch/issues/48641

shibatch commented 3 years ago

As described in the specification, the valid input domain for the sinhf function in SLEEF is [-88.5, 88.5]. I will fix it if this is really a problem in practical usage.

https://sleef.org/purec.xhtml#Sleef_sinh_u10
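
For context on the gap between the documented domain and the scalar results above: for large x, sinh(x) is approximately exp(x) / 2, so a single-precision result should stay finite until x exceeds roughly log(2 * FLT_MAX), about 89.416. The sketch below uses only the C++ standard library and makes no SLEEF calls. The helper names are hypothetical, and the threshold is an approximation rather than an exact rounding boundary.

```cpp
#include <cfloat>
#include <cmath>

// Approximate largest |x| for which sinh(x) still fits in a finite float,
// using sinh(x) ~ exp(x) / 2 for large x. Evaluated in double as
// log(2) + log(FLT_MAX) so that 2 * FLT_MAX is never formed.
double float_sinh_overflow_threshold() {
    return std::log(2.0) + std::log(static_cast<double>(FLT_MAX));
}

// Hypothetical helper: true when a float sinh(x) is expected to be finite.
// Inputs very close to the threshold may round either way.
bool float_sinh_is_finite(float x) {
    return std::fabs(static_cast<double>(x)) < float_sinh_overflow_threshold();
}
```

Under this estimate, 89.4f is still inside the finite range and 89.5f is not, which is consistent with the NON-VECTORIZED output in the report. The documented SLEEF domain of [-88.5, 88.5] is narrower than that range.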

kshitij12345 commented 3 years ago

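Ah, right. Sorry, I didn't notice that.

As for practical usage: in PyTorch, it feels odd that the result for the same input changes depending on whether the vectorized (SLEEF) path is taken. With 15 elements the result is finite; with 16 elements it is inf.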

>>> import torch
>>> a = torch.tensor([89.] * 15)
>>> a.sinh()
tensor([2.2448e+38, 2.2448e+38, 2.2448e+38, 2.2448e+38, 2.2448e+38, 2.2448e+38,
        2.2448e+38, 2.2448e+38, 2.2448e+38, 2.2448e+38, 2.2448e+38, 2.2448e+38,
        2.2448e+38, 2.2448e+38, 2.2448e+38])
>>> a = torch.tensor([89.] * 16)
>>> a.sinh()
tensor([inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf])
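
One plausible reading of the output above is that full blocks of elements go through the vector kernel and any remainder goes through scalar std::sinh. The sketch below illustrates that pattern under stated assumptions: the block size of 16 is chosen only because it matches the 15 vs. 16 behaviour shown here, and kernel_sinh_standin is a hypothetical stand-in that mimics the observed result (inf beyond the documented domain). It is not SLEEF's or PyTorch's actual code.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a vector kernel lane. It reproduces only the
// observed behaviour (inf outside [-88.5, 88.5]), not SLEEF's algorithm.
float kernel_sinh_standin(float x) {
    constexpr float kDomainMax = 88.5f;
    if (std::fabs(x) > kDomainMax) {
        return std::copysign(std::numeric_limits<float>::infinity(), x);
    }
    return std::sinh(x);
}

// Assumed dispatch: full blocks of kBlock elements use the kernel,
// the remaining tail elements use scalar std::sinh.
std::vector<float> blocked_sinh(const std::vector<float>& in) {
    constexpr std::size_t kBlock = 16;  // assumption, chosen to match the report
    std::vector<float> out(in.size());
    std::size_t i = 0;
    for (; i + kBlock <= in.size(); i += kBlock) {
        for (std::size_t j = 0; j < kBlock; ++j) {
            out[i + j] = kernel_sinh_standin(in[i + j]);
        }
    }
    for (; i < in.size(); ++i) {
        out[i] = std::sinh(in[i]);  // scalar tail
    }
    return out;
}
```

Calling blocked_sinh on 15 copies of 89.f takes only the scalar tail and returns finite values, while 16 copies fill one block and return inf, so the same input value produces different results depending on the tensor length.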

shibatch commented 3 years ago

The values returned by SLEEF functions may differ slightly within the specified error range, even if only a vectorized path is taken.

kshitij12345 commented 3 years ago

The same happens with cosh.

As per the docs, the domain is indeed [-88.5, 88.5].

#include <x86intrin.h>
#include <cmath>
#include <iostream>
#include <string>   // std::string is used by print_vec
#include <sleef.h>

void print_vec(__m256 vec, std::string msg){
    std::cout << msg << ":";
    std::cout << vec[0] << ",";
    std::cout << vec[1] << ",";
    std::cout << vec[2] << ",";
    std::cout << vec[3] << ",";
    std::cout << vec[4] << ",";
    std::cout << vec[5] << ",";
    std::cout << vec[6] << ",";
    std::cout << vec[7] << "\n";
}

void print_vec(__m256d vec, std::string msg){
    std::cout << msg << ":";
    std::cout << vec[0] << ",";
    std::cout << vec[1] << ",";
    std::cout << vec[2] << ",";
    std::cout << vec[3] << "\n";
}

__m256 simd_cosh(__m256 values){
    return Sleef_coshf8_u10(values);
}

int main(){
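    // Float literals, so the SIMD inputs are exactly the values passed to std::cosh below.
    __m256 vec = {88.f, 89.f, 89.1f, 89.4f, 89.5f, 90.f, 91.f, 92.f};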
    print_vec(simd_cosh(vec), "SIMD");
    std::cout << "NON-VECTORIZED " << std::cosh(88.f) << ",";
    std::cout << std::cosh(89.f)  << ",";
    std::cout << std::cosh(89.1f)  << ",";
    std::cout << std::cosh(89.4f)  << ",";
    std::cout << std::cosh(89.5f)  << ",";
    std::cout << std::cosh(90.f)  << ",";
    std::cout << std::cosh(91.f)  << ",";
    std::cout << std::cosh(92.f)  << "\n";
    return 0;
}

Output:

SIMD:8.25818e+37,inf,inf,inf,inf,inf,inf,inf
NON-VECTORIZED 8.25818e+37,2.24481e+38,2.48089e+38,3.34886e+38,inf,inf,inf,inf

kshitij12345 commented 3 years ago

> The values returned by SLEEF functions may differ slightly within the specified error range, even if only a vectorized path is taken.

Right. I will check with someone from PyTorch core about this.

Thank you for the quick and helpful response.