valyala / fasthttp

Fast HTTP package for Go. Tuned for high performance. Zero memory allocations in hot paths. Up to 10x faster than net/http
MIT License
21.91k stars 1.76k forks source link

Improve performance of ParseUfloat #1865

Closed ksw2000 closed 2 months ago

ksw2000 commented 2 months ago

Replaced offset handling logic with more efficient math.Pow10 based calculation.

The math.Pow10 function performs calculations using a lookup table, which is more efficient compared to the original method of continuous division. By switching to simple addition and subtraction operations, the execution time is reduced by approximately 30%.

goos: linux
goarch: amd64
pkg: github.com/valyala/fasthttp
cpu: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
              │   old.txt   │           new.txt           │
              │   sec/op    │   sec/op     vs base        │
ParseUfloat-8   44.22n ± 0%   31.06n ± 0%  -29.76% (n=50)
erikdubbelboer commented 2 months ago

This is the first time I'm noticing that our ParseUfloat actually returns the wrong values for very high numbers. Even though it's faster than strconv.ParseFloat. I still think we should switch to strconv.ParseFloat as being incorrect like this is very bad. What do you think?

1234e23
ParseUfloat:        123399999999999981930938368.000000
strconv.ParseFloat: 123399999999999999110807552.000000

1.234e+32
ParseUfloat:        123399999999999984608249181634560.000000
strconv.ParseFloat: 123400000000000002622647691116544.000000

123456789123456789.987654321
ParseUfloat:        17502027876.294655
strconv.ParseFloat: 123456789123456784.000000
ksw2000 commented 2 months ago

Actually, I encountered this problem as well. I believe the reason is that we use a uint64 variable to represent a floating-point number. The largest value of uint64 is 1.8×10¹⁹. Therefore, the third example you provided, 123456789123456789.987654321, which has 27 digits, causes the uint64 value to overflow.

erikdubbelboer commented 2 months ago

This is currently not documented at all. I'm thinking I would rather have the function return the correct values even for big numbers than have it be 20ns faster. Just using strconv.ParseFloat would also result in less code to maintain. What do you think?

ksw2000 commented 2 months ago

Parse the float string is a complex topic. In strconv.ParseFloat, they use IEEE754 unbiased rounding. In our implementation, we directly truncate the digits after the 19th position. Although our implementation doesn't achieve the same high precision as strconv.ParseFloat, the error is within an acceptable range for network applications. I think we can add the comments that our ParseUfloat is much faster than strconv.ParseFloat, but the results are less accurate.

goos: windows
goarch: amd64
pkg: github.com/valyala/fasthttp
cpu: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
BenchmarkParseUfloat-8                  13448752                78.83 ns/op
BenchmarkParseUfloatStrconv-8            4897358               248.1 ns/op
PASS
ok      github.com/valyala/fasthttp     2.738s
1234e23
ParseUfloat:        1.2339999999999998e+26
strconv.ParseFloat: 1.234e+26

1.234e+32
ParseUfloat:        1.2339999999999998e+32
strconv.ParseFloat: 1.234e+32

123456789123456789.987654321
ParseUfloat:        1.2345678912345678e+17
strconv.ParseFloat: 1.2345678912345678e+17
erikdubbelboer commented 2 months ago

There are other inconsistencies as well. For example this test fails because it returns 0 instead of 1 like strconv.ParseFloat does.

testParseUfloatSuccess(t, "00000000000000000001", 1)

Or parsing 87000000000.2 results in 87000000000.200012 while strconv.ParseFloat returns 87000000000.199997 which is a bigger difference for such a low number than I would like.

ksw2000 commented 2 months ago

In the first case 00000000000000000001, we can add some code to remove the leading zeros. As for the second case 87000000000.2, the relative error between ParseUfloat and strconv.ParseFloat is only 1.75e-16.

ksw2000 commented 2 months ago

The second case is very strange. Our algorithm calculates using float64(870000000002) * math.Pow10(-1), but if we change to using division, the result will be the same as strconv.ParseFloat.

fmt.Printf("%f %f\n", float64(870000000002)*math.Pow10(-1), float64(870000000002)/math.Pow10(1))
87000000000.200012 87000000000.199997
erikdubbelboer commented 2 months ago

71000000000000.1 results in a 1.562500e-02 relative difference. I'm trying to fuzz test the implementations against each other and I keep finding differences. There is also how they handle numbers with E or X. Or strconv.ParseFloat sometimes returning NaN instead of an error. There are so many differences that I really don't think it's worth the 20ns (nanoseconds) overhead. Unless someone is calling this function billions of times it's not going to make a difference. And if someone does that it's much better for them to use a more specific version in their own codebase.

ksw2000 commented 2 months ago

The example you provided, using Relative Error = (expected value - actual value) / actual value, the error is indeed very small, all below 10^-15. However, if our function cannot perfectly handle special cases like NaN, I agree with your suggestion. Perhaps we should consider directly calling strconv.ParseFloat, or rewriting our function based on strconv.ParseFloat. What do you think?

erikdubbelboer commented 2 months ago

Lets just call strconv.ParseFloat and include some error handling for negative numbers to keep that backwards compatible.

erikdubbelboer commented 2 months ago

Thanks!