aldanor / fast-float-rust

Super-fast float parser in Rust (now part of Rust core)
https://docs.rs/fast-float
Apache License 2.0
275 stars 20 forks source link

Support for JSON Numbers #25

Open Alexhuszagh opened 3 years ago

Alexhuszagh commented 3 years ago

Adds support for #17.

This probably needs significant edits prior to merging, but the benchmarks with the primitive tokenizing (it doesn't use an optimized memchr) still are pretty good (it also has to duplicate a lot of work when parsing, so this isn't really that surprising):

$ /home/ahuszagh/git/fast-float-rust/target/release/fast-float-simple-bench file ext/data/canada.txt 
=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            25.84    26.30    27.14    27.64    28.17    29.85    40.15 |
| fast-float-tokenized    44.33    45.83    46.25    46.71    47.36    49.06    88.16 |
| lexical               74.67    76.35    76.96    77.64    78.73    81.79   132.29 |
| from_str             218.95   220.82   222.62   224.34   226.81   237.21   407.54 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            24.90    33.59    35.51    36.18    36.85    38.02    38.69 |
| fast-float-tokenized    11.34    20.38    21.11    21.41    21.62    21.82    22.56 |
| lexical                7.56    12.23    12.70    12.88    12.99    13.10    13.39 |
| from_str               2.45     4.22     4.41     4.46     4.49     4.53     4.57 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           433.38   584.58   618.00   629.61   641.32   661.66   673.33 |
| fast-float-tokenized   197.39   354.72   367.43   372.58   376.30   379.67   392.55 |
| lexical              131.54   212.84   221.04   224.14   226.11   227.91   233.05 |
| from_str              42.70    73.39    76.73    77.57    78.17    78.81    79.47 |
|                                                                                   |
=====================================================================================

I've added tests to ensure the differences in parsing don't lead to correctness issues, as seen here.

The following methods have been added:

pub trait FastFloat {
    fn parse_from_parts<S: AsRef<[u8]>>(integral: S, fractional: S, exponent: i64, negative: bool) -> Self;
}

pub fn parse_from_parts<T: FastFloat, S: AsRef<[u8]>>(integral: S, fractional: S, exponent: i64, negative: bool) -> T;

In order to share more code, parse_long_mantissa has been changed to accept a Decimal by-value, and the two internal functions have been added:

pub fn parse_number_from_parts(i: &[u8], f: &[u8], e: i64, negative: bool) -> Option<Number>;
pub fn parse_decimal_from_parts(mut i: &[u8], mut f: &[u8], e: i64, negative: bool) -> Decimal;

Any feedback would be great, I'm probably going to need to refactor a bit to increase code re-use. The API can therefore be used like this:

// This is equivalent to parse("-42823146028335318693e-128")
let f: f64 = parse_from_parts("42823146028335318693", "", -128, true);

If an invalid character is found, it merely stops parsing early: no warning is given for invalid input, but it will not undergo any unsafe behavior. It assumes the input is valid, and the documentation clearly reflects this:

/// Parse a pre-tokenized decimal number from string into float.
///
/// This assumes the float has already been tokenized into valid
/// integral and fractional components, and has parsed an optional
/// exponent notation.
///
/// It is up to you to validate and tokenize the input: although
/// this will not error, this might truncate the significant
/// digits as soon as an invalid digit is found. This does not
/// handle special values, such as NaN, INF, or Infinity.
#[inline]
pub fn parse_from_parts<T: FastFloat, S: AsRef<[u8]>>(integral: S, fractional: S, exponent: i64, negative: bool) -> T;
Alexhuszagh commented 3 years ago

I've updated the PR to do the following:

This should be a much more satisfactory PR right now. The amount of additional code is much smaller, and mostly in unavoidable places.

aldanor commented 3 years ago

Hi @Alexhuszagh - thanks for taking time in getting a stab at this. I think the idea is good (and surely +1 for the pre-RFC - let's see how it will be received...).

Before we get into code structure discussions etc though, I've tried running the bench suite a few times for this branch and master, results seems to be fairly consistent. In particular for mesh.txt and canada.txt (the latter probably the most representative of what the most 'common' floats to be parsed would look like), there seems to be a ~7% performance drop which is outside of noise bounds.

There might be bias because the master branch code is heavily hand-tuned towards high performance on these benchmarks (on my machine in particular), but, on the other hand, medians outside of IQR signify that it's not just noise. Would you perhaps care to run these a few times on your machine to see how the numbers look like? If it's significant indeed, we should probably investigate what it's caused by? Because the code is logically identical, but the order of things may be slightly changed.

master branch:

| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| mesh.txt              10.41    10.43    10.49    10.84    11.67    12.89    16.63 |
| canada.txt            21.77    21.83    22.57    23.41    24.41    26.21    33.01 |
| uniform               19.56    19.82    20.93    22.75    26.15    31.25    43.28 |
| one_over_rand32       29.24    29.32    30.64    31.83    33.54    37.35    56.22 |
| simple_uniform32      19.53    19.58    20.03    20.82    22.16    24.80    45.92 |
| simple_int32          15.89    15.92    16.20    17.20    18.22    19.93    31.27 |
| int_e_int             27.28    27.38    28.40    29.80    31.39    36.58    51.47 |
| simple_int64          25.68    25.73    26.61    27.62    29.14    31.91    52.64 |
| bigint_int_dot_int    40.76    40.86    41.89    43.28    45.07    48.16    58.14 |
| big_ints              83.95    85.00    87.83    90.10    92.70    98.71   117.15 |

this branch:

| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| mesh.txt              11.07    11.15    11.25    11.62    12.34    13.67    22.26 |
| canada.txt            23.44    23.52    24.33    25.13    26.17    28.18    37.90 |
| uniform               19.37    19.55    20.99    22.96    25.57    31.03    47.99 |
| one_over_rand32       29.42    29.82    30.91    32.09    33.78    41.18    72.31 |
| simple_uniform32      19.37    19.52    20.03    21.06    22.46    26.23    69.84 |
| simple_int32          16.28    16.34    17.03    18.12    19.52    25.08    35.35 |
| int_e_int             27.17    27.33    28.12    29.35    31.00    36.11    64.15 |
| simple_int64          25.92    26.01    27.11    28.37    30.25    35.24    59.69 |
| bigint_int_dot_int    40.76    41.21    42.49    44.14    46.23    51.62    75.35 |
| big_ints              81.36    83.12    86.57    89.38    92.65   102.98   138.79 |
Alexhuszagh commented 3 years ago

@aldanor Definitely. This might also be due to bad inlining. I'll try changing a few inlines to #[inline(always)], to ensure they're identical to the code they're replacing, since 7% sounds well outside of noise benchmarks.

Alexhuszagh commented 3 years ago

Ok this is definitely slower and inling doesn't seem to work. Getting repeated ~7% performance hits with canada.txt, so I'll try to triage now.

aldanor commented 3 years ago

Thanks for testing out :)

Yea, I've spent lots and lots of time (the majority other that correctness) in this crate trying to speed this up by making inlining right and avoid every single bound check where possible (hence all the weird types in common module), so reshuffling things without profiling will likely slow it down.

But I think it should be doable since it's essentially the same logic.

Alexhuszagh commented 3 years ago

Thanks for testing out :)

Yea, I've spent lots and lots of time (the majority other that correctness) in this crate trying to speed this up by making inlining right and avoid every single bound check where possible (hence all the weird types in common module), so reshuffling things without profiling will likely slow it down.

But I think it should be doable since it's essentially the same logic.

It's a false positive: it's when both functions are used in the same binary, it messes with the inlining thresholds. I've added a feature gate use_tokenized for the benchmark, and if only parse is used, it goes back to normal speeds. Since we can expect a single binary to only use one-or-the-other in a single compilation unit, I think this is fine.

Here are a few sample benchmarks on my end:

Note: All the benchmarks are run on x86_64 Linux, with only tmux and Sublime Text open at a time, with the performance governor set.

Master

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            27.07    27.30    27.68    28.04    28.44    29.54    40.41 |
| lexical               75.74    76.08    76.57    77.07    77.95    80.83    99.25 |
| from_str             207.82   208.48   209.31   210.52   212.41   217.33   245.38 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            24.75    33.86    35.16    35.67    36.12    36.63    36.94 |
| lexical               10.08    12.38    12.83    12.98    13.06    13.15    13.20 |
| from_str               4.08     4.60     4.71     4.75     4.78     4.80     4.81 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           430.61   589.17   611.91   620.63   628.61   637.40   642.84 |
| lexical              175.33   215.35   223.29   225.79   227.28   228.75   229.76 |
| from_str              70.92    80.08    81.93    82.66    83.14    83.47    83.73 |
|                                                                                   |
=====================================================================================

JSON Branch

Shows a clear 3% performance hit on the median, and much worse in other cases.

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            28.67    28.74    28.82    28.93    29.45    30.90    36.30 |
| fast-float-tokenized    43.63    43.87    44.24    44.61    45.19    46.91    53.80 |
| lexical               73.79    74.00    74.13    74.54    75.82    78.61   105.16 |
| from_str             211.56   212.06   212.76   213.85   216.63   221.97   247.44 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            27.55    32.40    33.96    34.57    34.70    34.80    34.87 |
| fast-float-tokenized    18.59    21.32    22.13    22.42    22.60    22.79    22.92 |
| lexical                9.51    12.72    13.19    13.42    13.49    13.51    13.55 |
| from_str               4.04     4.51     4.62     4.68     4.70     4.72     4.73 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           479.34   563.81   590.90   601.60   603.83   605.54   606.86 |
| fast-float-tokenized   323.45   370.94   385.14   390.12   393.34   396.62   398.88 |
| lexical              165.48   221.37   229.51   233.46   234.74   235.16   235.81 |
| from_str              70.33    78.43    80.33    81.37    81.79    82.06    82.25 |
|                                                                                   |
=====================================================================================

JSON Without use_tokenized

Shows the 8% performance hit is undone (this is consistent).

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            26.36    26.39    26.46    26.64    27.39    28.94    52.07 |
| lexical               73.45    73.51    73.62    74.01    75.21    77.79    90.84 |
| from_str             210.11   210.89   212.18   214.25   219.13   232.37   263.52 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            19.21    34.57    36.51    37.54    37.80    37.89    37.94 |
| lexical               11.01    12.86    13.30    13.51    13.58    13.60    13.61 |
| from_str               3.79     4.31     4.56     4.67     4.71     4.74     4.76 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           334.20   601.57   635.33   653.25   657.77   659.32   660.18 |
| lexical              191.55   223.73   231.39   235.11   236.38   236.72   236.92 |
| from_str              66.04    75.00    79.42    81.22    82.01    82.52    82.82 |
|                                                                                   |
=====================================================================================

JSON With use_tokenized

Shows the performance hit is incurred again.

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            27.65    27.82    27.96    28.10    28.52    29.78    34.61 |
| fast-float-tokenized    42.87    43.32    43.79    44.28    45.01    46.22    60.92 |
| lexical               73.77    73.95    74.11    74.53    75.59    77.92    87.04 |
| from_str             201.66   203.34   204.56   205.75   207.54   213.01   242.14 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            28.90    33.59    35.06    35.59    35.76    35.94    36.16 |
| fast-float-tokenized    16.41    21.64    22.22    22.58    22.84    23.09    23.33 |
| lexical               11.49    12.83    13.23    13.42    13.49    13.52    13.56 |
| from_str               4.13     4.70     4.82     4.86     4.89     4.92     4.96 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           502.82   584.54   610.15   619.31   622.28   625.45   629.28 |
| fast-float-tokenized   285.64   376.57   386.61   393.00   397.36   401.72   405.94 |
| lexical              199.92   223.34   230.20   233.49   234.80   235.32   235.88 |
| from_str              71.87    81.71    83.85    84.58    85.07    85.58    86.29 |
|                                                                                   |
=====================================================================================
Alexhuszagh commented 3 years ago

It might be possible to use #[inline(always)] for everything, and undo this performance hit, but that seems like a bad strategy and lead to larger binary sizes (which would be suboptimal for architectures where binary size, and not speed is the primary concern, such as embedded systems).

Alexhuszagh commented 3 years ago

I got similar results on x86_64 Windows MSVC (slightly more muted, but similar). I'm almost certain this is due to compiler inlining and weird stuff, and not actually going to have any impact on real-world cases.

Master

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            23.61    24.69    26.00    27.35    29.46    33.91    49.25 |
| lexical               61.32    62.75    64.58    67.15    70.93    82.38   116.19 |
| from_str             174.15   180.41   185.68   191.25   199.58   231.57   309.24 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            20.30    29.51    33.95    36.57    38.47    40.52    42.35 |
| lexical                8.61    12.15    14.11    14.90    15.49    15.94    16.31 |
| from_str               3.23     4.32     5.01     5.23     5.39     5.54     5.74 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           353.30   513.50   590.78   636.44   669.42   705.08   737.00 |
| lexical              149.76   211.38   245.48   259.20   269.48   277.35   283.80 |
| from_str              56.27    75.16    87.20    90.99    93.73    96.47    99.92 |
|                                                                                   |
=====================================================================================

JSON Without use_tokenized

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            24.10    24.70    25.84    27.00    29.19    35.28    66.45 |
| lexical               60.45    61.91    64.47    68.22    73.44    83.75   127.01 |
| from_str             166.05   172.24   177.34   182.48   191.74   226.01   286.65 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            15.05    28.76    34.27    37.04    38.70    40.49    41.49 |
| lexical                7.87    11.96    13.62    14.66    15.51    16.15    16.54 |
| from_str               3.49     4.43     5.22     5.48     5.64     5.81     6.02 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           261.88   500.53   596.37   644.52   673.50   704.56   721.92 |
| lexical              137.01   208.07   237.06   255.09   269.93   281.10   287.85 |
| from_str              60.71    77.16    90.76    95.36    98.13   101.06   104.79 |
|                                                                                   |
=====================================================================================

JSON With use_tokenized

=====================================================================================
|                         canada.txt (111126, 1.93 MB, f64)                         |
|===================================================================================|
|                                                                                   |
| ns/float                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            24.37    25.18    26.43    27.58    29.54    34.87    54.98 |
| fast-float-tokenized    33.08    34.93    36.32    37.98    41.33    53.11    71.55 |
| lexical               65.74    67.02    69.50    72.87    78.00    91.32   128.91 |
| from_str             169.05   175.63   182.66   189.98   201.41   228.94   373.07 |
|                                                                                   |
| Mfloat/s                min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float            18.19    28.70    33.88    36.27    37.84    39.73    41.03 |
| fast-float-tokenized    13.98    18.83    24.20    26.33    27.54    28.65    30.23 |
| lexical                7.76    10.96    12.82    13.73    14.39    14.93    15.21 |
| from_str               2.68     4.38     4.97     5.26     5.48     5.70     5.92 |
|                                                                                   |
| MB/s                    min       5%      25%   median      75%      95%      max |
|-----------------------------------------------------------------------------------|
| fast-float           316.52   499.43   589.63   631.08   658.48   691.41   713.93 |
| fast-float-tokenized   243.19   327.64   421.10   458.16   479.20   498.52   526.05 |
| lexical              134.99   190.78   223.10   238.90   250.39   259.79   264.72 |
| from_str              46.64    76.17    86.46    91.60    95.29    99.11   102.93 |
|                                                                                   |
=====================================================================================

Specs are similar to before (different CPU, slightly higher clock speed):