aldanor / fast-float-rust

Super-fast float parser in Rust (now part of Rust core)
https://docs.rs/fast-float
Apache License 2.0
275 stars, 20 forks

[Feature Request] Support JSON Numbers #17

Open Alexhuszagh opened 3 years ago

Alexhuszagh commented 3 years ago

One of the major motivations for lexical-core's recent developments has been support for parsing floats of different formats, most notably JSON numbers.

A few notable differences exist in comparison to Rust floats, or those in other languages. For example, Rust literals and Rust strings follow the first list below, while JSON follows the second. Providing a function with the signature fn parse_tokenized(integral: &[u8], fractional: &[u8], exponent: i32, negative: bool); would therefore allow users to validate their own float formats, while letting fast-float-rust do the majority of the heavy lifting. Such a function would also not accept special floats.

This should require minimal changes in the parsing implementation, while making the library much more suitable for general-purpose applications.

"NaN"       // valid
"nan"       // invalid
"1.23"      // valid
"1.23e"     // invalid
"1."        // valid
".1"        // valid
"1.23e5"    // valid
"+1.23e5"   // valid
"-1.23e5"   // valid
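
The Rust-string entries above can be spot-checked against the standard library's str::parse::<f64>. This is only a minimal sketch: the helper name rust_accepts is made up for illustration, and the special-value spellings ("NaN", "nan") are deliberately left out because their handling depends on the Rust version in use.

```rust
/// Hypothetical helper (not part of any library): reports whether the
/// standard library's float parser accepts `s` as an f64.
fn rust_accepts(s: &str) -> bool {
    s.parse::<f64>().is_ok()
}

/// Checks the non-special entries from the list above against std.
fn check_rust_string_examples() -> bool {
    let valid = ["1.23", "1.", ".1", "1.23e5", "+1.23e5", "-1.23e5"];
    let invalid = ["1.23e"];
    valid.iter().all(|s| rust_accepts(s)) && invalid.iter().all(|s| !rust_accepts(s))
}
```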

Meanwhile, in JSON, we get the following:

"NaN"       // invalid
"nan"       // invalid
"1.23"      // valid
"1.23e"     // invalid
"1."        // invalid
".1"        // invalid
"1.23e5"    // valid
"+1.23e5"   // invalid
"-1.23e5"   // valid

This can extend to various markup languages, like TOML, YAML (which has the same rules as JSON), XML, and others.
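
To illustrate the split proposed above, here is a rough sketch of how a caller might validate a JSON number itself and hand the pieces to a parse_tokenized-style function. Everything here is an assumption made for illustration: the Tokens struct, tokenize_json, and parse_json_number are made-up names, the Option<f64> return type is not part of the proposed signature, and the body of parse_tokenized is only a stand-in that reassembles a string for the standard library parser (the real function would feed the digits to fast-float-rust's algorithm directly).

```rust
/// Hypothetical container for the validated pieces of a number.
#[derive(Debug, PartialEq)]
struct Tokens<'a> {
    integral: &'a [u8],
    fractional: &'a [u8],
    exponent: i32,
    negative: bool,
}

/// Validates `input` against the JSON number grammar and splits it.
/// Returns None for anything JSON rejects ("1.", ".1", "+1", "NaN", ...).
fn tokenize_json(input: &[u8]) -> Option<Tokens<'_>> {
    let mut i = 0;
    let negative = input.first() == Some(&b'-');
    if negative {
        i += 1;
    }
    // Integral part: at least one digit, no leading zeros.
    let start = i;
    while i < input.len() && input[i].is_ascii_digit() {
        i += 1;
    }
    let integral = &input[start..i];
    if integral.is_empty() || (integral.len() > 1 && integral[0] == b'0') {
        return None;
    }
    // Optional fraction: '.' must be followed by at least one digit.
    let mut fractional: &[u8] = &[];
    if i < input.len() && input[i] == b'.' {
        i += 1;
        let start = i;
        while i < input.len() && input[i].is_ascii_digit() {
            i += 1;
        }
        fractional = &input[start..i];
        if fractional.is_empty() {
            return None;
        }
    }
    // Optional exponent: 'e' or 'E', optional sign, at least one digit.
    let mut exponent: i32 = 0;
    if i < input.len() && (input[i] == b'e' || input[i] == b'E') {
        i += 1;
        let exp_negative = match input.get(i).copied() {
            Some(b'-') => { i += 1; true }
            Some(b'+') => { i += 1; false }
            _ => false,
        };
        let start = i;
        while i < input.len() && input[i].is_ascii_digit() {
            // Saturate rather than overflow on absurdly long exponents.
            exponent = exponent.saturating_mul(10).saturating_add((input[i] - b'0') as i32);
            i += 1;
        }
        if i == start {
            return None;
        }
        if exp_negative {
            exponent = -exponent;
        }
    }
    // Trailing bytes make the whole input invalid.
    if i != input.len() {
        return None;
    }
    Some(Tokens { integral, fractional, exponent, negative })
}

/// Stand-in for the proposed function; assumes the caller already validated
/// the digits. It rebuilds a string and defers to std, purely for illustration.
fn parse_tokenized(integral: &[u8], fractional: &[u8], exponent: i32, negative: bool) -> Option<f64> {
    let mut s = String::new();
    if negative {
        s.push('-');
    }
    s.push_str(std::str::from_utf8(integral).ok()?);
    if !fractional.is_empty() {
        s.push('.');
        s.push_str(std::str::from_utf8(fractional).ok()?);
    }
    s.push_str(&format!("e{}", exponent));
    s.parse().ok()
}

/// JSON-specific front end: validate first, then delegate the conversion.
fn parse_json_number(input: &[u8]) -> Option<f64> {
    let t = tokenize_json(input)?;
    parse_tokenized(t.integral, t.fractional, t.exponent, t.negative)
}
```

The point of the split is that the format-specific rules live entirely in the tokenizer, so a TOML or XML front end could reuse the same conversion step with a different validator.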

lemire commented 3 years ago

Is -1.23e5 ever invalid?

Alexhuszagh commented 3 years ago

@lemire Thanks for catching my typo....

aldanor commented 3 years ago

@Alexhuszagh This makes sense, agreed. However: if we're planning to try and integrate this crate into core/std (in which case this crate would become redundant), what would happen to extra functionality like this?

Just thinking - ideally, this kind of stuff should be provided in std as well? (e.g., "f64::from_parts(...)").

Alexhuszagh commented 3 years ago

@aldanor It would be removed from the Rust version, unless we have an RFC to add it. I can propose a pre-RFC for this on Rust-internals now, however.

I'm also doing the commits for this right now (it's very rudimentary, but it separates the functionality). The other advantage is that it would provide easy integration into serde-json, which currently uses a fork of my library, lexical. Code gets faster, and I get to pass on some experience I have with features that are crucial for some applications. Everyone wins.

It would also provide a great raison-d'être for this library until the RFC is approved.

Alexhuszagh commented 3 years ago

I've added a pre-RFC for this on Rust internals.