spakin commented 5 months ago

Proposal Details

Issue

I encountered this issue while parsing the output of an application I was working with:

strconv.ParseFloat accepts numbers expressed in scientific notation. strconv.ParseFloat("3.336e-6", 64), for example, works as expected.
time.ParseDuration does not accept numbers expressed in scientific notation. time.ParseDuration("3.336e-6s"), for example, fails with unknown unit "e-" in duration "3.336e-6s".

A Go Playground demonstration of this limitation in time.ParseDuration can be found here: https://go.dev/play/p/G-1FveHxpZ3

Proposal

I propose that time.ParseDuration be enhanced to allow durations expressed using scientific notation.

seankhliao commented 5 months ago

What sort of application represents time duration like this?

spakin commented 5 months ago

In this case, it was MiniSat, an old but still often used C++ code. MiniSat normally doesn't output times in scientific notation, but I suspect that it uses printf("%g", …) or similar. On one particular run, it finished in a tiny fraction of a second, output its elapsed time in scientific notation, and caused my Go wrapper to crash.

benhoyt commented 5 months ago

I would be opposed to adding this complexity to ParseDuration. It already has units (a scale, kind of like an exponent, so kind of doubles up that functionality) and this change would make the values harder to read. Plus, ParseDuration is documented clearly to parse a decimal with fraction and unit but not more. It seems like it's intended to parse what Duration.String produces -- and that never produces exponents. So this would only be to parse values produced by other systems. My recommendation is if you have a custom format to parse, use a custom parser.

jfrech commented 5 months ago

Conceptually, time.Duration is fixed-point. Allowing scientific notation may lure one into the false believe that e. g. "1e-10s" is a representable duration.

spakin commented 5 months ago

Conceptually, time.Duration is fixed-point. Allowing scientific notation may lure one into the false believe that e. g. "1e-10s" is a representable duration.

But how is that different from "0.0000000001s", which is equally non-representable?

jfrech commented 5 months ago

"0.0000000001s" looks funky. "100ps" and "1e-10s" don't. The former doesn't parse.

spakin commented 5 months ago

"0.0000000001s" looks funky.

I don't follow. We're talking about parsing input, not producing output. Who cares what inputs look funky? They'll be normalized on output.

"100ps" and "1e-10s" don't. The former doesn't parse.

"0.0000000001s" parses as 0ns, as expected for an integer type with nanosecond resolution. "100ps" doesn't parse because units smaller than nanoseconds aren't recognized. This is already documented:

Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".

Despite time.ParseDuration's input accurately being documented as "a possibly signed sequence of decimal numbers, each with optional fraction, I argue that this violates the Principle of Least Surprise because why should some parts of Go (e.g., strconv.ParseFloat and literal floating-point numbers) accept scientific notation but others not?

My proposal is simply that time.ParseDuration accept the same values that strconv.ParseFloat accepts but followed by one of the preceding units. The documentation would then read,

ParseDuration parses a duration string. A duration string is a floating-point number with a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".

jfrech commented 5 months ago

I guess my interpretation of Principle of Least Surprise differs from yours, but would you also want time.ParseDuration to accept any of "0xap1us", "0x.dp+0h" or "+InfhNANns"?

spakin commented 5 months ago

I wouldn't expect it to accept "+InfhNANns" because that can't be represented by a time.Duration (int64 nanoseconds). The others, while unlikely ever to be used, ideally should be accepted because they're valid floating-point numbers. Again, why should some parts of Go inconsistently accept different floating-point numbers from other parts of Go?

That said, my main concern is scientific notation because programs in many languages, including Go, have an equivalent of C's printf("%g", …). I encountered a case in the wild of this being used to output a time, which Go can't currently parse because time.Duration processes numbers differently from strconv.ParseFloat when it in fact does not need to.

glycerine commented 5 months ago

This feels like asking to hiding alot of magic and rounding heuristics inside time.ParseDuration(), since there are alot of floating point numbers that will not have an exact int64 nanosecondsrepresentation.

What should ParseDuration() do then? It could hide the sloppy inexact conversion, returning an approximation. Or it could return an error. What are you going to do with the error? Probably fall back on the following preferred approach (below) anyway. What are you going to do with a sloppy inexact conversion? Will you even notice it? Will not noticing it lead to bugs? Some, almost surely. Will the new heuristic be acceptable for all current users of time.ParseDuration()? Unlikely, if they are depending on the current set error producing inputs.

I would much rather read Go application code that makes explicit what is going on, like first using f := strconv.ParseFloat(s, 64) on a string s that has the 's' (for seconds duration) suffix verified and removed, and then deliberately displaying the rounding/conversion to the int64 domain. Then I can readily write tests for the cases that matter to my application, and easily tweak the behaivor to what the application at hand requires.

In other words, it is pretty easy to write in user/application code already, using strconv.ParseFloat() itself, and it would be better practice to do that.

spakin commented 5 months ago

time.ParseDuration already uses floats (see line 1668) and already performs inexact conversions. Consider parsing "0.0000000000005h", for example: https://go.dev/play/p/gvs16-UT0N8

Once again, the core of my proposal is for time.ParseDuration to accept scientific notation so "0.0000000000005h" would be parsed the same as "5e-13h". If going through strconv.ParseFloat further reduces the precision of the conversion to int64 nanoseconds—I don't know that it would—then sacrificing its additional features may be warranted for the proposed enhancement to time.ParseDuration.

peterGo commented 4 months ago

@spakin: An addendum to your proposal.

The time.ParseDuration function parses Go decimal numbers with an optional fraction part. The function should also parse Go decimal floating-point numbers with an exponent part. Scientific notation is widely used. JSON data interchange is very widely used. Use of the IEEE 754 Standard for Binary Floating-Point Arithmetic in hardware is near universal.

"A [Go] decimal floating-point literal consists of an integer part (decimal digits), a decimal point, a fractional part (decimal digits), and an exponent part (e or E followed by an optional sign and decimal digits). One of the integer part or the fractional part may be elided; one of the decimal point or the exponent part may be elided. An exponent value exp scales the mantissa (integer and fractional part) by 10ᵉˣᵖ."

"A [JSON] number is a sequence of decimal digits with no superfluous leading zero. It may have a preceding minus sign (U+002D). It may have a fractional part prefixed by a decimal point (U+002E). It may have an exponent, prefixed by e (U+0065) or E (U+0045) and optionally + (U+002B) or – (U+002D). The digits are the code points U+0030 through U+0039."

For example:

time.ParseDuration(s string) (d Duration, err error):

format: "%.4f"
s: "0.0000s"       d: 0ns 0s  err: <nil>
s: "0.5000s"       d: 500000000ns 500ms  err: <nil>
s: "1.0000s"       d: 1000000000ns 1s  err: <nil>
s: "-1.0000s"      d: -1000000000ns -1s  err: <nil>
s: "100.0000s"     d: 100000000000ns 1m40s  err: <nil>
s: "1000.0000s"    d: 1000000000000ns 16m40s  err: <nil>
s: "10000.0000s"   d: 10000000000000ns 2h46m40s  err: <nil>
s: "12345.0000s"   d: 12345000000000ns 3h25m45s  err: <nil>
s: "100000.0000s"  d: 100000000000000ns 27h46m40s  err: <nil>
s: "123456.0000s"  d: 123456000000000ns 34h17m36s  err: <nil>

format: "%.4e"
s: "0.0000e+00s"   d: 0ns 0s  err: <nil>
s: "5.0000e-01s"   d: 500000000ns 500ms  err: <nil>
s: "1.0000e+00s"   d: 1000000000ns 1s  err: <nil>
s: "-1.0000e+00s"  d: -1000000000ns -1s  err: <nil>
s: "1.0000e+02s"   d: 100000000000ns 1m40s  err: <nil>
s: "1.0000e-02s"   d: 10000000ns 10ms  err: <nil>
s: "1.0000e+03s"   d: 1000000000000ns 16m40s  err: <nil>
s: "1.0000e-03s"   d: 1000000ns 1ms  err: <nil>
s: "1.0000e+04s"   d: 10000000000000ns 2h46m40s  err: <nil>
s: "1.0000e-04s"   d: 100000ns 100µs  err: <nil>
s: "1.2345e+04s"   d: 12345234500000ns 3h25m45.2345s  err: <nil>
s: "1.2345e-04s"   d: 123450ns 123.45µs  err: <nil>
s: "1.0000e+05s"   d: 100000000000000ns 27h46m40s  err: <nil>
s: "1.0000e-05s"   d: 10000ns 10µs  err: <nil>
s: "1.2346e+05s"   d: 123460234600000ns 34h17m40.2346s  err: <nil>
s: "1.2346e-05s"   d: 12346ns 12.346µs  err: <nil>

format: "%.4g"
s: "0s"            d: 0ns 0s  err: <nil>
s: "0.5s"          d: 500000000ns 500ms  err: <nil>
s: "1s"            d: 1000000000ns 1s  err: <nil>
s: "-1s"           d: -1000000000ns -1s  err: <nil>
s: "100s"          d: 100000000000ns 1m40s  err: <nil>
s: "1000s"         d: 1000000000000ns 16m40s  err: <nil>
s: "1e+04s"        d: 10000000000000ns 2h46m40s  err: <nil>
s: "1e-04s"        d: 100000ns 100µs  err: <nil>
s: "1.234e+04s"    d: 12340234000000ns 3h25m40.234s  err: <nil>
s: "1.234e-04s"    d: 123400ns 123.4µs  err: <nil>
s: "1e+05s"        d: 100000000000000ns 27h46m40s  err: <nil>
s: "1e-05s"        d: 10000ns 10µs  err: <nil>
s: "1.235e+05s"    d: 123500235000000ns 34h18m20.235s  err: <nil>
s: "1.235e-05s"    d: 12350ns 12.35µs  err: <nil>

Format examples:
The GNU C Library Reference Manual
12.12.5 Floating-Point Conversions

The strconv.ParseFloat function is not a solution. The time.Duration type is an integer type.

Adapting the current time.Duration functions to detect and apply an optional exponent is simple, easy, accurate, and readable.

All current time.Duration tests pass.

Test results for decimal floating-point numbers are shown in the C library floating-point examples.

Test results for @spakin's Go Playground time.ParseDuration examples with scientific notation allowed:

First number is 12.34.
First duration is 12.34s.
Second number is 0.0001234.
Second duration is 123.4µs.

and

d: 2.777777777777778e-13h s: "0.0000000000005h"
d: 2.777777777777778e-13h s: "5e-13h"

spakin commented 4 months ago

Sounds good to me, @peterGo. Thanks for the work you put into testing time.Duration with exponents.

jfrech commented 4 months ago

I don't buy the JSON argument: if you append a unit, what you have is no longer a JSON number, but a string with semantics not covered by JSON. E. g. {"seconds": 3e8} I understand. But I wholeheartedly question the prevelance of e. g. {"elapsed": "3e8s"}. And e. g. the mixed form {"elapsed": "8e4h1.2e7s"} I find ludicrous.

This isn't about implementability. If I'm allowed a historical example, to this day people moan about "0"-prefixed octal integer literals. Standardizing parsers with obtuse behaviours, if at the time of implementing auspicious, increase complexity forever with ever diminishing value. I cannot read e. g. 1e+2h3E4m5e67.89e-0us and I wouldn't expect the standard library to parse it, either. Niche use cases require extraordinary parsing and an increased LOC count reflects such particularities.

golang / go

proposal: time: allow scientific notation in ParseDuration #67076

Proposal Details

Issue

Proposal