Open spakin opened 5 months ago
What sort of application represents time duration like this?
In this case, it was MiniSat, an old but still often used C++ code. MiniSat normally doesn't output times in scientific notation, but I suspect that it uses printf("%g", …)
or similar. On one particular run, it finished in a tiny fraction of a second, output its elapsed time in scientific notation, and caused my Go wrapper to crash.
I would be opposed to adding this complexity to ParseDuration
. It already has units (a scale, kind of like an exponent, so kind of doubles up that functionality) and this change would make the values harder to read. Plus, ParseDuration
is documented clearly to parse a decimal with fraction and unit but not more. It seems like it's intended to parse what Duration.String
produces -- and that never produces exponents. So this would only be to parse values produced by other systems. My recommendation is if you have a custom format to parse, use a custom parser.
Conceptually, time.Duration is fixed-point. Allowing scientific notation may lure one into the false believe that e. g. "1e-10s" is a representable duration.
Conceptually, time.Duration is fixed-point. Allowing scientific notation may lure one into the false believe that e. g. "1e-10s" is a representable duration.
But how is that different from "0.0000000001s", which is equally non-representable?
"0.0000000001s" looks funky. "100ps" and "1e-10s" don't. The former doesn't parse.
"0.0000000001s" looks funky.
I don't follow. We're talking about parsing input, not producing output. Who cares what inputs look funky? They'll be normalized on output.
"100ps" and "1e-10s" don't. The former doesn't parse.
"0.0000000001s" parses as 0ns, as expected for an integer type with nanosecond resolution. "100ps" doesn't parse because units smaller than nanoseconds aren't recognized. This is already documented:
Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
Despite time.ParseDuration
's input accurately being documented as "a possibly signed sequence of decimal numbers, each with optional fraction, I argue that this violates the Principle of Least Surprise because why should some parts of Go (e.g., strconv.ParseFloat
and literal floating-point numbers) accept scientific notation but others not?
My proposal is simply that time.ParseDuration
accept the same values that strconv.ParseFloat
accepts but followed by one of the preceding units. The documentation would then read,
ParseDuration parses a duration string. A duration string is a floating-point number with a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
I guess my interpretation of Principle of Least Surprise differs from yours, but would you also want time.ParseDuration to accept any of "0xap1us", "0x.dp+0h" or "+InfhNANns"?
I wouldn't expect it to accept "+InfhNANns" because that can't be represented by a time.Duration
(int64
nanoseconds). The others, while unlikely ever to be used, ideally should be accepted because they're valid floating-point numbers. Again, why should some parts of Go inconsistently accept different floating-point numbers from other parts of Go?
That said, my main concern is scientific notation because programs in many languages, including Go, have an equivalent of C's printf("%g", …)
. I encountered a case in the wild of this being used to output a time, which Go can't currently parse because time.Duration
processes numbers differently from strconv.ParseFloat
when it in fact does not need to.
This feels like asking to hiding alot of magic and rounding heuristics inside time.ParseDuration(), since there are alot of floating point numbers that will not have an exact int64 nanosecondsrepresentation.
What should ParseDuration() do then? It could hide the sloppy inexact conversion, returning an approximation. Or it could return an error. What are you going to do with the error? Probably fall back on the following preferred approach (below) anyway. What are you going to do with a sloppy inexact conversion? Will you even notice it? Will not noticing it lead to bugs? Some, almost surely. Will the new heuristic be acceptable for all current users of time.ParseDuration()? Unlikely, if they are depending on the current set error producing inputs.
I would much rather read Go application code that makes explicit what is going on, like first using f := strconv.ParseFloat(s, 64) on a string s that has the 's' (for seconds duration) suffix verified and removed, and then deliberately displaying the rounding/conversion to the int64 domain. Then I can readily write tests for the cases that matter to my application, and easily tweak the behaivor to what the application at hand requires.
In other words, it is pretty easy to write in user/application code already, using strconv.ParseFloat() itself, and it would be better practice to do that.
time.ParseDuration
already uses float
s (see line 1668) and already performs inexact conversions. Consider parsing "0.0000000000005h"
, for example: https://go.dev/play/p/gvs16-UT0N8
Once again, the core of my proposal is for time.ParseDuration
to accept scientific notation so "0.0000000000005h"
would be parsed the same as "5e-13h"
. If going through strconv.ParseFloat
further reduces the precision of the conversion to int64
nanoseconds—I don't know that it would—then sacrificing its additional features may be warranted for the proposed enhancement to time.ParseDuration
.
@spakin: An addendum to your proposal.
The time.ParseDuration
function parses Go decimal numbers with an optional fraction part. The function should also parse Go decimal floating-point numbers with an exponent part. Scientific notation is widely used. JSON data interchange is very widely used. Use of the IEEE 754 Standard for Binary Floating-Point Arithmetic in hardware is near universal.
"A [Go] decimal floating-point literal consists of an integer part (decimal digits), a decimal point, a fractional part (decimal digits), and an exponent part (e or E followed by an optional sign and decimal digits). One of the integer part or the fractional part may be elided; one of the decimal point or the exponent part may be elided. An exponent value exp scales the mantissa (integer and fractional part) by 10ᵉˣᵖ."
"A [JSON] number is a sequence of decimal digits with no superfluous leading zero. It may have a preceding minus sign (U+002D). It may have a fractional part prefixed by a decimal point (U+002E). It may have an exponent, prefixed by e (U+0065) or E (U+0045) and optionally + (U+002B) or – (U+002D). The digits are the code points U+0030 through U+0039."
For example:
time.ParseDuration(s string) (d Duration, err error):
format: "%.4f"
s: "0.0000s" d: 0ns 0s err: <nil>
s: "0.5000s" d: 500000000ns 500ms err: <nil>
s: "1.0000s" d: 1000000000ns 1s err: <nil>
s: "-1.0000s" d: -1000000000ns -1s err: <nil>
s: "100.0000s" d: 100000000000ns 1m40s err: <nil>
s: "1000.0000s" d: 1000000000000ns 16m40s err: <nil>
s: "10000.0000s" d: 10000000000000ns 2h46m40s err: <nil>
s: "12345.0000s" d: 12345000000000ns 3h25m45s err: <nil>
s: "100000.0000s" d: 100000000000000ns 27h46m40s err: <nil>
s: "123456.0000s" d: 123456000000000ns 34h17m36s err: <nil>
format: "%.4e"
s: "0.0000e+00s" d: 0ns 0s err: <nil>
s: "5.0000e-01s" d: 500000000ns 500ms err: <nil>
s: "1.0000e+00s" d: 1000000000ns 1s err: <nil>
s: "-1.0000e+00s" d: -1000000000ns -1s err: <nil>
s: "1.0000e+02s" d: 100000000000ns 1m40s err: <nil>
s: "1.0000e-02s" d: 10000000ns 10ms err: <nil>
s: "1.0000e+03s" d: 1000000000000ns 16m40s err: <nil>
s: "1.0000e-03s" d: 1000000ns 1ms err: <nil>
s: "1.0000e+04s" d: 10000000000000ns 2h46m40s err: <nil>
s: "1.0000e-04s" d: 100000ns 100µs err: <nil>
s: "1.2345e+04s" d: 12345234500000ns 3h25m45.2345s err: <nil>
s: "1.2345e-04s" d: 123450ns 123.45µs err: <nil>
s: "1.0000e+05s" d: 100000000000000ns 27h46m40s err: <nil>
s: "1.0000e-05s" d: 10000ns 10µs err: <nil>
s: "1.2346e+05s" d: 123460234600000ns 34h17m40.2346s err: <nil>
s: "1.2346e-05s" d: 12346ns 12.346µs err: <nil>
format: "%.4g"
s: "0s" d: 0ns 0s err: <nil>
s: "0.5s" d: 500000000ns 500ms err: <nil>
s: "1s" d: 1000000000ns 1s err: <nil>
s: "-1s" d: -1000000000ns -1s err: <nil>
s: "100s" d: 100000000000ns 1m40s err: <nil>
s: "1000s" d: 1000000000000ns 16m40s err: <nil>
s: "1e+04s" d: 10000000000000ns 2h46m40s err: <nil>
s: "1e-04s" d: 100000ns 100µs err: <nil>
s: "1.234e+04s" d: 12340234000000ns 3h25m40.234s err: <nil>
s: "1.234e-04s" d: 123400ns 123.4µs err: <nil>
s: "1e+05s" d: 100000000000000ns 27h46m40s err: <nil>
s: "1e-05s" d: 10000ns 10µs err: <nil>
s: "1.235e+05s" d: 123500235000000ns 34h18m20.235s err: <nil>
s: "1.235e-05s" d: 12350ns 12.35µs err: <nil>
Format examples:
The GNU C Library Reference Manual
12.12.5 Floating-Point Conversions
The strconv.ParseFloat
function is not a solution. The time.Duration
type is an integer type.
Adapting the current time.Duration
functions to detect and apply an optional exponent is simple, easy, accurate, and readable.
All current time.Duration
tests pass.
Test results for decimal floating-point numbers are shown in the C library floating-point examples.
Test results for @spakin's Go Playground time.ParseDuration
examples with scientific notation allowed:
First number is 12.34.
First duration is 12.34s.
Second number is 0.0001234.
Second duration is 123.4µs.
and
d: 2.777777777777778e-13h s: "0.0000000000005h"
d: 2.777777777777778e-13h s: "5e-13h"
Sounds good to me, @peterGo. Thanks for the work you put into testing time.Duration
with exponents.
I don't buy the JSON argument: if you append a unit, what you have is no longer a JSON number, but a string with semantics not covered by JSON. E. g. {"seconds": 3e8}
I understand. But I wholeheartedly question the prevelance of e. g. {"elapsed": "3e8s"}
. And e. g. the mixed form {"elapsed": "8e4h1.2e7s"}
I find ludicrous.
This isn't about implementability. If I'm allowed a historical example, to this day people moan about "0"-prefixed octal integer literals. Standardizing parsers with obtuse behaviours, if at the time of implementing auspicious, increase complexity forever with ever diminishing value. I cannot read e. g. 1e+2h3E4m5e67.89e-0us
and I wouldn't expect the standard library to parse it, either. Niche use cases require extraordinary parsing and an increased LOC count reflects such particularities.
Proposal Details
Issue
I encountered this issue while parsing the output of an application I was working with:
strconv.ParseFloat
accepts numbers expressed in scientific notation.strconv.ParseFloat("3.336e-6", 64)
, for example, works as expected.time.ParseDuration
does not accept numbers expressed in scientific notation.time.ParseDuration("3.336e-6s")
, for example, fails withunknown unit "e-" in duration "3.336e-6s"
.A Go Playground demonstration of this limitation in
time.ParseDuration
can be found here: https://go.dev/play/p/G-1FveHxpZ3Proposal
I propose that
time.ParseDuration
be enhanced to allow durations expressed using scientific notation.