HewlettPackard / lustre_exporter

Prometheus exporter for use with the Lustre parallel filesystem
Apache License 2.0
36 stars 51 forks source link

Update float parsing patterns #83

Closed roclark closed 7 years ago

roclark commented 7 years ago

Previously, the regex matcher intended to match floats, but would instead match any character between two numbers as opposed to a period. For example, the numbers '1965 1966' would be parsed as a single string ('1965 1966') instead of two strings ('1965' '1966'). This update now properly parses both floats and integers.

Signed-Off-By: Robert Clark robert.d.clark@hpe.com

knweiss commented 7 years ago

@roclark I just noticed this commit and have some late feedback if you don't mind:

  1. FWIW: The Go package govalidator offers validators and sanitizers for strings, structs and collections. There's e.g. an IsFloat() function which uses this regex:
    Float          string = "^(?:[-+]?(?:[0-9]+))?(?:\\.[0-9]*)?(?:[eE][\\+\\-]?(?:[0-9]+))?$"
  2. If you use raw string literals (with back quotes) for the regular expression you don't need to escape the backslash.
  3. Wouldn't it be safer and more readable to define the float regex only once and use this single definition in the test case, too?
  4. Then you could also regexp.MustCompile() it once as a constant. Right now you (re-)compile every regular expressions in each call to regexCaptureStrings(). This compilation is a rather expensive operation (and maybe even involves a memory allocation).
roclark commented 7 years ago

Thanks for the feedback! I'm always open to performance improvements and other optimizations. I might incorporate this one after #82 gets pulled in as I move common functions/statements into a separate file. That way, we can cleanly define the number regex globally.

knweiss commented 7 years ago

@roclark To give you a number: I see ~4600 mallocs per second from the lustre-exporter on our OSSes according to the go_memstats_mallocs_total metric. And ~780 mallocs per second on the MDS.