status-im / nim-toml-serialization

nim-toml-serialization

License: MIT License: Apache TOML Stability: experimental nimble Github action

Flexible TOML serialization [not] relying on run-time type information.

Overview

nim-toml-serialization is a member of the nim-serialization family and provides several operation modes:

Note
On Windows, you might need to increase the stack size because nim-toml-serialization passes objects around on the stack. For example, add --passL:"-Wl,--stack,8388608" to your command line when running the compiler. This is only necessary if the object you are serializing can produce deep recursion.

Spec compliance

nim-toml-serialization implements the TOML v1.0.0 spec and passes these test suites:

Nonstandard features

Keyed mode

When decoding, only objects, tuples, or TomlValueRef are allowed at the top level. All other basic Nim data types, such as floats, ints, arrays, and booleans, must be the value of a key.

nim-toml-serialization offers keyed mode decoding to overcome this limitation. The parser skips non-matching key-value pairs efficiently because it produces no tokens, while still validating their syntax.

[server]
  name = "TOML Server"
  port = 8005

var x = Toml.decode(rawToml, string, "server.name")
assert x == "TOML Server"

or

var y = Toml.decode(rawToml, string, "server.name", caseSensitivity)

where caseSensitivity is one of:

The key must be a valid TOML bare key, quoted key, or dotted key.
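The keyed mode calls above can be put together into a small self-contained sketch. The contents of rawToml are sample input chosen for illustration, and the second call assumes that an int can be decoded by key in the same way as a string.

```nim
import toml_serialization

# Sample input for illustration; it mirrors the [server] table shown above.
const rawToml = """
[server]
name = "TOML Server"
port = 8005
"""

# Pull single values out of the document by their dotted key.
let name = Toml.decode(rawToml, string, "server.name")
let port = Toml.decode(rawToml, int, "server.port")

assert name == "TOML Server"
assert port == 8005
```

Only the requested key is converted into a Nim value; the rest of the document is skipped.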

Gotcha:

server = { ip = "127.0.0.1", port = 8005, name = "TOML Server" }

It may be tempting to use keyed mode for the above example like this:

var x = Toml.decode(rawToml, string, "server.name")

But it won't work because the grammar of TOML makes it very difficult to exit from the inline table parser in a clean way.
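One way around this, sketched here under the assumption that decoding the whole document is acceptable, is to decode into an object whose field holds the inline table and then read the field. The Server and Config types are hypothetical names introduced for this example.

```nim
import toml_serialization

type
  # Hypothetical types for illustration; the fields mirror the inline table.
  Server = object
    ip: string
    port: int
    name: string

  Config = object
    server: Server

const rawToml = """
server = { ip = "127.0.0.1", port = 8005, name = "TOML Server" }
"""

# Decode the whole document, then read the field instead of using a dotted key.
let cfg = Toml.decode(rawToml, Config)

assert cfg.server.name == "TOML Server"
assert cfg.server.port == 8005
```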

Decoder

  type
    NimServer = object
      name: string
      port: int

    MixedServer = object
      name: TomlValueRef
      port: int

    StringServer = object
      name: string
      port: string

  # decode into native Nim
  var nim_native = Toml.decode(rawtoml, NimServer)

  # decode into mixed Nim + TomlValueRef
  var nim_mixed = Toml.decode(rawtoml, MixedServer)

  # decode any value into string
  var nim_string = Toml.decode(rawtoml, StringServer)

  # decode any valid TOML
  var toml_value = Toml.decode(rawtoml, TomlValueRef)
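For reference, here is a minimal runnable sketch of two of the decoder modes above. The contents of rawToml are an assumption made for illustration, since the snippet above does not show its input.

```nim
import toml_serialization

type
  NimServer = object
    name: string
    port: int

# Sample input for illustration; the keys match the fields of NimServer.
const rawToml = """
name = "TOML Server"
port = 8005
"""

# Decode straight into a native Nim object.
let native = Toml.decode(rawToml, NimServer)
assert native.name == "TOML Server"
assert native.port == 8005

# Decode the same text into a generic TomlValueRef tree.
let generic = Toml.decode(rawToml, TomlValueRef)
assert not generic.isNil
```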

Parse inline table with newline

# This is nonstandard TOML

server = {
  ip = "127.0.0.1",
  port = 8005,
  name = "TOML Server"
}

  # turn on newline in inline table mode
  var x = Toml.decode(rawtoml, Server, flags = {TomlInlineTableNewline})

Load and save

  var server = Toml.loadFile("filename.toml", Server)
  var ip = Toml.loadFile("filename.toml", string, "server.ip")

  Toml.saveFile("filename.toml", server)
  Toml.saveFile("filename.toml", ip, "server.ip")
  Toml.saveFile("filename.toml", server, flags = {TomlInlineTableNewline})
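A round trip through a file can be sketched as follows. The Server type and the temporary file name are hypothetical choices for this example; only the saveFile and loadFile calls come from the snippet above.

```nim
import std/os
import toml_serialization

type
  # Hypothetical type for illustration.
  Server = object
    name: string
    port: int

# Hypothetical scratch location; any writable path works.
let path = getTempDir() / "toml_serialization_example.toml"

let original = Server(name: "TOML Server", port: 8005)
Toml.saveFile(path, original)

# Read the file back and check that the fields survived the round trip.
let loaded = Toml.loadFile(path, Server)
assert loaded.name == original.name
assert loaded.port == original.port

removeFile(path)
```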

TOML we can['t] do

Option[T]

Option[T] works as usual.

Bignum

TOML integers are limited to the int64 range, but nim-toml-serialization can extend this to arbitrary-precision bignums. Parsing a bignum is done with the helper function parseNumber.

# This is an example of how to parse bignum with `parseNumber` and `stint`.

import stint, toml_serialization

proc readValue*(r: var TomlReader, value: var Uint256) =
  try:
    var z: string
    let (sign, base) = r.parseNumber(z)

    if sign == Sign.Neg:
      raiseTomlErr(r.lex, errNegateUint)

    case base
    of base10: value = parse(z, Uint256, 10)
    of base16: value = parse(z, Uint256, 16)
    of base8:  value = parse(z, Uint256, 8)
    of base2:  value = parse(z, Uint256, 2)
  except ValueError as ex:
    raiseUnexpectedValue(r.lex, ex.msg)

var z = Toml.decode("bignum = 1234567890_1234567890", Uint256, "bignum")
assert $z == "12345678901234567890"

Table

Decoding a table can be achieved via the parseTable template. To parse the value, you can use one of the helper functions or readValue.

A table can be used to parse a top-level value, a regular table, or an inline table, just like an object.

No built-in readValue is provided for tables; you must overload it yourself depending on your needs.

The table can be a stdlib table, an ordered table, a table ref, or any table-like data type.

import std/tables, toml_serialization

proc readValue*(r: var TomlReader, table: var Table[string, int]) =
  parseTable(r, key):
    table[key] = r.parseInt(int)

Sets and list-like

Similar to tables, sets and list-like or array-like data structures can be parsed using the parseList template. It comes in two flavors: indexed and non-indexed.

A built-in readValue for regular seq and array is implemented for you. No built-in readValue is provided for set or set-like types; you must overload it yourself depending on your needs.

type
  HoldArray = object
    data: array[3, int]

  HoldSeq = object
    data: seq[int]

  WelderFlag = enum
    TIG
    MIG
    MMA

  Welder = object
    flags: set[WelderFlag]

proc readValue*(r: var TomlReader, value: var HoldArray) =
  # parseList with index, `i` can be any valid identifier
  r.parseList(i):
    value.data[i] = r.parseInt(int)

proc readValue*(r: var TomlReader, value: var HoldSeq) =
  # parseList without index
  r.parseList:
    let lastPos = value.data.len
    value.data.setLen(lastPos + 1)
    readValue(r, value.data[lastPos])

proc readValue*(r: var TomlReader, value: var Welder) =
  # populating set also okay
  r.parseList:
    value.flags.incl r.parseEnum(WelderFlag)

Enums

The TOML specification has no enums. The reader/decoder can parse both the ordinal and the string representation of an enum, whereas the writer/encoder has only a built-in ordinal writer. That is not a limitation: you can always overload writeValue to produce whatever representation of the enum you need.

The ordinal representation of an enum is a TOML integer. The string representation is a TOML basic string or literal string. Neither multi-line basic strings (e.g. """TOML""") nor multi-line literal strings (e.g. '''TOML''') are allowed as enum values.

# fruits.toml
fruit1 = "Apple"   # basic string
fruit2 = 1         # ordinal value
fruit3 = 'Orange'  # literal string

type
  Fruits = enum
    Apple
    Banana
    Orange

  FruitBasket = object
    fruit1: Fruits
    fruit2: Fruits
    fruit3: Fruits

var x = Toml.loadFile("fruits.toml", FruitBasket)
assert x.fruit1 == Apple
assert x.fruit2 == Banana
assert x.fruit3 == Orange

# write enum output as a string
proc writeValue*(w: var TomlWriter, val: Fruits) =
  w.writeValue $val

let z = FruitBasket(fruit1: Apple, fruit2: Banana, fruit3: Orange)
let res = Toml.encode(z)
assert res == "fruit1 = \"Apple\"\nfruit2 = \"Banana\"\nfruit3 = \"Orange\"\n"

You can control the reader's behavior when deserializing a specific enum using configureTomlDeserialization.

configureTomlDeserialization(
    T: type[enum], allowNumericRepr: static[bool] = false,
    stringNormalizer: static[proc(s: string): string] = strictNormalize)

Helper functions

parseAsString can parse any valid TOML value into a Nim string including a mixed array or inline table.

parseString returns a tuple:

Sign can be one of:

Implementation specifics

TomlTime contains a subsecond field. The spec says the precision is implementation-specific.

In nim-toml-serialization the default is 6 digits of precision. Digits beyond that are truncated by the parser.

You can override this using the compiler switch -d:tomlSubsecondPrecision=numDigits.

Installation

You can install the development version of the library through Nimble with the following command:

nimble install https://github.com/status-im/nim-toml-serialization@#master

or install the latest release version:

nimble install toml_serialization

License

Licensed and distributed under either of

or

at your option. This file may not be copied, modified, or distributed except according to those terms.

Credits

A portion of the TOML decoder was taken from PMunch's parsetoml.