kowainik / tomland

🏝 Bidirectional TOML serialization
https://kowainik.github.io/posts/2019-01-14-tomland
Mozilla Public License 2.0

More flexibility around parsing / traversing keys #348

Open AlistairB opened 3 years ago

AlistairB commented 3 years ago

Hi, my use case is parsing all the dependencies from a Rust Cargo.toml file. Dependencies can appear in the following forms:

[dependencies]
time = "0.1.12"

[target.'cfg(windows)'.dependencies]
winhttp = "0.4.0"

[target.'cfg(unix)'.dependencies]
openssl = "1.0.1"

[target.bar.dependencies]
winhttp = "0.4.0"

# include 'awesome' package with configured features.
[dependencies.awesome]
version = "1.3.5"
features = ["secure-password", "civet"]

The plain dependencies table is quite easy, but I don't know how to parse the other cases in tomland. For the target dependencies, what I really want to match on is target.*.dependencies. I think I could do this with tableMap if the tables were nested, by working with a Map, but I don't know how to access them at the top level. So one possible solution is a way to use the existing plumbing at the root level.
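To make the target.*.dependencies idea concrete, here is a minimal, library-free sketch of wildcard matching on key paths. It does not use tomland: keys are modelled as plain lists of segments, and the names Segment, matchesKey and targetDependencies are hypothetical, introduced only for illustration.

```haskell
-- A library-free sketch: keys are modelled as lists of segments,
-- not as tomland's Key type. All names here are hypothetical.

-- | One segment of a key pattern.
data Segment
    = Lit String  -- must equal the key piece exactly
    | Wild        -- matches any single key piece
    deriving (Show, Eq)

-- | A pattern matches when it has the same length as the key and
-- every segment matches the piece in the same position.
matchesKey :: [Segment] -> [String] -> Bool
matchesKey pat key =
    length pat == length key && and (zipWith matchesPiece pat key)
  where
    matchesPiece (Lit s) piece = s == piece
    matchesPiece Wild    _     = True

-- | The pattern target.*.dependencies from the comment above.
targetDependencies :: [Segment]
targetDependencies = [Lit "target", Wild, Lit "dependencies"]
```

With this, matchesKey targetDependencies ["target", "cfg(windows)", "dependencies"] is True, while a plain ["dependencies"] key does not match because the lengths differ.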

Another option might be some kind of flexibility around key matching. For example you might have:

tableConditionalMap
    :: forall k v
    .  Ord k
    => TomlBiMap Key k
    -> (Key -> TomlCodec v)
    -> (Key -> Bool) -- keys are only processed if they produce a True
    -> TomlCodec [Map k v]

chshersh commented 3 years ago

Hi @AlistairB! Sorry for not getting back to you earlier. We are kinda busy and lack the free time to maintain everything...

It's great that you provide a real use case! Could you also specify the exact Haskell data structure you want to parse this TOML config into, and which parts of it you want to include? Maybe with an example value in some half-Haskell pseudocode 🙂 After seeing the data structure, we will be able to think about the best way to implement the codec 🤔

AlistairB commented 3 years ago

No worries. Love your work :smiley:

My goal is simply to capture all the dependency names, i.e. [Text]

[dependencies]
time = "0.1.12"  # I parse this to "time". This case is working fine.

[target.'cfg(windows)'.dependencies]
winhttp = "0.4.0" # similarly I want to parse "winhttp". Note the 'cfg(windows)' part is free text and could be anything.

[dependencies.awesome] # in this case the dependency name appears in the key as "awesome" which is what I want to parse
version = "1.3.5"
features = ["secure-password", "civet"]

I'm not sure whether tomland currently has a good way to handle keys that are not fully known at the root level. Or perhaps there is some solution I am missing. If it is lacking some feature, I'd be happy to attempt a PR.

AlistairB commented 3 years ago

As another example, Poetry has an expanded dependency style.

[tool.poetry.dev-dependencies.black]
version = "19.10b0"

[tool.poetry.dev-dependencies.other-one]
allow-prereleases = true

Here the information I care about is black and other-one, which are the names of the dependencies. I think I understand the problem a bit better now. The key issue seems to be that tomland doesn't have flexible ways to traverse / parse the table structure. You can only traverse to specific known nodes?

I guess I want something like the following, roughly sketched:

tableKeyMap :: forall k v. Key -> (Key -> TomlCodec (Maybe v)) -> TomlCodec (Map k v)
tableKeyMap = undefined

-- used for the above example, I would then use dimap / dimatch to just produce a `TomlCodec [String]` with the map values as dependency names

expandedDevDependencies :: TomlCodec (Map String String)
expandedDevDependencies =
  tableKeyMap "tool.poetry.dev-dependencies"
    \case
       (Piece theName :| []) -> pure $ Just theName
       _ -> pure Nothing

If any of this is making sense, I'm happy to try a PR with something like tableKeyMap :sweat_smile:
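For what it's worth, the traversal that tableKeyMap would need can be sketched without tomland at all. The following is only a model, under the assumption that table headers are already available as lists of key pieces; childNames and poetryTables are hypothetical names, not part of any library.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- | Names of tables sitting exactly one level below the given prefix.
-- Deeper or unrelated tables are ignored, mirroring the
-- "exactly one remaining piece" case in the sketch above.
childNames :: [String] -> [[String]] -> [String]
childNames prefix = mapMaybe directChild
  where
    directChild path = case stripPrefix prefix path of
        Just [name] -> Just name
        _           -> Nothing

-- | Table headers of the Poetry example, split into key pieces by hand.
poetryTables :: [[String]]
poetryTables =
    [ ["tool", "poetry", "dev-dependencies", "black"]
    , ["tool", "poetry", "dev-dependencies", "other-one"]
    ]
```

Here childNames ["tool", "poetry", "dev-dependencies"] poetryTables evaluates to ["black", "other-one"]. The same function covers the Cargo case: childNames ["dependencies"] applied to a header list containing ["dependencies", "awesome"] yields ["awesome"].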

CGenie commented 1 week ago

I arrived at a similar problem I think.

I think the idea of TOML is to have a readable format for writing data. As such, a file like this one (taken from the tomland example)

server.port        = 8080
server.codes       = [ 5, 10, 42 ]

[mail]
    host = "smtp.gmail.com"
    send-if-inactive = false

is, in JSON terms:

{
  "mail": {
    "host": "smtp.gmail.com",
    "send-if-inactive": false
  },
  "server": {
    "codes": [
      5,
      10,
      42
    ],
    "port": 8080
  }
}

As such, whether server.port is a key or server is a map becomes blurred here. I could have written:

[server]
port = 8080
codes = [5, 10, 42]

and still get the same underlying JSON structure.
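A toy model shows why the two spellings coincide. This sketch is library-free apart from containers and is not how tomland represents documents; Node, insertAt, fromPairs and underHeader are hypothetical names, and values are kept as raw strings for brevity.

```haskell
import qualified Data.Map.Strict as Map

-- | A toy model of a TOML document: a scalar (kept as text) or a table.
data Node
    = Scalar String
    | Table (Map.Map String Node)
    deriving (Show, Eq)

-- | Insert a value at a path of key pieces, creating intermediate
-- tables as needed.
insertAt :: [String] -> Node -> Node -> Node
insertAt []       value _    = value
insertAt (k : ks) value node =
    Table (Map.insert k (insertAt ks value child) fields)
  where
    fields = case node of
        Table t  -> t
        Scalar _ -> Map.empty  -- toy model: a scalar in the way is overwritten
    child = Map.findWithDefault (Table Map.empty) k fields

-- | Build a document from (key path, value) pairs.
fromPairs :: [([String], String)] -> Node
fromPairs = foldr (\(path, v) -> insertAt path (Scalar v)) (Table Map.empty)

-- | Pairs written under a table header get the header as a key prefix.
underHeader :: [String] -> [(String, String)] -> [([String], String)]
underHeader header pairs = [(header ++ [k], v) | (k, v) <- pairs]

-- | The dotted-key spelling and the [server] spelling of the same data.
dottedForm, headerForm :: Node
dottedForm = fromPairs
    [ (["server", "port"], "8080"), (["server", "codes"], "[5, 10, 42]") ]
headerForm = fromPairs
    (underHeader ["server"] [("port", "8080"), ("codes", "[5, 10, 42]")])
```

In this model dottedForm == headerForm is True: once keys are split into pieces, both spellings insert the same paths into the same tree.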

However, the tomland example still sticks to explicitly naming the key/table sections:

settingsCodec :: TomlCodec Settings
settingsCodec = Settings
    <$> Toml.diwrap (Toml.int  "server.port")       .= settingsPort
    <*> Toml.arrayOf Toml._Int "server.codes"       .= settingsCodes
    <*> Toml.table mailCodec   "mail"               .= settingsMail

I wish I could write:

Toml.table serverCodec "server"

Another thing is that, for the case of configuring a program, one doesn't need the "write TOML" part, so the .= operator could be omitted. I'm not sure whether tomland can do that as well. The write direction is also probably the reason why you can't easily switch server.port / server.codes to Toml.table "server": you wouldn't have write . read = identity (though, semantically, the two would be equivalent).

CGenie commented 1 week ago

Compare the Go example: https://github.com/BurntSushi/toml/tree/v1.4.0/_example

They basically have:

type example struct {
...
        Servers    map[string]server
}
...
var config example
meta, err := toml.DecodeFile(f, &config)

(i.e. they declare Servers as a map from string to server type).

The TOML itself has:

[servers.alpha]
    # You can indent as you please, tabs or spaces.
    ip        = '10.0.0.1'
    hostname  = 'server1'
    enabled   = false

This would parse in the same way if they had servers.alpha = { ip, hostname, enabled } (compare with the distros key in the same Go example).

This doesn't seem to be the case for tomland.

CGenie commented 1 week ago

https://hackage.haskell.org/package/tomland-1.3.3.3/docs/src/Toml.Codec.Combinator.Common.html#match

I guess we could replace this with something more fancy than HashMap.lookup key (tomlPairs toml). :)

CGenie commented 1 week ago

From the docs: https://toml.io/en/v1.0.0#objectives

TOML aims to be a minimal configuration file format that's easy to read due to obvious semantics. TOML is designed to map unambiguously to a hash table. TOML should be easy to parse into data structures in a wide variety of languages.

CGenie commented 1 week ago

An example of what https://hackage.haskell.org/package/toml-parser does:

prettyToml $ head $ rights [parse "x = {a = 1}"]
--- x.a = 1

prettyToml $ head $ rights [parse "[x]\na = 1"]
--- x.a = 1

prettyToml $ head $ rights [parse "[x]\na = 1\nb = {c = 1}"]
{-
[x]
a = 1
b.c = 1
-}

So while the file contents are not preserved exactly, the structure itself is.