Closed abonander closed 3 months ago
Making the quoting significant makes this no longer TOML but a custom format that looks like TOML. As the use case is something we intentionally do not want to support, I think we'll pass on this feature.
That's disappointing, I was hoping for at least some back-and-forth.
It wouldn't have to be in `toml`, it could be in `toml_edit`, the whole point of which, from my perspective, is to treat all syntax as significant.
But you're the boss.
This could also be used, like the one in `serde_json`, to extract a subsection of a TOML file and re-serialize it verbatim. I could see that being very useful.
@epage could you at least please answer this question?
is Spanned support even part of toml/toml_edit's SemVer guarantees?
Yes, it is
Background
I'm creating a TOML config parser for SQLx to override various aspects of the library's operation, namely the query macros, as well as migrations. (https://github.com/launchbadge/sqlx/pull/3383)
One big need is to provide a way to override or augment the default SQL -> Rust type mappings in the macros, so I'm adding a section like this to the TOML format:
This is going to be most useful with PostgreSQL, which has a rich type system supporting user-created types (as well as a dynamically loaded extension system which can also add types).
Making it easier to use custom types is one of the main goals of this feature.
Type names in Postgres can be schema-qualified, and to support this in TOML, I'm just allowing type names to be arbitrary paths (using a type with a custom `Deserialize` implementation to read nested tables):

This would also make it easier to define multiple types for the same schema, as the user can just create a new table:
Problem
In Postgres, both the name of the type as well as the name of the schema can be quoted identifiers, which are case-sensitive whereas normal unquoted identifiers are not. Thus, it is important to distinguish between the two.
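To make the distinction concrete, here is a minimal sketch of how a consumer might classify a name once it has the text exactly as the user wrote it. `PgIdent` and `classify_ident` are hypothetical names used only for illustration, not part of SQLx or Postgres, and the ASCII-only case folding is a simplifying assumption.

```rust
/// How a type or schema name should be compared against the database catalog.
/// (Hypothetical type for illustration only.)
#[derive(Debug, PartialEq)]
enum PgIdent {
    /// Written without quotes: Postgres folds the name to lowercase.
    Unquoted(String),
    /// Written in double quotes: case is preserved exactly.
    Quoted(String),
}

/// Classify an identifier as the user typed it, quotes included.
fn classify_ident(raw: &str) -> PgIdent {
    match raw.strip_prefix('"').and_then(|s| s.strip_suffix('"')) {
        // Inside a quoted identifier, `""` is an escaped double quote.
        Some(inner) => PgIdent::Quoted(inner.replace("\"\"", "\"")),
        // Assumption: ASCII folding is enough for this sketch.
        None => PgIdent::Unquoted(raw.to_ascii_lowercase()),
    }
}
```

Under this sketch, `Foo` and `foo` classify as the same unquoted name, while `"Foo"` stays distinct. That is only possible if the quotes survive parsing.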
From my experience, I know with some certainty that a typical user's first instinct will be to just do this:
But this will not result in the expected behavior because `toml` will parse out the quotes, and SQLx won't know the difference.

This is an issue with or without a schema qualifier:
Proposed Solution
I think the most versatile solution would be to add a type analogous to `serde_json::value::RawValue`.

When parsing, I could use something like this to get the raw key, then decide whether to parse it or use it raw.
A simpler solution would be to add an analogue specifically for keys, maybe implementing `Deserialize` for `toml_edit::Key` itself. That would satisfy my use-case but little else.

I'm willing to put some work into a PR if we agree on a solution here.
Alternatives
Fix It in ~Post~ Documentation
This could easily be written off as a simple documentation issue on SQLx's part. If the key is wrapped in a second pair of quotes, SQLx will see the inner quotes:
This is what I'm going with for now so I'm not blocked on this proposal. However, it's not a satisfying solution on its own because there are always users who don't fully read documentation and just go with their gut, then open an issue when things don't work as expected. I'd prefer SQLx to be able to handle this intelligently.
Spanned
While reading the source, I did notice the support for `serde_spanned::Spanned`, which would theoretically allow me to implement this myself.

I could deserialize a key with `Spanned<IgnoredAny>`, then go back to the TOML string and get the raw key from the given span.

The problem is, how do I thread the TOML string that deep in the call tree?
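Leaving aside how the source string reaches that point, the span lookup itself is simple. A minimal sketch, assuming the span is a byte range into the original TOML text and that it covers the key as written, quotes included. `raw_key` and `is_quoted_key` are hypothetical helpers, not part of `toml` or `serde_spanned`.

```rust
use std::ops::Range;

/// Return the key exactly as written in the TOML source.
/// Yields `None` if the span is out of bounds or splits a UTF-8 character.
fn raw_key(source: &str, span: Range<usize>) -> Option<&str> {
    source.get(span)
}

/// True if the raw key was written as a TOML quoted key
/// (basic `"..."` or literal `'...'`).
fn is_quoted_key(raw: &str) -> bool {
    raw.len() >= 2
        && ((raw.starts_with('"') && raw.ends_with('"'))
            || (raw.starts_with('\'') && raw.ends_with('\'')))
}
```

The lookup only works where both the span and the full source text are available, which is exactly the threading problem.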
`DeserializeSeed`? That would mean replacing `#[derive(serde::Deserialize)]` with manual implementations, which would make it more cumbersome for maintenance and future additions.

Also, is `Spanned` support even part of `toml`/`toml_edit`'s SemVer guarantees?

Deserialize using the parsed TOML structure
This is a non-starter for the same reasons `DeserializeSeed` is.

Lint the TOML or Two-Pass Deserialization
These are basically the same thing and have mostly the same problems.
Alongside the documentation alternative, I could lint for improperly quoted names in the TOML by parsing to a `toml_edit::DocumentMut`, inspecting the structure, and then deserializing the config structure if I don't find any problems.

Alternatively, I could parse to a `toml_edit::DocumentMut`, deserialize the config structure, then traverse the TOML structure and go back and fix up quoted names.

Either way, it requires duplicating some of the work that deserialization is doing and having two different places in the code where semantics are defined, which is not great for maintainability.