Closed csharpfritz closed 3 years ago
Personally, would expect this to be an error. If the intention is to have value1, value2, and value3 as an array of values for the 'key1' key then I would expect to see them equally aligned indented deeper than key1.
As I see it this is a question of do we allow these examples to be the same:
root
value1
value2
root
value1
value2
root
value1
value2
If we don't allow that your example should be an error. If that is not an error than it is very complicated and difficult to parse and grasp what value is just an entry in an array and what is just a simple value in the root.
As @Stelzi79 shows, I would expect ONLY the first example to be valid. What benefit would there be for indented values further, such as 'value2' in the middle example? Example 3 is just an exaggerated version of the middle example.
I had some more thoughts.
IMO the issues, why this example document is confusing is that there is
key1
and then the value of that key is value1
.Not allowing inconsistent indentation would make this document an invalid one. If we allow that the only meaningful interpretation of it would be that it is root containing 3 values
Everything else would mean some special and magic parsing rules that might depend on linguistic context for keys where they have different meaning when they are named a particular way.
Just been trying to convert a github action YAML File to TAML and come across the following:
steps:
- uses: actions/checkout@v2
- name: Build
run: cargo build --verbose
- name: Run tests
run: cargo test --verbose
Where steps is an array of arrays and can't seem to work out how you would represent this in TAML. Would something likethe following work:
steps
uses: actions/checkout@v2
name: Build
run: cargo build --verbose
name: Run tests
run: cargo test --verbose
However, I think that intrducing block scopes like this could be a confusing move. The following suggestion makes use of #
as a delimiting character and may not be in the TAML spirit:
steps
#
uses: actions/checkout@v2
#
name: Build
run: cargo build --verbose
#
name: Run tests
run: cargo test --verbose
Well let me try:
You have steps
that uses
the actions/checkout@v2
thng (container, image, etc.). First there is the Build
step which has to run
the cargo build --verbose
command. Next you Run tests
that is run
by the cargo test --verbose
command.
steps
uses actions/checkout@v2
Build
run cargo build --verbose
Run tests
run cargo test --verbose
steps
uses actions/checkout@v2
1
name Build
run cargo build --verbose
2
name Run tests
run cargo test --verbose
steps
uses actions/checkout@v2
name Build
run cargo build --verbose
name Run tests
run cargo test --verbose
Hm seems a bit tricky (you loose the name thing) and just shows how mangled and unconforming Yaml can be.
I concur that the block scopes approach is not really fitting to TAML.
IMO in YAML you can do so many unexpected things that is highly sensitive to what parses and consumes it, there might never be a 1:1 converter for YAML but only approximations that the converter has to couple very tightly to the thing that uses this particular YAML. So my first approach would make it more verbose than the YAML with almost the same parsing cost for the thing that would consume such a TAML thing.
A 100% 1:1 conversion only would be possible if we allow that TamlKeyValuePair
s can have child elements like TamlArray
and all the others. I am not sure if we should want this because my first example is pretty self explanatory.
The thing that bothers me here is that we would face the same issue with JSON. Consider the following:
{
"steps": [
{
"uses": "actions/checkout@v2"
},
{
"name": "Build",
"run": "cargo build --verbose"
},
{
"name": "Run tests",
"run": "cargo test --verbose"
}
]
}
We once again see an array of objects, or an array of arrays of key-value pairs. This is something that I've seen a few times and not just in a github action. Am I reading that the conversion of this or the above YAML to TAML would not round-trip and if so is that going to be an issue given that the specific examlple that I'm working with is one of the examples asked for in #44? While I agree that the alternative TAML that @Stelzi79 proposes in their first example would be cleaner and superior to the current YAML I'm not convinced that we're in a position to make something that doesn't map to YAML/JSON if we're aiming for adoption.
I can see these question that we have to answer:
TamlKeyValuePair
s where the key is null
/nothing and the value is a TamlArray
and how do we notate that case?-
or just one more indentation?SchemaDefinition
s and ParserDirective
s that would tell a converter in a standardized way how it could convert the TAML seamlessly into a YAML file?I had similar thought when creating the python parser, I think we could add anonymous key (with _ ?) in list:
steps
uses actions/checkout@v2
_ name Build
run cargo build --verbose
_ name Run tests
run cargo test --verbose
I suppose that the above code in TAML would instead be:
steps
_
uses actions/checkout@v2
_
name Build
run cargo build --verbose
_
name Run tests
run cargo test --verbose
yes or :
steps
_ uses actions/checkout@v2
_ name Build
run cargo build --verbose
_ name Run tests
run cargo test --verbose
I think it should yield to the same results no?
In discussions on Sept 22, 2020 stream, we believe we can convert this workflow.yml:
steps:
- uses: actions/checkout@v2
- name: Build
run: cargo build --verbose
with:
dotnet-version: 3.1.101
- name: Run tests
run: cargo test --verbose
to the following TAML:
steps
1
uses actions/checkout@v2
2
name Build
run cargo build --verbose
with
dotnet-version 3.1.101
3
name Run tests
run cargo test --verbose
You could use this pseudo-code to navigate the collection:
steps[0].uses
steps[1].name == "Build"
steps["2"][0].Value == "Build"
What would you do if you wanted to insert an element between 1 and 2? In this example it wouldn't be an issue, however in a larger example I could see the ordering going out of sync easily and causing issues. Even here steps["2"] === steps[1]
which feels rather odd to me.
Sure @garethhubball - you could change the names of the introduced objects from 1,2,3 to "foo","bar","baz" appropriately. I would suggest that in this conversion case, the name
kvp should persist, but also copy it to the new level.
steps
uses
uses actions/checkout@v2
Build
name Build
run cargo build --verbose
with
dotnet-version 3.1.101
Run tests
name Run tests
run cargo test --verbose
I think for a manual case these are all great ideas, the issue that I'm seeing from an automatic conversion viewpoint is would something like:
steps
1
uses actions/checkout@v2
2
name Build
run cargo build --verbose
with
dotnet-version 3.1.101
3
name Run tests
run cargo test --verbose
with no compiler directives convert to the YAML:
steps:
- uses: actions/checkout@v2
- name: Build
run: cargo build --verbose
with:
dotnet-version: 3.1.101
- name: Run tests
run: cargo test --verbose
or to:
steps:
1:
uses: actions/checkout@v2
2:
name: Build
run: cargo build --verbose
with:
dotnet-version: 3.1.101
3:
name: Run tests
run: cargo test --verbose
Or is ths something that must be implementation-defined and therefore we have to drop support for a CLI conversion tool that works both ways as suggested in #12 and others as the conversion to TAML is lossy?
Let me remind you of these definitions:
JSON - JavaScript Object Notation: JavaScript Object Notation is an open standard file format, and data interchange format, that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and array data types (or any other serializable value).
YAML: YAML (a recursive acronym for "YAML Ain't Markup Language") is a human-readable data-serialization language. It is commonly used for configuration files and in applications where data is being stored or transmitted.
TAML is IMO a dedicated markup language for configuration and doesn't have the express purpose of being able to serialize every possible object there might be out there.
@garethhubball my understanding is that yaml and alike are always implementation-defined in that they only map correctly to the "thing" (GitHub workflows here in this example) they are used for. So the converter has to know what and how to make a correctly formatted yaml file for that "thing".
I also could imagine that @garethhubball last two yaml examples conversions are both valid ones. The first one is only valid for GitHub workflows and the second one would be perhaps for a completely different "The Thing X" valid.
//#TAML.Schema github_workflows.schema.taml
steps
1
uses actions/checkout@v2
2
name Build
run cargo build --verbose
with
dotnet-version 3.1.101
3
name Run tests
run cargo test --verbose
In this the schema would define a way for a converter to convert it to yaml file that is valid for GitHub Workflows
//#TAML.Schema the_thing_x.schema.taml
steps
1
uses actions/checkout@v2
2
name Build
run cargo build --verbose
with
dotnet-version 3.1.101
3
name Run tests
run cargo test --verbose
The schema in this case would omit the special conversion notation completely and would work "out of the box" with "The Thing X".
What I'm reading here is that interoperability with JSON and YAML aren't seen as major priorities here and can be sacrificed in order to make things more simple to read? This is something that I like and as such I think I've been asking the wrong questions and tackling this the wrong way.
Instead of 1:1 conversions that would work every time are we looking to create a TAML configuration file such that (for example) a Github Actions YAML file could be created from it using the objects that TAML parses into and a tool that understands that object and can use it to generate the specific YAML for Github Actions?
Should this be the case then I feel that we have what we need in TAML with no further extensions as we could do something like:
steps
uses actions/checkout@v2
Build
run cargo build --verbose
with
dotnet-version 3.1.101
Run tests
run cargo test --verbose
Which is something that has all the same meanings, for me it reads better than the other options and I could write something to convert the TAML parser output into something that a YAML generator could use to create the Github Actions specific YAML without too much trouble.
I would favor a workflow that basically states that first consider to make make it as clear as possible of a configuration Taml
file and then put specially needed conversion logic into the schema.
So in these examples it should this used
//#TAML.Schema github_workflows.schema.taml
steps
uses actions/checkout@v2
Build
run cargo build --verbose
with
dotnet-version 3.1.101
Run tests
run cargo test --verbose
and
//#TAML.Schema the_thing_x.schema.taml
steps
uses actions/checkout@v2
Build
run cargo build --verbose
with
dotnet-version 3.1.101
Run tests
run cargo test --verbose
while "The Thing X" would preferably use Taml. If these can't be converted 1:1 to Json or Yaml the schema tells a converter how to convert it correctly.
Ahhh, now I understand, so the first line that is "commented out" is a directive that points to a TAML file that defines the conversions? I see the benefits to that in the future without a doubt but I'm thinking that might add complexity that we don't need to add yet and the implementation-defined route might be simpler for now.
I'm getting some callbacks to XSDs and XSLT here but I suspect that's not what you mean and it's something far more simple and elegant than those.
Yes that is exactly my opinion and not only do I see the schema file for conversion but also for some fancy things like checking validity and hinting in such things like IntelliSense.
For now I see the implementation-defined converter as the simplest thing to do for now
^^ THIS ^^
That sells me on the concepts and need for schema. Let's put schemas out there on a second milestone after the initial parser features.
To resolve this discussion, we have decided that inconsistent indenting will create an error scenario. This decision needs to make its way into a specification document.
Additionally, the discussion around translation of documents from JSON, XML, or YAML should make it into a specification as well, indicating that we are not seeking 100% compatibility and that some information will be lost in converting to and from those formats.
How should this document be parsed:
Should this be:
root: [ (key1: value1), value2, value3 ]
root: [ key1: [value1, value2], value3 ]
root: [key1: [value1, [value2]], value3]