EliorMachlev closed this issue 2 months ago
I understand there is a valid reason for the addition; I believe it is related to this fix: https://github.com/cloudbase/powershell-yaml/issues/38
But we should have the option to opt out of this.
Current workaround for those who need it: put a marker string such as "TOBEREMOVED" before the leading double quote and, after conversion, remove it from the output (-replace 'TOBEREMOVED','')
Or as a one-liner:
$YourDataVariable = 'TOBEREMOVED"Hey'
Set-Content -Path $YamlPath -Value ((ConvertTo-Yaml -Data $YourDataVariable -Force) -replace 'TOBEREMOVED','')
Result:
'Hey
Hi!
In PowerShell terms, the following is a literal string:
$myVal = '"Hi'
When converting to YAML, we need to enclose it in single quotes. It contains a single double-quote character, and if we don't enclose it, we generate invalid YAML.
The following is also a literal string:
$myVal = '"hi"'
The quotes become part of the string. In this case, if we did not enclose it in single quotes, we would have valid YAML, but it would be incorrect: a parser would read it as a double-quoted scalar and drop the quotes. The variable in PowerShell contains a string which has quotes as part of the string. For example:
PS /home/gabriel> $a = '"Hei"'
PS /home/gabriel> $b = 'Hei'
PS /home/gabriel> $a -eq $b
False
Converting that string to YAML gives:
PS /home/gabriel> $YourDataVariable = '"Hey"'
PS /home/gabriel> ConvertTo-Yaml -Data $YourDataVariable
'"Hey"'
The same thing happens in Python:
>>> import yaml
>>> data = '"hi"'
>>> print(yaml.dump(data))
'"hi"'
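The quoting rule described above can be sketched in plain Python. This is only an illustration, not the actual logic of powershell-yaml or PyYAML: the function names `single_quote` and `emit_scalar` are hypothetical, and the "quote only when the value starts with a quote character" rule is a deliberate simplification of what real emitters check.

```python
def single_quote(value: str) -> str:
    """Wrap value in YAML single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def emit_scalar(value: str) -> str:
    """Return a YAML scalar for value (simplified, hypothetical rule)."""
    # A plain (unquoted) scalar must not begin with a quote character,
    # because the parser would treat it as the start of a quoted scalar.
    if value[:1] in ('"', "'"):
        return single_quote(value)
    return value


print(emit_scalar('"Hi'))   # '"Hi'
print(emit_scalar('"hi"'))  # '"hi"'
print(emit_scalar('Hey'))   # Hey
```

The single-quoted style is a natural fit here because its only escape is a doubled single quote, so a double quote inside it needs no special handling.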
If I use your example:
$YamlPath = "/tmp/test.yaml"
$YourDataVariable = 'TOBEREMOVED"Hey'
Set-Content -Path $YamlPath -Value ((ConvertTo-Yaml -Data $YourDataVariable -Force) -replace 'TOBEREMOVED','')
I end up with a file containing invalid YAML, which no other parser can load:
>>> a = open("/tmp/test.yaml")
>>> yaml.safe_load(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3/dist-packages/yaml/__init__.py", line 125, in safe_load
return load(stream, SafeLoader)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/yaml/__init__.py", line 81, in load
return loader.get_single_data()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/yaml/constructor.py", line 49, in get_single_data
node = self.get_single_node()
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/yaml/composer.py", line 35, in get_single_node
if not self.check_event(StreamEndEvent):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/yaml/parser.py", line 98, in check_event
self.current_event = self.state()
^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/yaml/parser.py", line 142, in parse_implicit_document_start
if not self.check_token(DirectiveToken, DocumentStartToken,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/yaml/scanner.py", line 116, in check_token
self.fetch_more_tokens()
File "/usr/lib/python3/dist-packages/yaml/scanner.py", line 251, in fetch_more_tokens
return self.fetch_double()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/yaml/scanner.py", line 655, in fetch_double
self.fetch_flow_scalar(style='"')
File "/usr/lib/python3/dist-packages/yaml/scanner.py", line 666, in fetch_flow_scalar
self.tokens.append(self.scan_flow_scalar(style))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/yaml/scanner.py", line 1151, in scan_flow_scalar
chunks.extend(self.scan_flow_scalar_spaces(double, start_mark))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/yaml/scanner.py", line 1238, in scan_flow_scalar_spaces
raise ScannerError("while scanning a quoted scalar", start_mark,
yaml.scanner.ScannerError: while scanning a quoted scalar
in "/tmp/test.yaml", line 1, column 1
found unexpected end of stream
in "/tmp/test.yaml", line 3, column 1
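The failure above can be reproduced without a YAML library. The sketch below uses the standard `json` module purely as a stand-in, on the assumption that a JSON string and a YAML double-quoted scalar share the same basic rule: an opening double quote must be matched by a closing one. The variable names and the helper `parses_as_quoted_string` are illustrative only.

```python
import json

marker = "TOBEREMOVED"
# Assumption: this is the text the workaround starts from before the marker is stripped.
emitted = 'TOBEREMOVED"Hey'
after_replace = emitted.replace(marker, "")  # leaves an opening quote with no close


def parses_as_quoted_string(text: str) -> bool:
    """Return True if text is a complete double-quoted string."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(after_replace)                           # "Hey
print(parses_as_quoted_string(after_replace))  # False
print(parses_as_quoted_string('"Hey"'))        # True
```

Once the marker is gone, the value begins with a double quote, so a parser starts reading a quoted scalar and reaches the end of the input before finding the closing quote.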
If I just do:
$YamlPath = "/tmp/test.yaml"
$YourDataVariable = '"Hey'
ConvertTo-Yaml -Data $YourDataVariable -Force -OutFile $YamlPath
This results in valid YAML:
gabriel@arrakis:~$ cat /tmp/test.yaml
'"Hey'
Which can be loaded in other parsers:
>>> a = open("/tmp/test.yaml")
>>> yaml.safe_load(a)
'"Hey'
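As a rough illustration of why the single-quoted form preserves the value, here is a minimal round trip in plain Python. The helpers `single_quote` and `decode_single_quoted` are hypothetical names, and the sketch only covers a single-line, single-quoted scalar, not the full YAML grammar.

```python
def single_quote(value: str) -> str:
    """Encode value as a YAML single-quoted scalar."""
    return "'" + value.replace("'", "''") + "'"


def decode_single_quoted(scalar: str) -> str:
    """Decode a single-line YAML single-quoted scalar back to its value."""
    if len(scalar) < 2 or scalar[0] != "'" or scalar[-1] != "'":
        raise ValueError("not a single-quoted scalar")
    # Inside single quotes the only escape is a doubled single quote.
    return scalar[1:-1].replace("''", "'")


original = '"Hey'
encoded = single_quote(original)
decoded = decode_single_quoted(encoded)

print(encoded)              # '"Hey'
print(decoded)              # "Hey
print(decoded == original)  # True
```

The outer single quotes are syntax and disappear on load, while the double quote inside them is data and survives unchanged.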
But there is a chance that I have not understood the issue here. Would you mind adding a complete yaml sample and code you used to try to convert it?
@gabriel-samfira Hi, yes, you understood the issue correctly. "Hey", "Hey, 'Hey and 'Hey' are all being wrapped in extra quotes, such as '"Hey"'.
I think we should have the option to opt out of validation and just save as-is. Basically, putting the responsibility on the user/developer.
With NewRelic it works like this:
'"Hey"' will be read as "Hey" (just like in PowerShell, it is read as a literal string).
"Hey" will be read as "Hey" (again, just like in PowerShell, the quotes will be removed and only the content will remain).
I think we need to set some expectations.
If a string contains quotes as part of that string, no parser would (or should) ever strip them away before serializing to YAML. This is important because if the quotes exist within the string, they may have a purpose that a parser cannot make assumptions about.
For example:
$literalQuotes = '"Hey"'
PS /home/gabriel> $literalQuotes.Length
5
Is a very different string than:
$justQuotes = "Hei"
PS /home/gabriel> $justQuotes.Length
3
And is different from:
# This one can't be serialized to YAML without single quotes,
# otherwise, the generated yaml will be invalid and cannot be
# imported into any yaml parser.
$aSingleQuote = '"Hey'
$aSingleQuote.Length
4
A YAML parser will never strip those quotes away. If it does, then you probably shouldn't use that parser. The better approach would be, as you suggested, to leave the option of stripping away those quotes to the author of the application. This can easily be done before you send the object to be serialized to YAML:
PS /home/gabriel> ConvertTo-Yaml $literalQuotes.TrimStart('"').TrimEnd('"')
Hey
But if you do this, you need to take into account that the generated YAML will contain a different value than the original string that you had. Your application, and whatever you integrate with, will need to accept the mutated data:
PS /home/gabriel> $imported = ConvertTo-Yaml $literalQuotes.TrimStart('"').TrimEnd('"') | ConvertFrom-yaml
PS /home/gabriel> $imported -eq $literalQuotes
False
PS /home/gabriel> $imported
Hey
PS /home/gabriel> $literalQuotes
"Hey"
From a programming perspective, in the $literalQuotes example, the actual quote character " is no different than the alphanumeric characters.
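The same point can be shown in Python, where `str.strip('"')` plays roughly the role of `.TrimStart('"').TrimEnd('"')`. This is only an analogue of the PowerShell session above, with hypothetical variable names.

```python
literal_quotes = '"Hey"'
trimmed = literal_quotes.strip('"')

# Trimming produces a different, shorter string.
print(len(literal_quotes), len(trimmed))  # 5 3
print(trimmed == literal_quotes)          # False

# Each quote is just another code point in the string.
print([hex(ord(ch)) for ch in literal_quotes])
# ['0x22', '0x48', '0x65', '0x79', '0x22']
```

Any consumer that compares against the original value will see the trimmed string as a mismatch.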
I have a YAML file used by a popular piece of software (NewRelic).
When converting (ConvertTo-Yaml), if the node value starts with a double quote ("), it automatically wraps the value in opening and closing single quotes.
Example:
Value: "Hey"
After conversion: '"Hey"'
Expected: have the option to keep values as is