Open jonasfj opened 1 week ago
@kekavc24 I'm not sure this is a new bug caused by https://github.com/dart-lang/yaml_edit/pull/87 but it's something we should fix.
At a minimum we should fallback to double quoted string encoded, if nothing else.
Indeed, I'd suggest the following steps:
_tryYamlEncodeLiteral
and _tryYamlEncodeFolded
for strings that end in \n \n
and similar.
if (string.trimRight() != string) return null
I think (1) is the most important. if (string.trimRight() != string) return null
is a big hammer, and maybe we can do better, but it's a good start, and it's better to safe than wrong :rofl:
Btw, I can be found on dart_community discord, I'm jonasfj, feel free to ping me.
Adding the test case
'whitespace between line breaks at the end\n \n'
totest/string_test.dart
causes tests to fail.diff --git a/test/string_test.dart b/test/string_test.dart index b92c20a..a18f57a 100644 --- a/test/string_test.dart +++ b/test/string_test.dart @@ -10,6 +10,7 @@ final _testStrings = [ "this is a fairly' long string with\nline breaks", 'whitespace\n after line breaks', 'whitespace\n \nbetween line breaks', + 'whitespace between line breaks at the end\n \n', '\n line break at the start', 'word', 'foo bar',
Tests failures:
$ dart test 00:01 +400 -1: loading test/golden_test.dart Successfully tested 20 inputs against golden files, created 0 golden files 00:01 +400 -1: test/string_test.dart: Root FOLDED string (4) [E] Assertion failed: (package:yaml_edit) Modification did not result in expected result. # YAML before edit: > # YAML after edit: > >- > whitespace between line breaks at the end > > Please file an issue at: https://github.com/dart-lang/yaml_edit/issues/new?labels=bug package:yaml_edit/src/editor.dart 579:7 YamlEditor._performEdit package:yaml_edit/src/editor.dart 249:14 YamlEditor.update test/string_test.dart 39:20 main.<fn> To run this test again: dart test test/string_test.dart -p vm --plain-name 'Root FOLDED string (4)' 00:01 +415 -2: test/string_test.dart: Root LITERAL string (4) [E] Assertion failed: (package:yaml_edit) Modification did not result in expected result. # YAML before edit: > # YAML after edit: > |- > whitespace between line breaks at the end > > Please file an issue at: https://github.com/dart-lang/yaml_edit/issues/new?labels=bug package:yaml_edit/src/editor.dart 579:7 YamlEditor._performEdit package:yaml_edit/src/editor.dart 249:14 YamlEditor.update test/string_test.dart 39:20 main.<fn> To run this test again: dart test test/string_test.dart -p vm --plain-name 'Root LITERAL string (4)' 00:10 +624 -2: Some tests failed. Consider enabling the flag chain-stack-traces to receive more detailed exceptions. For example, 'dart test --chain-stack-traces'.
Sorry for this. It seems I overwrote it as I was resolving some merge conflicts I encountered while rebasing.
I wanted to drop the chomping indicator changes I made in between for #87 so that the squashed commits make sense when merged.
I’ll work on it 🫡
I have a branch with the changes you suggested, I'll make a PR.
@jonasfj Additionally, I also managed to come up with a PoC that solves this issue permanently (hopefully) but comes with some breaking but solvable trade offs in another branch.
After we wrongly tried using keep chomping indicator
in #87, it got me wondering why it existed in YamlEditor
in the first place. What issue was it trying to solve before my fix in #87?
The investigation led me back here 😅 , to this very issue we are trying to solve. The chomping indicator was meant to preserve the trailing \n
in the initially buggy _tryYamlEncodeFolded
and _tryYamlEncodeLiteral
with hits and misses before my fix. And now it won't work. How come and why?
I investigated yamlEncodeBlockScalar
which calls _tryYamlEncodeFolded
and any other function that encodes block scalars. It looked okay. You helped me refactor it. The culprit ended up being yamlEncodeBlock
which calls yamlEncodeBlockScalar
.
If a YamlNode
is a :
YamlScalar
, it calls yamlEncodeBlockScalar
and returns the encoded string.YamlList
or YamlMap
, it recursively calls yamlEncodeBlock
until all YamlScalar
s are encoded and joins then with a \n
and returns the string.No issue, right? Well, that's where the issue is.
It means our YamlScalar
will never be given the chance to have an additional \n
we need for _tryYamlEncodeFolded
and _tryYamlEncodeLiteral
to prevent issues.
Okay, issue found. Just apply the additional \n
we need. I did this and quickly ended up having duplicate \n
or even \n
x where x
is the number of YamlList
or YamlMap
are encountered before the next YamlScalar
is found.
So I refactored that function to apply line-breaks correctly. After this 2 issues bubbled up:
YamlScalar
s that aren't folded/stripped. (I expected this but not 1 above 😅)Once this occurred, I backed off to see if this was a known issue that needed fixing. That's when I found #40. In the issue, YOU outline ways we can/should handle this which ended up being what I had in mind. Thanks! (for the incredible foresight)
However, a few issues you overlooked that ended up being the trade-offs I mentioned:
lazily
- Should we just retain the white space we skipped after scanning ahead for comments and not finding any?greedily
- Discard any white space we find ahead irrespective of whether we found comments or not.I ended up looking for and skipping comments greedily. At least to have a PoC that works. Armed with this, I continued my investigation started in issue one above.
I noticed that both _replaceInBlockMap
and updateInList
only look ahead up-to the first comment. While refactoring code here, I realised:
yamlEncodeBlock
intentionally leaves out the trailing line-break we needed in issue one above and applies it later but only in updateInList
and never for top-level scalars or _replaceInBlockMap
. A bug which we can't solve here. Only at the yamlEncodeBlock
levelYamlList
or YamlMap
being mutated.At this point, the issues I had were:
I implemented the function that does that and also added some tricks when adding folded/literal strings.
It would be nice if you are able to review it and make suggestions for improvements. I can make a draft PR.
@jonasfj sorry for the extremely long write up. I didn't want to create a PR without outlining the PoC itself since I make so many changes to the existing code without redefining its intention that may need your review.
Sorry for this. It seems I overwrote it as I was resolving some merge conflicts I encountered while rebasing.
Yeah, the PR was getting pretty complicated, we should probably have done fewer things. No worries, we can still fix bugs now :D
Also it's not like there weren't any bugs before this PR :see_no_evil: :rofl:
Okay, this might take a while to review.
Thinking about it, I do wonder, is it really a good idea to use folded or literal strings without chomping -
?
Certainly the +
modify is a but dangerous, in scenarios like:
Example 1
A: |+
foo
B: true
B
could change the value of A
if we're not careful.Example 2
A: |+
foo
B: true
, could change the value of A
if we're not careful.I guess what I'm asking is? Instead of trying to handle cases like these correctly, would it be better if we just fallback to use "foo\n\n"
.
In practice I suspect it's rare that yaml_edit
will be asked to deal with these things. So it might be better if we have a package that can do less, but do it correctly every time.
If we have to fallback to double quoted encoding, that's not nearly as bad as throwing a runtime exception because of an internal error, which is what will happen if we modify the YAML document incorrectly (_performEdit
makes sure of that).
I'm just asking the question here:
yaml_edit
to do?yaml_edit
will be asked to add literal or folded strings often?yaml_edit
figure out that trimming the strings make yaml_edit
produce nicer looking output)I guess what I'm asking is? Instead of trying to handle cases like these correctly, would it be better if we just fallback to use
"foo\n\n"
.
True. That's why I still fielded #91 that should precede #90. #90 is just a PoC that I believe can get better once you take a look and help me nitpick the rough edges.
Certainly the
+
modify is a but dangerous, in scenarios like:
That's why I never attempt to use it at any point within the PoC. I discovered a nifty trick that I strongly highlighted and documented. I was surprised it passes most existing tests including random_test.dart
since the PoC ensures no dangling comments are left behind.
In practice I suspect it's rare that
yaml_edit
will be asked to deal with these things. So it might be better if we have a package that can do less, but do it correctly every time.
Yeah, that makes sense
Adding the test case
'whitespace between line breaks at the end\n \n'
totest/string_test.dart
causes tests to fail.Tests failures: