chmln / sd

Intuitive find & replace CLI (sed alternative)
MIT License

`\s+$` matches newline at the end of the file and newlines on empty lines #291

Open kbvw opened 6 months ago

kbvw commented 6 months ago

From Example 2 in the quick guide ("let's trim some trailing whitespace"), `sd '\s+$' ''` seems to eat the newline at the end of the file, and newlines on empty lines, but not the newlines on the other lines.

Example: `echo -en "test\n\ntest\n" | sd '\s+$' ''` results in `"test\ntest"`. Just wondering if this is intended behavior? It strips the whitespace alright, but I did not expect it to touch those newlines.

The equivalent sed command behaves as I expected: `echo -en "test\n\ntest\n" | sed -E 's/\s+$//g'` leaves this input unchanged.
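
For readers who want to reproduce the difference without either tool, here is a minimal sketch in Python. It is an analogue, not sd's or sed's actual implementation: it assumes sd applies the pattern to the whole input with `$` allowed to match before any newline, and it models sed by stripping each line's newline before matching and restoring it afterwards.

```python
import re

text = "test\n\ntest\n"

# Whole-buffer mode (assumed model of sd): one pass over the entire input,
# with `$` matching before any newline and at the end of the text.
whole = re.sub(r"\s+$", "", text, flags=re.MULTILINE)

# Line-by-line mode (assumed model of sed): the newline is split off before
# matching and added back afterwards, so the pattern never sees it.
per_line = "".join(re.sub(r"\s+$", "", line) + "\n" for line in text.splitlines())

print(repr(whole))     # 'test\ntest'
print(repr(per_line))  # 'test\n\ntest\n'
```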

dev-ardi commented 6 months ago

While a trailing newline is the POSIX standard, sd isn't POSIX compliant, and that's not a goal of the project afaik.

You tell it to replace whitespace and it does just that.

kbvw commented 6 months ago

Sure, matching the final newline makes sense. What is unexpected to me is that it seems to match them in some cases only: on empty lines, yes, on lines with non-whitespace characters, no, unless that line is the final line, then yes again. Is there some other logic that I'm overlooking?

dev-ardi commented 6 months ago

This is expected because sd doesn't work in a line-by-line mode; the answer lies in the regex.

The regex means "at least one whitespace character followed by a newline": a lone `\n` doesn't match, a space followed by `\n` matches, and `\n\n` matches. This would never match in sed because sed works on a line-by-line basis.
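
The three cases can be checked by printing match spans. This is a sketch using Python's `re` in multi-line mode as a stand-in for sd's regex engine, which is an assumption. The helper name `spans` and the sample strings are made up for illustration.

```python
import re

# Multi-line mode: `$` matches before a newline or at the end of the text.
PATTERN = re.compile(r"\s+$", flags=re.MULTILINE)

def spans(text):
    """Return the (start, end) offsets of every match in text."""
    return [m.span() for m in PATTERN.finditer(text)]

print(spans("a\nb"))    # []        lone newline followed by text: no match
print(spans("a \nb"))   # [(1, 2)]  only the space is matched
print(spans("a\n\nb"))  # [(1, 2)]  the first newline is matched, the second stays
print(spans("a\n"))     # [(1, 2)]  the final newline sits before the end of the text
```

Note that `$` is zero-width in this model: the newline that follows a match is not consumed, so what gets removed is the whitespace before it, which on an empty line is the preceding newline.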

kbvw commented 6 months ago

Ah, indeed I didn't realize the pattern can span multiple lines. And then I guess `$` matches both newline characters and the end of the file, hence `\s+$` also matches the final newline. That's clear, thanks!

Andrew15-5 commented 6 months ago

Is there a way to enable line-by-line mode? Normally, you would enable multiline mode with the `m` flag. I'm not sure if there is a reverse way. I want `\s` to match all whitespace except LF, and having to write `[\r\t\f\v ]` instead of `\s` is too much for me.

dev-ardi commented 6 months ago

Currently no, but it's planned.
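
Until such a mode exists, one possible way to avoid spelling out the character class is the double negation `[^\S\n]`, which reads as "any whitespace except LF". The sketch below demonstrates it with Python's `re`. Whether sd accepts the same pattern is an assumption that has not been verified here, and `trim_trailing` is a hypothetical helper name.

```python
import re

# [^\S\n] = "not a non-whitespace character and not a newline",
# i.e. any whitespace character except LF.
TRAILING_BLANKS = re.compile(r"[^\S\n]+$", flags=re.MULTILINE)

def trim_trailing(text: str) -> str:
    """Strip trailing whitespace on every line while keeping all newlines."""
    return TRAILING_BLANKS.sub("", text)

print(repr(trim_trailing("test  \n\t\ntest \n")))  # 'test\n\ntest\n'
```

Because the class still includes `\r`, this sketch would also strip the carriage return from CRLF line endings.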