Closed marcospb19 closed 3 years ago
Nowadays almost all editors supports windows like line-ending, so I think if a file has it, and it is followed by a "\n" it should be have the same behavior as "\n"
There is something specifying that in the -s
should consider only ASCII in the specification? If there isn't I think it should handle UTF-8 gracefully, otherwise, it would be a cool extension a new flag that handles UTF-8
I didn't had the time to look a the code yet, but I'll check as soon as possible
@GrayJack, I had some time to work in fixing -s and implementing -f similar the GNU
implementation of uniq
.
This means that "\n" is different than "\r\n", and -s|skip-chars=N
skips bytes instead (a char from the C language).
I'm sorry I don't have the time to change this :/ so I suggest that you could create an issue for each one if you want to.
bors r+
Build succeeded:
There are some decisions I have to ask you first before completing and asking for a complete review:
uniq
ignore the trailing'\n'
at the end of the lines, but does not ignore'\r'
, what is the desired behavior for me to implement? (I know that this project is only meant to be used in Windows, but \r\n files are still possible).-s
flag, commonly named--skip-chars
, does not skip UTF-8 chars, but bytes instead (other implementations), see this example:Uniq will display the 3 lines received.
The problem when comparing different slices of bytes is that:
Throws a runtime panic, because of the multi-byte char
'ã'
.Should I just work this around by using
.as_bytes()
instead of string slices?About
-s
previous errors, the output wasn't showing the full line that first matched with the "pattern". I'll fix tests soon.