Closed (Sveder closed 2 years ago)
The reason why sort exists is historical: there was no shellpipe filter when sort was introduced.
The reason why your uniq with shellpipe didn't work is most likely that you didn't sort the input first (uniq in Unix only removes adjacent duplicate lines, so it requires the lines to be sorted).
So for line-based uniqueness, I think it's fine to just sort first (try shellpipe: 'sort -u' or shellpipe: 'sort | uniq').
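To illustrate the difference, here is a minimal Python sketch of the two behaviors. It only mimics the semantics and is not how the Unix tools or urlwatch are implemented; the function names are hypothetical, and a real sort -u orders lines according to the locale, which this sketch ignores.

```python
from itertools import groupby


def uniq_adjacent(lines):
    """Mimic Unix uniq: collapse only runs of adjacent identical lines."""
    # groupby starts a new group whenever the value changes, so a line that
    # reappears later (non-adjacent) is kept again.
    return [key for key, _group in groupby(lines)]


def sort_unique(lines):
    """Mimic sort -u: sort the lines and keep one copy of each."""
    return sorted(set(lines))


lines = ["b", "a", "b", "a", "a"]
print(uniq_adjacent(lines))  # ['b', 'a', 'b', 'a']
print(sort_unique(lines))    # ['a', 'b']
```

On unsorted input, the uniq-style function leaves the non-adjacent duplicates in place, which matches the behavior seen with shellpipe: uniq on its own.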
What might be interesting is a "unique / sort" filter that works on e.g. CSS or XPath selectors, as this is hard to do with built-in Unix commands.
Feel free to turn this PR into a documentation change that shows how to use shellpipe with sort -u or sort | uniq to filter duplicate lines.
@thp thanks for answering. You might be right that I didn't play with shellpipe: uniq enough, and indeed, now that I've read the uniq docs, it is not what I wanted, as I don't want to sort the data.
In this case, I think my implementation of uniq is definitely simpler and more intuitive than some of the awk one-liners that Stack Overflow suggests for removing duplicates without sorting.
Is this use case still not interesting enough to be merged?
Yeah, I think it's valid, but rename it remove-duplicate-lines or something similar, so that it's easier for non-Unix people and there's no confusion with the Unix uniq tool, which works slightly differently.
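For reference, the order-preserving behavior under discussion (drop repeated lines without sorting) can be sketched in a few lines of Python. This is an illustration only, not the code from this PR; the function name dedupe_keep_order is hypothetical.

```python
def dedupe_keep_order(text):
    """Return text with repeated lines removed, keeping first occurrences in order."""
    seen = set()
    kept = []
    for line in text.splitlines():
        if line not in seen:
            seen.add(line)     # remember the line so later copies are skipped
            kept.append(line)  # keep the first occurrence at its original position
    return "\n".join(kept)


print(dedupe_keep_order("b\na\nb\na\na"))  # prints "b" then "a" on separate lines
```

Unlike sort -u, the output keeps the original line order, and unlike uniq, non-adjacent duplicates are removed as well.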
@thp updated the name as per your suggestion.
@thp made the changes :)
Changes made.
Please mark as ready for review + update changelog + squash to a single commit and then we can merge this.
@thp done.
@thp wdyt? I tried using shellpipe: uniq (it didn't work), but it makes sense to me to have this as a first-class filter, since sort exists. (Draft, as I didn't add docs; if you OK this, I'll add them and un-draft.)