elves / elvish

Powerful scripting language & versatile interactive shell
https://elv.sh/
BSD 2-Clause "Simplified" License

Make eawk accept a custom separator #1728

Closed by saolof 9 months ago

saolof commented 11 months ago

For me, one of the biggest pain points of using Elvish as a shell in practice has been parsing the text output of shell commands.

The output of many CLIs can be parsed with AWK. Elvish provides eawk to cover some of AWK's key tasks without introducing a separate DSL, relying solely on the expressiveness of lambdas.

In practice, most of AWK's power comes from being able to change the separator to match the particular output of the command you pipe into it (commas, pipes, runs of two or more spaces, etc.). So this commit adds an optional sep argument that takes a regex pattern.

(I also updated the Dockerfile.)
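
For readers following along, the separator idea described above can be sketched in Go, the language Elvish is written in. This is only an illustration, not the code from this PR: the helper name `splitFields`, the sample lines, and the choice to trim leading and trailing blanks before splitting are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// splitFields is a hypothetical helper (not part of Elvish). It trims
// leading and trailing spaces and tabs from line, then splits the rest
// on every match of the regular expression sep.
func splitFields(line, sep string) ([]string, error) {
	re, err := regexp.Compile(sep)
	if err != nil {
		return nil, err
	}
	return re.Split(strings.Trim(line, " \t"), -1), nil
}

func main() {
	// Made-up sample lines in the styles mentioned above:
	// comma-separated, pipe-separated, and columns padded with
	// runs of two or more spaces.
	samples := []struct{ line, sep string }{
		{"alice,42,admin", ","},
		{" 1 | bob | staff ", `\s*\|\s*`},
		{"nginx       latest    605c77e624dd   2 weeks ago   141MB", ` {2,}`},
	}
	for _, s := range samples {
		fields, err := splitFields(s.line, s.sep)
		if err != nil {
			fmt.Println("bad separator:", err)
			continue
		}
		fmt.Printf("%q\n", fields)
	}
}
```

Note how the ` {2,}` separator keeps "2 weeks ago" together as one field, which a plain whitespace split would break into three.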

krader1961 commented 11 months ago

It's great seeing someone new take the time to create a pull request. But you also need to update the unit tests to verify the new functionality. See the TestEawk function in builtin_fn_str_test.go.

saolof commented 11 months ago

Yes, I am putting together a couple of examples. I think sample outputs from psql and from docker image ls should work?

saolof commented 11 months ago

@krader1961 Added the docker image ls test.

xiaq commented 9 months ago

Thanks for the contribution.

I find the names of the options (&sep and &posix) potentially confusing though. It's not clear that &sep should be a regular expression (instead of a literal string to pass to str:split) and that &posix refers to the syntax of the regular expression.
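
To make the two ambiguities concrete, here is a small Go sketch. It assumes, without confirmation from this thread, that a POSIX option would map onto Go's `regexp.CompilePOSIX`, which restricts the pattern to POSIX ERE syntax and switches matching to leftmost-longest. The helper name `splitWith` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// splitWith is a hypothetical helper (not Elvish code). It splits line
// on the regular expression sep, compiled either with Go's default
// Perl-like syntax or, when posix is true, with regexp.CompilePOSIX.
func splitWith(line, sep string, posix bool) ([]string, error) {
	compile := regexp.Compile
	if posix {
		compile = regexp.CompilePOSIX
	}
	re, err := compile(sep)
	if err != nil {
		return nil, err
	}
	return re.Split(line, -1), nil
}

func main() {
	// Regex separator vs. literal separator: "." as a regex matches
	// every character, while strings.Split treats it literally.
	asRegex, _ := splitWith("a.b", ".", false)
	fmt.Printf("%q\n", asRegex)                   // ["" "" "" ""]
	fmt.Printf("%q\n", strings.Split("a.b", ".")) // ["a" "b"]

	// Default vs. POSIX matching: with the alternation a|ab, the
	// default picks the first alternative, POSIX picks the longest.
	def, _ := splitWith("xaby", "a|ab", false)
	psx, _ := splitWith("xaby", "a|ab", true)
	fmt.Printf("%q\n", def) // ["x" "by"]
	fmt.Printf("%q\n", psx) // ["x" "y"]
}
```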

I think it's best to move the command into the re: module altogether and name it re:awk instead - the option names become unambiguous this way. The eawk name can remain for now as a deprecated alias and be removed later.

I'll merge this into a temporary branch, which I'll integrate into master later.

Maybe a re:sed is also in order...