willghatch / racket-rash

The Reckless Racket Shell
http://rash-lang.org

Regexes in grep #67

Open Bost opened 4 years ago

Bost commented 4 years ago

Hi, I have a problem using regexes in grep. The following examples don't terminate when started from rash-repl:

grep -i "computer.*store"  purchases.csv
grep -i "computer.*?store" purchases.csv
grep -i "computer.+?store" purchases.csv

My purchases.csv contains:

"amount","where"
"100"   ,"supermarket"
"200"   ,"computer store X"
"300"   ,"petrol station"
"400"   ,"computer store Y"
"500"   ,"retail store"

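I tried to look at the source code, but I'm not sure:

  • Is there a bug in the current run-pipeline implementation (or somewhere else)?
  • Or are there some expansions that are not implemented yet?

Could you shed some light on this, please? Thanks.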

willghatch commented 4 years ago

I'm afraid it's another case of glob expansion. The * and ? characters turn on glob expansion, which matches nothing here and returns an empty list for the middle argument, so the command that actually runs is grep -i purchases.csv. grep then takes purchases.csv as the pattern and reads from stdin, which is why it appears to hang. You can stop it by hitting C-d.

If you quote the string, e.g. grep -i '"computer.*store" purchases.csv, you'll get the behavior you expect.

This is another example of where you probably don't want auto-globbing. But perhaps I should add an option to behave more like Bash and use the glob's literal string when a glob matches nothing. I somewhat hate that behavior -- it does weird things in inconsistent ways as well (such as when you actually DO have a file that matches the glob when you meant it as a regexp for grep).
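The chain of events described above can be sketched in Python. This is only a model under stated assumptions, not Rash's implementation: `expand_dropping_unmatched` and `describe_grep_call` are hypothetical names, only `*` and `?` are treated as glob triggers, and grep's argument handling is reduced to "first non-option argument is the pattern, the rest are files".

```python
import glob
import os
import tempfile

GLOB_CHARS = "*?"  # the two trigger characters named in the explanation above


def expand_dropping_unmatched(args, cwd):
    """Expand glob-looking arguments; an unmatched glob contributes nothing."""
    expanded = []
    for arg in args:
        if any(ch in arg for ch in GLOB_CHARS):
            pattern = os.path.join(glob.escape(cwd), arg)
            matches = sorted(os.path.relpath(m, cwd) for m in glob.glob(pattern))
            expanded.extend(matches)  # empty list: the argument silently vanishes
        else:
            expanded.append(arg)
    return expanded


def describe_grep_call(argv):
    """Simplified model of grep: first non-option is the pattern, rest are files."""
    positional = [a for a in argv[1:] if not a.startswith("-")]
    pattern, files = positional[0], positional[1:]
    return {"pattern": pattern, "input": files or "stdin"}


with tempfile.TemporaryDirectory() as cwd:
    # A directory that contains only the CSV file from the report.
    open(os.path.join(cwd, "purchases.csv"), "w").close()
    argv = expand_dropping_unmatched(
        ["grep", "-i", "computer.*store", "purchases.csv"], cwd
    )

print(argv)                      # ['grep', '-i', 'purchases.csv']
print(describe_grep_call(argv))  # {'pattern': 'purchases.csv', 'input': 'stdin'}
```

With the regex argument gone, the file name slides into the pattern position and nothing is left to read from except stdin.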


kisaragi-hiu commented 4 years ago

For reference, fish returns an error if a glob doesn't match anything, instead of silently discarding the argument (the case here) or falling back to the literal string (bash).

u0_a135@localhost ~> ls does*not*match
fish: No matches for wildcard 'does*not*match'. See `help expand`.
ls does*not*match
   ^
u0_a135@localhost ~>

This is neither confusing nor surprising, and is the best behavior in my opinion.
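To compare the three no-match behaviors mentioned in this thread side by side, here is a small Python sketch. It is an illustration only: the `expand` function, the `NoGlobMatch` exception, and the policy names "drop", "literal", and "error" are made up for this example and do not correspond to any real shell API.

```python
import glob
import os
import tempfile


class NoGlobMatch(Exception):
    """Raised by the hypothetical "error" policy."""


def expand(arg, cwd, on_no_match):
    """Expand one argument; on_no_match is "drop", "literal", or "error"."""
    if on_no_match not in ("drop", "literal", "error"):
        raise ValueError(f"unknown policy: {on_no_match!r}")
    if not any(ch in arg for ch in "*?"):
        return [arg]  # no glob characters, pass through unchanged
    pattern = os.path.join(glob.escape(cwd), arg)
    matches = sorted(os.path.relpath(m, cwd) for m in glob.glob(pattern))
    if matches:
        return matches
    if on_no_match == "drop":      # behavior reported in this issue
        return []
    if on_no_match == "literal":   # bash-style fallback
        return [arg]
    raise NoGlobMatch(f"no matches for wildcard {arg!r}")  # fish-style error


with tempfile.TemporaryDirectory() as cwd:
    open(os.path.join(cwd, "purchases.csv"), "w").close()
    regex = "computer.*store"

    dropped = expand(regex, cwd, "drop")
    literal = expand(regex, cwd, "literal")
    try:
        expand(regex, cwd, "error")
        errored = False
    except NoGlobMatch:
        errored = True

    # The literal fallback changes meaning as soon as a matching file exists.
    open(os.path.join(cwd, "computer.old.store"), "w").close()
    surprise = expand(regex, cwd, "literal")

print(dropped)   # []
print(literal)   # ['computer.*store']
print(errored)   # True
print(surprise)  # ['computer.old.store']
```

The last line shows the inconsistency of the literal fallback: the same argument is a regex one moment and a file name the next, depending on the directory contents.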

capfredf commented 4 years ago

I think auto-globbing might need to be turned off when * or ? appears in a regular expression.

willghatch commented 4 years ago

@kisaragi-hiu Erroring when a glob matches nothing is probably the best behavior. The docs promise some backwards compatibility at this point, so maybe that will have to be an option when creating a custom unix pipe. But I'll definitely put in an option for that, and make it the default behavior of globs in any Rash 2 and other Rash derivatives that I plan on making.

@capfredf The problem is that the shell doesn't know that the string is meant as a regexp, only grep does. Another program might process the same string differently, and the shell can't know (without code specific to each command) how to treat arguments to different programs. You can make custom code for any command by making an alias, but I don't think it would be a good idea to include a bunch of command-specific code in the core of Rash. Honestly I think requiring explicit globs is probably a better design, but for some reason I decided to make the default unix pipe's behavior closer to more traditional shells in this area.

capfredf commented 4 years ago

@willghatch Thanks for your explanation. Is it possible to automatically turn off glob expansion when * or ? is surrounded by double quotes, just like in bash or zsh?

In zsh:

➜  wonks.github.io git:(master) ls *.yml
_config.yml
➜  wonks.github.io git:(master) ls "*.yml"
ls: *.yml: No such file or directory

In rash:

> ls *.yml
/Users/capfredf/code/wonks.github.io/_config.yml
14:09 [master] /Users/capfredf/code/wonks.github.io/
> ls "*.yml"
/Users/capfredf/code/wonks.github.io/_config.yml
14:09 [master] /Users/capfredf/code/wonks.github.io/

i.e., treat "" as '. Maybe it is not a design choice, but I think it is more in line with users' expectations when transitioning from zsh or bash to rash.
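The proposed rule can be sketched in Python as a tokenizer that remembers whether each word was double-quoted and skips glob expansion for quoted words. This models the suggestion only, not how Rash reads its input: `tokenize` and `expand_line` are hypothetical helpers, only double quotes are handled, and matching is done with `fnmatch` against a supplied list of file names rather than the real file system.

```python
import fnmatch
import re

# Either a double-quoted run (group 1) or a bare word (group 2).
TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def tokenize(line):
    """Return (text, was_quoted) pairs for a command line."""
    tokens = []
    for m in TOKEN.finditer(line):
        was_quoted = m.lastindex == 1
        tokens.append((m.group(m.lastindex), was_quoted))
    return tokens


def expand_line(line, file_names):
    """Glob-expand only unquoted words that contain * or ?."""
    argv = []
    for text, was_quoted in tokenize(line):
        if not was_quoted and any(ch in text for ch in "*?"):
            # Unmatched globs yield an empty list here; a real shell would
            # pick one of the no-match policies discussed earlier.
            argv.extend(fnmatch.filter(file_names, text))
        else:
            argv.append(text)
    return argv


files = ["_config.yml", "index.html", "purchases.csv"]
print(expand_line("ls *.yml", files))    # ['ls', '_config.yml']
print(expand_line('ls "*.yml"', files))  # ['ls', '*.yml']
print(expand_line('grep -i "computer.*store" purchases.csv', files))
# ['grep', '-i', 'computer.*store', 'purchases.csv']
```

Under this rule the original grep command would reach grep with its regex intact, because the quoted word is never handed to the glob matcher.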

capfredf commented 4 years ago

Just out of curiosity, is the Rash 2 repo publicly accessible?

Bost commented 3 years ago

PING... Any news regarding Rash 2?

willghatch commented 3 years ago

I'm sorry, I thought I had replied to this.

There isn't actually a repository for Rash 2 right now. Rash 2 is really just a list of things that I'm not satisfied with in current Rash and that can't be fixed without breaking backwards compatibility.

Before I ever start a real Rash 2, there are a lot of improvements that I have planned that are backwards compatible (e.g. improving stuff that is currently undocumented or explicitly documented as being unstable). When I do write a Rash 2, it will probably be a fork where I just fix all of those things. In particular, I will put much more thought into the default set of operators, naming conventions, etc., that I chose somewhat hurriedly to get something out the door. Rash 2 will be an opportunity to revisit those decisions with more experience and less haste. Rash 2 will (when it exists) be a separate Racket package, so all old scripts will still work.

Unfortunately, I'm just busy with other things and don't have the time and energy to really push Rash forward much right now. Grad school is busy and stressful. I wish I could work more on Rash right now (frankly, I care much more about Rash than my other projects), but I really need to push on other things to graduate.


Bost commented 3 years ago

No rush :) just a little bit of pressure from over here :) Anyway, thanks for the prompt answer.