colis-anr / morbig

A static parser for POSIX Shell
Other
190 stars 8 forks source link

Strip tilde prefixes from `WordTildePrefix` #164

Closed Niols closed 1 year ago

Niols commented 1 year ago

fix #163

In the example (taken from the tests) X=~kn:~/bin:~darkvador/secret as an assignment word, Morbig parses the tilde prefixes as WordTildePrefix "~kn", WordTildePrefix "~" and WordTildePrefix "~darkvador". Since ~ is already contained as an information in WordTildePrefix, then it should be stripped off the carried string. This PR implements that.

Niols commented 1 year ago

While investigating this, I actually found a few more bugs, namely:

  1. In the example above, X=~kn:~/bin:~darkvador/secret (as an assignment word), Morbig actually forgets both about the / and the : characters. To some extent, they can be guessed, but this is not very clean.

  2. In the example Y=~kn:/bonjour:~/bonsoir:and/this/one (as an assignment word), Morbig considers that there is a WordTildePrefix "" before /bonjour and a WordTildePrefix "and" before /this/one.

  3. In the example Z=/bonjour:~kn:~/bonsoir:and/this/one (as an assignment word), Morbig does nothing while it should process and find the tilde prefixes ~kn and ~ before /bonsoir.

https://github.com/colis-anr/morbig/pull/164/commits/88d6a2c480508b2d34837c12ad8374c9f41c1420 introduces the two additional tests for Y and Z and the fixes the expectation file that goes with them. Of course, at this point, tests don't pass. The coming commits should fix that. I also expect that the expectation file might change a bit, but only to split some WordLiteral into several ones. We shall see.

Niols commented 1 year ago

The last commit adds two more test cases for words with tildes that are not assignment words. This is to check that Morbig is not overzealous in those cases and that it indeed does not find tilde prefixes in the middle of a colon-separated word.

Niols commented 1 year ago

Alright, I think this is starting to look pretty good. However, I am quite puzzled by the way Morbig recognizes assignment words. In particular, Morbig recognizes assignment words on the right-hand side of a command name and I don't think this is right, but I have to read the standard some more before deciding what to think. That is not what dash does, in any case. I added some test cases but I don't think changing Morbig's behaviour is within the scope of this PR. I will open an issue to track this.

Niols commented 1 year ago

Had to merge main into this to resolve the conflicts with #167. I suggest squashing this PR.

Niols commented 1 year ago

@yurug Merged #169 into main and main into this PR, so now we have our fancy test suite running here. This PR is ready for review.

yurug commented 1 year ago

LGTM. I only add minor comments.

Niols commented 1 year ago

The CI problem comes from the Archlinux image not managing to fetch jq. I don't think it is blocking so I will leave it at that. @yurug WDYT of the new version?

yurug commented 1 year ago

LGTM!