Closed drvink closed 9 years ago
Thanks, I'm going to need to write a test suite soon.
I think expanding with quoting works, because braceExpand
looks for unquoted braces (Char '{'
and Char '}'
).
To get a Word
, it's easiest to use Text.Parsec.parse Language.Bash.Parse.Word.word
. Is this common enough to have a parseWord
function either in Language.Bash.Parse
or Language.Bash.Parse.Word
?
Should be fixed now.
> let testExpand s = map Language.Bash.Pretty.prettyText $ Language.Bash.Expand.braceExpand (case Text.Parsec.parse Language.Bash.Parse.Word.word "" s of Right w -> w)
> testExpand "{a,b}"
["a","b"]
> testExpand "{{a,b}"
["{a","{b"]
> testExpand "{a\\{,b\\,c}"
["a\\{","b\\,c"]
While that one's now fixed, it looks like there's more cases that don't work. I spent an embarrassingly long time trying to implement a parser for this myself--the grammar is so hideous that the easiest way to get a working clone might very well be to directly translate bash's imperative mess.
I barely speak Haskell, but here's some QuickCheck code to exercise the brace expander. (I originally wrote this in F# with FsCheck and then had to figure out how to translate it.)
import System.Process (readProcess)
import Text.Regex (subRegex)
import Text.Regex.TDFA (makeRegex, match)
import Test.QuickCheck
import Test.QuickCheck.Monadic (assert, monadicIO, run)
import Language.Bash.Expand (braceExpand)
import Language.Bash.Parse.Word (word)
import Language.Bash.Pretty (prettyText)
import Text.Parsec (parse)
testExpand :: String -> [String]
testExpand s =
map prettyText $ braceExpand (case parse word "" s of Right w -> w)
charset :: Gen String
charset = elements strs
where
strs = ["", ",", "\\{", "\\}", "\\,", "\\ "] ++
map (:[]) ['a'..'z']
junk :: Int -> Gen String
junk sz = do
xs <- resize sz $ listOf1 charset
return $ foldr (++) "" xs
maybeBraced :: Gen String
maybeBraced = do
s <- junk 10
frequency [(5, braced s),
(1, llbraced s),
(1, lrbraced s),
(1, rlbraced s),
(1, rrbraced s),
(1, str_only s)]
where
braced s = return $ "{" ++ s ++ "}"
llbraced s = return $ "{" ++ s
lrbraced s = return $ "}" ++ s
rlbraced s = return $ s ++ "{"
rrbraced s = return $ s ++ "}"
str_only = return
maybeStr :: Gen String
maybeStr = oneof [blank, maybeBraced]
where
blank = return ""
braceExpr :: Gen String
braceExpr = sized braceExpr'
where
braceExpr' 0 = maybeStr
braceExpr' n
| n > 0 = do
x <- braceExpr' $ n `quot` 2
y <- braceExpr' $ n `quot` 2
return $ x ++ y
| otherwise = error "wao"
expandWithBash :: String -> IO String
expandWithBash str = do
expn <- readProcess "/bin/bash" ["-c", "echo " ++ str] ""
return $ filter (\x -> x /= '\r' && x /= '\n') expn
prop_expandsLikeBash :: String -> Property
prop_expandsLikeBash str = monadicIO test
where
re = makeRegex "\\\\(.)"
test = do
bash <- run $ expandWithBash str
let expn = testExpand str
joined = unwords expn
m = match re joined
check = if m then subRegex re joined "\\1" else joined
run $ putStrLn bash
run $ putStrLn check
assert $ bash == check
main :: IO ()
main = quickCheck $ forAll braceExpr prop_expandsLikeBash
I'm still working, and I've gotten pretty close. I've switched from using Parsec to a Word -> [(a, Word)]
type parser, and the problem is getting the parses in the right order to the one we want comes first.
For reference, bash's algorithm is here, though reading it may result more in confusion than in enlightenment.
/* We ignore an open brace surrounded by whitespace, and also
an open brace followed immediately by a close brace preceded
by whitespace. */
This is exactly what I needed. Thanks!
Since the bash expansion algorithm knows about quoting and escaping, this would need to be made aware of those too. (It's really ugly...)