bats-core / bats-assert

Common assertions for Bats
Creative Commons Zero v1.0 Universal
94 stars 39 forks source link

Possibly wrong handling of regular expressions #53

Closed debarshiray closed 1 year ago

debarshiray commented 1 year ago

I don't know if this is a bug in bats-assert, or a mistake on my end, but it's odd enough that I decided to file it. Please let me know if I am doing things wrong.

We use regular expressions with Bats to assert SHA256 hashes in Toolbx, and I noticed that some seemingly valid extended regular expressions are being flagged as invalid. eg., this:

load 'libs/bats-support/load'
load 'libs/bats-assert/load'

@test 'assert with --regexp \"^--> [a-z0-9]*$\"; ok!' {
  run echo "--> 1e6001"
  assert_line --index 0 --regexp "^--> [a-z0-9]*$"
}

@test 'assert with --regexp \"^--> [a-z0-9]{6}$\"; ok!' {
  run echo "--> 1e6001"
  assert_line --index 0 --regexp "^--> [a-z0-9]{6}$"
}

@test 'assert with --regexp \"^[a-z0-9]*$\"; fail!' {
  run echo "1e6001aec9c911d05bbf47d76de8d9e5bbe2e7bc17dbf772eddad775e432d7c2"
  assert_line --index 0 --regexp "^[a-z0-9]*$"  # invalid extended regular expression
}

@test 'assert with --regexp \"^[a-z0-9]{64}$\"; ok!' {
  run echo "1e6001aec9c911d05bbf47d76de8d9e5bbe2e7bc17dbf772eddad775e432d7c2"
  assert_line --index 0 --regexp "^[a-z0-9]{64}$"
}

... will lead to:

$ bats test.bats
test.bats
 ✓ assert with --regexp "^--> [a-z0-9]*$"; ok!
 ✓ assert with --regexp "^--> [a-z0-9]{6}$"; ok!
 ✗ assert with --regexp "^[a-z0-9]*$"; fail!
   (from function `assert_line' in file libs/bats-assert/src/assert_line.bash, line 170,
    in test file test.bats, line 16)
     `assert_line --index 0 --regexp "^[a-z0-9]*$"  # invalid extended regular expression' failed

   -- ERROR: assert_line --
   Invalid extended regular expression: `^[a-z0-9]*$'
   --

 ✓ assert with --regexp "^[a-z0-9]{64}$"; ok!

4 tests, 1 failure

This is the relevant code in bats-assert/src/assert_line.bash:

  # Arguments.                                                                                                                       
  local -r expected="$1"

  if (( is_mode_regexp == 1 )) && [[ '' =~ $expected ]] || (( $? == 2 )); then
    echo "Invalid extended regular expression: \`$expected'" \
    | batslib_decorate 'ERROR: assert_line' \
    | fail
    return $?
  fi

Initially, I thought that this is due to some escaping problem. So, I tried them directly in an interactive Bash session, and found something odd. The specific construct used in bats-assert to catch invalid extended regular expressions still triggers, but the expression otherwise works against a string:

$ expected="^[a-f0-9]*$"
$ if [[ '' =~ $expected ]] || (( $? == 2 )); then echo "invalid"; fi
invalid
$ if [[ '1e6001aec9c911d05bbf47d76de8d9e5bbe2e7bc17dbf772eddad775e432d7c2' =~ $expected ]]; then echo "match"; fi
match

So, I tried to tease it apart and found:

$ [[ '' =~ $expected ]]
$ echo $?
0

At this point, I have no idea why (and from where) the construct used in bats-assert gets 2 as a return status, and why another similar regular expression doesn't get flagged.

martin-schulze-vireso commented 1 year ago

Looks like a bug: your regex matches the empty string, so the exit code check in the or is not even called.

That means, all regexs that Match empty string are wrongly considered invalid.

debarshiray commented 1 year ago

Thanks for fixing this, @martin-schulze-vireso !

You were right. I overlooked that my regular expression was actually matching an empty string. I should have probably used + instead of * in the regexp.

Anyway, thanks again.