drwetter / testssl.sh

Testing TLS/SSL encryption anywhere on any port
https://testssl.sh
GNU General Public License v2.0

[BUG] Command with `--file` considered successful even when all tests fail, JSON error appended malforms data #1844

Open polarathene opened 3 years ago

polarathene commented 3 years ago

Command line / docker command to reproduce

testssl.sh command: testssl.sh --file /testssl.txt --mode parallel

Docker command:

  run docker run --rm \
    --user "$(id -u):$(id -g)" \
    --network cipher-list \
    --volume "${PWD}/testssl.txt:/testssl.txt":ro \
    --volume "${PWD}/results/intermediate/:/results" \
    -w "/results" \
    drwetter/testssl.sh:3.1dev --file /testssl.txt --mode parallel
  assert_success

testssl.txt:

--jsonfile-pretty port_587.json --starttls smtp mail_ssl_cipherlists:587
--jsonfile-pretty port_465.json mail_ssl_cipherlists:465

--jsonfile-pretty port_110.json --starttls pop3 mail_ssl_cipherlists:110
--jsonfile-pretty port_995.json mail_ssl_cipherlists:995

--jsonfile-pretty port_143.json --starttls imap mail_ssl_cipherlists:143
--jsonfile-pretty port_993.json mail_ssl_cipherlists:993

When using `--file`, errors are not caught the way they are for a single test. I expected that, at least with `--mode serial`, some indication of failure could be detected when using testing tools such as `bats`.

JSON output is also incorrectly appended when errors occur, making the data invalid and preventing a successful query as an alternative way to detect the failure.

More details

The command works fine if the JSON files don't already exist, but unlike a single test (i.e. without `--file`), the error doesn't seem to get caught by the testing tool `bats`'s `run` method (which runs the command that follows it in a subshell). The output snippet from the `bats` failure:

```sh
-- output differs --
expected (1 lines):
  "ECDHE-RSA-AES128-SHA ECDHE-RSA-AES256-SHA DHE-RSA-AES128-SHA DHE-RSA-AES256-SHA"
actual (3 lines):
  jq: error (at results/intermediate/port_587.json:632): Cannot iterate over null (null)
  parse error: Unmatched ']' at line 632, column 39
  "ECDHE-RSA-AES128-SHA ECDHE-RSA-AES256-SHA DHE-RSA-AES128-SHA DHE-RSA-AES256-SHA"
--
```

This indicates that `bats` considered the previous `run` with `testssl.sh` successful despite the error, presumably because `testssl.sh` handles multiple tests differently, possibly running each test in its own subshell and not bubbling the failure up. When testing without `bats` involved, each test printed the same error to the console as a single test run does. I assume the docker command given above was considered successful despite at least one (or all) of the tests returning an error/failure.

The next `bats` test following the `testssl.sh` command is a `jq` command that extracts the `finding` value of `cipherorder_TLSv1_1` (by matching `id`):

```sh
local CIPHERLIST_RSA_INTERMEDIATE_TLSv1_1='"ECDHE-RSA-AES128-SHA ECDHE-RSA-AES256-SHA DHE-RSA-AES128-SHA DHE-RSA-AES256-SHA"'

run compare_ciphersuites "cipherorder_TLSv1_1" ${RESULTS_FILE}
assert_success
assert_output ${CIPHERLIST_RSA_INTERMEDIATE_TLSv1_1}
# ...

function compare_ciphersuites() {
  local TARGET_CIPHERLIST=$1
  local RESULTS_FILE=$2

  echo $(docker run --rm --volume "${PWD}:/results" -w "/results" dwdraju/alpine-curl-jq \
    jq '.scanResult[].fs[] | select(.id=="'${TARGET_CIPHERLIST}'") | .finding' ${RESULTS_FILE})
}
```

The failure from that test is caused by `testssl.sh` appending invalid JSON to the JSON output (multiple runs):

```json
},{
            "id"       : "grade_cap_reason_4",
            "severity" : "INFO",
            "finding"  : "Grade capped to B. TLS 1.0 offered"
          }
        ]
      }
  ],
  "scanTime" : 208
}
{
  "id"       : "scanProblem",
  "severity" : "FATAL",
  "finding"  : "non-empty 'port_587.json' exists. Either use '--append' or (re)move it"
}          ],
  "scanTime" : "Scan interrupted"
}
{
  "id"       : "scanProblem",
  "severity" : "FATAL",
  "finding"  : "non-empty 'port_587.json' exists. Either use '--append' or (re)move it"
}          ],
  "scanTime" : "Scan interrupted"
}
```

Note how it appends a `scanProblem` object together with `],` and `}` closing tokens that lack matching opening tokens, creating invalid JSON. Otherwise I could work around the lack of error bubbling in `testssl.sh` by checking for `scanProblem` via `jq`.
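That workaround could look like the following sketch (the helper name is hypothetical; it also assumes the result file is valid JSON, which is exactly what this issue breaks):

```sh
# Hypothetical helper: exit 0 if a testssl.sh JSON result contains a
# scanProblem finding, non-zero otherwise. Only works on valid JSON.
has_scan_problem() {
  jq -e '[.scanResult[]? | select(.id == "scanProblem")] | length > 0' "$1" > /dev/null
}
```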

Expected behavior

When running multiple tests via --file, there should be a way for testssl.sh to bubble up a failure the way a single test (without --file) does, especially when run with --mode serial. This would be helpful when using testssl.sh with test suites.

An existing JSON file should either be left unmodified (toggle via flag?) or at least not have content appended that malforms the JSON data, since malformed data cannot be reliably parsed to detect the failure.

Your system (please complete the following information):

Additional context

When I run a single test, assertion correctly fails when the file exists:

drwetter/testssl.sh:3.1dev --jsonfile-pretty port_587.json --starttls smtp mail_ssl_cipherlists:587

     `assert_success' failed with status 253

   -- command failed --
   status : 253
   output (3 lines):

     Fatal error: non-empty "port_587.json" exists. Either use "--append" or (re)move it

   --

I am aware of the --overwrite option, and the actual test suite correctly removes the files; this was just an error I noticed while writing my tests that I figured I should report. As far as I know this applies to any error, e.g. an invalid hostname:

{
          "Invocation"  : "testssl.sh --warnings=batch --mode parallel --jsonfile-pretty port_587.json --starttls smtp xmail_ssl_cipherlists:587",
          "at"          : "771f6b131e6d:/home/testssl/bin/openssl.Linux.x86_64",
          "version"     : "3.1dev ",
          "openssl"     : "OpenSSL 1.0.2-chacha from Jan 18 17:12:17 2019",
          "startTime"   : "1613281416",
          "scanResult"  : [
                            {
                                "id"           : "scanProblem",
                                "severity"     : "FATAL",
                                "finding"      : "No IPv4/IPv6 address(es) for 'xmail_ssl_cipherlists' available"
                           }          ],
                    "scanTime"  : "Scan interrupted"
}

The docker option --user "$(id -u):$(id -g)" was required to match the ownership of the directory being written to (root, "0:0"); IIRC, testssl.sh was writing as 1000:1000 and failing to write files due to permission errors. That's more of a docker issue that depends on context rather than a testssl.sh one, though.

dcooper16 commented 3 years ago

Hi @polarathene,

I can't really comment on most of this issue. I'm not familiar with bats and it's not clear to me how to bubble up exit codes from the individual tests, given that the individual tests may produce multiple exit codes, but the overall test can only provide one. However, I'd like to comment on the particular scenario above with respect to using the JSON file to detect the error.

I am not a JSON expert, but I can't find any way to append the error message to an existing JSON file that doesn't result in invalid JSON. Whenever I have tried appending new JSON output to an existing JSON file, the JSON checker I have used has flagged the presence of additional data after the contents of the original file as an error.
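For what it's worth, that checker is right: two JSON documents concatenated in one file are not a single valid JSON text, since the grammar allows exactly one top-level value. Some tools can still consume such a file as a stream of values — `jq` does, and `jq -s` collects the stream into an array — but that is tool-specific behaviour, not valid JSON. A quick demonstration:

```sh
printf '{"a":1}\n{"b":2}\n' > /tmp/two.json

# A strict parser rejects the concatenation ("Extra data"):
python3 -c 'import json; json.load(open("/tmp/two.json"))' \
  || echo "not a single valid JSON document"

# jq treats it as a stream of two top-level values:
jq -s 'length' /tmp/two.json    # prints 2
```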

One solution would be what I tried in #1398, but that PR would simply prevent anything from being written to an already existing JSON file. That would prevent the JSON file from being corrupted, but then there would be no indication of the error in the JSON file.

The only alternative I can think of would be to stop scanning entirely. testssl.sh already does this in the case that the command line contains an error. For example, if testssl.txt contained:

--jsonfile-pretty port_587.json --starttls smtp mail_ssl_cipherlists:587
--jsonfile-pretty port_465.json mail_ssl_cipherlists:465 --notanoption

--jsonfile-pretty port_110.json --starttls pop3 mail_ssl_cipherlists:110
--jsonfile-pretty port_995.json mail_ssl_cipherlists:995

--jsonfile-pretty port_143.json --starttls imap mail_ssl_cipherlists:143
--jsonfile-pretty port_993.json mail_ssl_cipherlists:993

then testing would stop as soon as the second line was parsed. The child process for the second line would note that the command line contained an invalid option, print the --help information, and then signal the parent to stop all testing. The parent process would immediately quit with an exit code indicating that there was an error in one of the child processes.

@drwetter : At the moment, if parse_cmd_line() in a child process finds a problem and calls help(), then help() sends a USR1 signal, which causes the parent process to immediately exit. However, in many cases parse_cmd_line() calls fatal() rather than help(), and when that happens, the parent process continues. In addition, as noted above, if json_header(), csv_header(), html_header(), or prepare_logging() encounter a problem they call fatal(), which does not signal the parent process to stop. Do you think we should change this so that any problem caught in a child process by parse_cmd_line(), json_header(), csv_header(), html_header(), or prepare_logging() that results in fatal() being called also results in a signal being sent to the parent, for the parent to quit with a $ERR_CHILD exit code?
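The signalling pattern being discussed can be sketched in plain bash (an illustration only, not testssl.sh's actual code; the value 211 merely stands in for `$ERR_CHILD`):

```sh
#!/usr/bin/env bash
# Parent installs a USR1 handler; a child that hits a fatal problem signals
# the parent, which quits with a distinct "child error" exit code.
demo() (
  ERR_CHILD=211                     # placeholder, not testssl.sh's real constant
  trap 'exit "$ERR_CHILD"' USR1     # parent: quit when a child reports fatal
  parent_pid=$BASHPID               # PID of this (sub)shell

  # "child": on a fatal problem in e.g. json_header(), signal the parent
  ( sleep 0.2; kill -USR1 "$parent_pid" ) &

  sleep 10 & wait                   # 'wait' is interruptible by the trapped signal
)
demo
echo "parent exited with status $?"
```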

polarathene commented 3 years ago

I'm not familiar with bats and it's not clear to me how to bubble up exit codes from the individual tests, given that the individual tests may produce multiple exit codes, but the overall test can only provide one.

I guess in this case it depends on what the expected outcome is if some tests fail. If you're using this for testing, such as in CI, and a failure is an issue, you perhaps want any failure to be raised regardless of other tests passing?

I could understand if someone else is instead doing automated scheduled monitoring and would want to individually check the results of each and not have a failure short-circuit the tests or imply failure if 1-2 tests go wrong.

Perhaps the right approach for that is a command flag to alter the behaviour, e.g. --on-failure-stop / --stop-on-error / --e-quit-all or something along those lines.


I can't find any way to append the error message to an existing JSON file that doesn't result in invalid JSON. Whenever I have tried appending new JSON output to an existing JSON file, the JSON checker I have used has flagged as an error the presence of additional data after the contents of the original file.

Is the modified JSON actually invalid? Do you have a diff of the modification applied?

I contributed this little script to a project that uses jq to read and optionally modify a JSON file. It took a while to get familiar with jq vs easier modification with programming languages I'm used to however.

~In this case, perhaps it would make sense to have a property such as status (I haven't looked at the json output in a while, I think you already have something like this...), and updating that value from say "running" to "failed" via sed?~

It looks like you're already doing something like that, from a quick glance over the JSON snippet I provided; it's just that the content being appended is invalid due to unbalanced { } and [ ] scopes:

          ],
                    "scanTime"  : 208
}
                            {
                                "id"           : "scanProblem",
                                "severity"     : "FATAL",
                                "finding"      : "non-empty 'port_587.json' exists. Either use '--append' or (re)move it"
                           }          ],
                    "scanTime"  : "Scan interrupted"
}
                            {
                                "id"           : "scanProblem",
                                "severity"     : "FATAL",
                                "finding"      : "non-empty 'port_587.json' exists. Either use '--append' or (re)move it"
                           }          ],
                    "scanTime"  : "Scan interrupted"
}

It looks like you actually wanted to update the scanTime value and add the "finding" object alongside the others. So the finding would be added to the JSON file like other findings if it doesn't already exist, and otherwise its values would be updated. Same with scanTime?
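For illustration, that kind of in-place update could be done with `jq` (a sketch only: the helper name is invented, the field names are taken from the snippet above, and `jq` is an external dependency the project may not want):

```sh
# Hypothetical: add a scanProblem finding to the scanResult array and update
# scanTime, keeping the document valid. jq cannot edit in place, hence the
# temp file.
append_scan_problem() {
  local file=$1 msg=$2
  jq --arg msg "$msg" \
     '.scanResult += [{id: "scanProblem", severity: "FATAL", finding: $msg}]
      | .scanTime = "Scan interrupted"' \
     "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```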


One solution would be what I tried in #1398, but that PR would simply prevent anything from being written to an already existing JSON file. That would prevent the JSON file from being corrupted, but then there would be no indication of the error in the JSON file. The only alternative I can think of would be to stop scanning entirely. testssl.sh already does this in the case that the command line contains an error.

That last option would probably work in my case, but as mentioned earlier in this response, may be undesirable for some users.

You could have a file-status.json file which may be simpler to edit. It could include the additional "finding" info, or just be a single value per filename (json keys?).
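For illustration, such a status file might look like this (a purely hypothetical shape, nothing testssl.sh emits today):

```json
{
  "port_587.json": {
    "status": "failed",
    "finding": "non-empty 'port_587.json' exists. Either use '--append' or (re)move it"
  },
  "port_465.json": { "status": "ok" }
}
```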

I'd still prefer more consistency: the behaviour when using --file should match the behaviour without it when a failure occurs.


then signal the parent to stop all testing. The parent process would immediately quit with an exit code indicating that there was an error in one of the child processes.

While the JSON issue would ideally also be resolved, being able to get this behaviour would be great if it'd resolve the issue with detecting the error/failure. It seems that'll be resolved with https://github.com/drwetter/testssl.sh/pull/1871 ?

drwetter commented 3 years ago

I would not like to rewrite JSON output. I consider JSON, in the sense of processing through testssl.sh, more like a stream: once we have passed through the stream we don't want to move the pointer back. Otherwise I am afraid we'd be struggling to make the JSON valid again under several scenarios. It is already a headache, and opening another Pandora's box doesn't feel right to me.

Can't you just invoke everything differently @polarathene ?

I am still in the process of making up my mind whether a general termination is good or not. We need some arguments here... E.g. you launch a scan overnight, go to bed, and one hour later testssl.sh figures out there's a typo in one argument: I'm not sure testssl.sh should hard-fail then. That is undesirable, at least when the output goes to different JSON files. An additional flag, as @polarathene suggested, seemed at first glance a good idea. However, the question then is: what should be the default?

polarathene commented 3 years ago

I would not like to rewrite JSON output. I consider JSON in the sense of processing through testssl.sh more like a stream: once we passed through the stream we don't want to move back the pointer.

That's fine... but you shouldn't be appending an error like the one shown above in this issue, where it makes the JSON document invalid and can effectively "corrupt" it for parsers. Instead, if a user needs to be able to check the status of each run with --file, could they specify a separate file to parse, which wouldn't cause any breakage?

I don't need tests to fail early, but it would be nice to know what failed, or that there was a failure. It's automated testing, so I don't mind if it runs all the other tests and returns an error code at the end because one or more failed; if that comes with some output of what failed, awesome :)

Otherwise I am afraid we're struggling against making JSON valid again under several scenarios. It is already a headache and opening another box of pandora doesn't feel right to me.

That's OK, I understand: you've got a DIY approach for JSON rather than leveraging a third-party dependency as an opt-in way to get better JSON support via something like jq.

I get that jq shouldn't be required as it's intended to be as portable as possible.

Can't you just invoke everything differently @polarathene ?

I thought I was doing individual runs for each port separately, but it seems that I kept the --file functionality and documented that, if the test breaks (because the JSON file fails to parse), each port should be tested individually :man_shrugging:

I do reference this issue stating jq could be run after testssl.sh to check for failures before moving on to parsing the results.

However the question is then: what should be the default?

I would think failure consistency between a single call and a batch via --file would be desirable. I'm unsure that's the right default now though, since, as you mention, automated monitoring probably shouldn't break the current expectation that all tests get run.

Terminating early is mainly useful for automated tests like CI? (Since for PRs we need to wait on results and it takes a few minutes; for some setups CI minutes might be billable too.) I would vote for a flag to explicitly request fail-fast behaviour (--fail-fast / --fail-early is probably a better name than my earlier suggestions).

polarathene commented 6 months ago

Merged: https://github.com/drwetter/testssl.sh/pull/1871#issuecomment-1868530921

It's not clear if there was an option/flag to provide for that logic. If given some context on using it to test against this issue, I'll give it a try 👍

drwetter commented 6 months ago

option/flag: nope. Other than that: @dcooper16 can better answer than myself.

polarathene commented 6 months ago

Their earlier comment above ends with:

The only alternative I can think of would be to stop scanning entirely. testssl.sh already does this in the case that the command line contains an error.

_Do you think we should change this so that any problem caught in a child process by parse_cmd_line(), json_header(), csv_header(), html_header(), or prepare_logging() that results in fatal() being called also results in a signal being sent to the parent for the parent to quit with a $ERR_CHILD exit code?_

Which seems to be what the PR addressed?


I can understand none of us wanting to wade through the entire discussion above 😆

Since no one else has chimed in with this problem and I have had a workaround implemented since reporting, it's not worth your valuable time.

I'll leave the issue open in the meantime for visibility, in case anyone else experiences it.