jstemmer / go-junit-report

Convert Go test output to JUnit XML
MIT License

Need proper handling for invalid character for CDATA field #138

Open firodj opened 2 years ago

firodj commented 2 years ago

When test output contains colored log output, it may include the escape character (0x1b). The encoding/xml package does not handle this character properly in a CDATA section, so the generated XML document is invalid according to the spec, and the CI (GitLab CI in my case) may be unable to read the test report.

Related issue: https://github.com/golang/go/issues/53728

jstemmer commented 2 years ago

Thanks. It sounds like we should at least strip the invalid characters from the CDATA section. However, since ANSI escape sequences also contain regular characters we'd probably want to strip the entire ANSI sequence rather than just the (invalid) control characters.
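
As a rough illustration of stripping whole sequences rather than only the ESC byte, here is a minimal sketch. It is not code from this project: the names ansiCSI and stripANSI are hypothetical, and the pattern only covers CSI sequences (ESC followed by "[", which is what color codes use), not every escape form a terminal accepts.

```go
package main

import (
	"fmt"
	"regexp"
)

// ansiCSI matches CSI sequences such as "\x1b[31m" or "\x1b[0;1m":
// ESC, '[', any parameter bytes (0x30-0x3F), any intermediate bytes
// (0x20-0x2F), and one final byte (0x40-0x7E).
// Assumption: only CSI sequences need to be handled.
var ansiCSI = regexp.MustCompile(`\x1b\[[0-?]*[ -/]*[@-~]`)

// stripANSI (hypothetical helper) removes each matched sequence in full,
// so no stray "[31m" text is left behind.
func stripANSI(s string) string {
	return ansiCSI.ReplaceAllString(s, "")
}

func main() {
	colored := "\x1b[31mFAIL\x1b[0m: precondition not met"
	fmt.Printf("%q\n", stripANSI(colored)) // prints "FAIL: precondition not met"
}
```

Any control characters that are not part of a matched sequence would still need separate handling.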

emtammaru commented 2 years ago

I just ran into the same issue with Jenkins junit parser:

org.dom4j.DocumentException: Error on line 5 of document  : An invalid XML character (Unicode: 0x1b) was found in the CDATA section.

janisz commented 2 years ago

Same here. I tried to update to v2 (https://github.com/stackrox/stackrox/pull/2586), and Prow does not parse the XML because it is not valid.

error on line 2291 at column 25: Input is not proper UTF-8, indicate encoding !
Bytes: 0x00 0x63 0x32 0x68

jstemmer commented 2 years ago

@janisz Where do these invalid bytes come from? This doesn't look like an ANSI escape character. Any concerns if they are just dropped completely from the output?

janisz commented 2 years ago

These bytes come from a key that is logged. The key is generated as prefix + separator + id, where the separator is []byte("\x00"), which is a control character. I found a workaround: print the key with %q instead of %s. From my perspective it's safe to ignore or escape them. https://github.com/stackrox/stackrox/blob/1c51bfebc23b2748bc103a583b5460611dfcad53/pkg/dbhelper/db.go#L10
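
To show the difference between the two verbs, here is a small sketch. The key value and the helper name quoteKey are made up for illustration and are not taken from the linked code.

```go
package main

import "fmt"

// quoteKey (hypothetical helper) formats a key with %q, which produces a
// double-quoted string with non-printable bytes escaped in Go syntax.
func quoteKey(key []byte) string {
	return fmt.Sprintf("%q", key)
}

func main() {
	// Assumed key shape: prefix + "\x00" separator + id.
	key := []byte("prefix\x00id")

	fmt.Printf("%s\n", key)    // writes the raw NUL byte into the log
	fmt.Println(quoteKey(key)) // prints "prefix\x00id" with the NUL escaped as text
}
```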

Minimal example:

package main

import "testing"

func TestEscape(_ *testing.T) {
    println("Test '\x00'")
}

$ go test ./... -v 2>&1 | go-junit-report -set-exit-code | file -bi -
+ Actual  : application/octet-stream; charset=binary
- Expected: text/xml; charset=us-ascii

janisz commented 2 years ago

Created a PR to drop control characters from output https://github.com/jstemmer/go-junit-report/pull/140
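
For readers unfamiliar with the idea, dropping such characters could look roughly like the sketch below. It is not the code from the PR: dropInvalidXML is a hypothetical name, and the ranges are the Char production from the XML 1.0 specification.

```go
package main

import (
	"fmt"
	"strings"
)

// dropInvalidXML (hypothetical helper) removes every rune outside the
// XML 1.0 Char ranges:
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
func dropInvalidXML(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r == '\t', r == '\n', r == '\r',
			r >= 0x20 && r <= 0xD7FF,
			r >= 0xE000 && r <= 0xFFFD,
			r >= 0x10000 && r <= 0x10FFFF:
			return r
		}
		return -1 // a negative value tells strings.Map to drop the rune
	}, s)
}

func main() {
	fmt.Printf("%q\n", dropInvalidXML("Test '\x00'"))        // prints "Test ''"
	fmt.Printf("%q\n", dropInvalidXML("\x1b[31mred\x1b[0m")) // prints "[31mred[0m"
}
```

The second call shows the limitation of this approach: only the ESC byte is removed, and the printable remainder of the color code stays in the output.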

jstemmer commented 2 years ago

Leaving this issue open for now, because it would be nice if we could also detect ANSI escape sequences and remove them from the output.

sruehl commented 1 year ago

@jstemmer is #162 sufficient?

TafThorne commented 1 year ago

I have run into a similar problem when trying out the newer v2 tool with my GitLab CI pipelines. Before this update the tool was working very well, but attempting to run it today caused GitLab to report a parsing failure, with the error text listed as:

JUnit XML parsing failed: 4106:21: FATAL: CData section not finished 2023/07/20 10:27:5

Looking in the junit-report.xml file that was generated and that is being complained about, I find the following lines:

    <testsuite name="gitlab.company.com/some/paths/gomodule" tests="72" failures="0" errors="0" id="82" hostname="runner-bvfrfxhz-project-185-concurrent-0rd9jq" time="0.550" timestamp="2023-07-20T10:29:44Z">
        <testcase name="Test_Buffer_Is_Full" classname="gitlab.company.com/some/paths/gomodule" time="0.000"></testcase>
        <testcase name="Test_Buffer_Processing" classname="gitlab.company.com/some/paths/gomodule" time="0.000"></testcase>
        <testcase name="Test_VariablesCanBeRead" classname="gitlab.company.com/some/paths/gomodule" time="0.020">
            <system-out><![CDATA[Calibration [ERROR] precondition not met
Calibration [ERROR] precondition not met
Calibration [ERROR] precondition not met
Calibration [ERROR] precondition not met
properties [WARN] Warning no property file located]]></system-out>
        </testcase>
        <testcase name="Test_ChangeDbPath" classname="gitlab.company.com/some/paths/gomodule" time="0.020"></testcase>
        <testcase name="Test_InvalidPathCannotBeApplied" classname="gitlab.company.com/some/paths/gomodule" time="0.000">
            <system-out><![CDATA[
2023/07/20 10:27:56 /builds/gitlab.company.com/some/paths/gomodule/persistence.go:53
[error] failed to initialize database, got error unable to open database file: is a directory]]></system-out>
        </testcase>

(Company-identifying path names have been altered.) It is the second CDATA section that is being complained about. As others have said, this is likely due to the presence of an escape or Unicode character.

A cheap workaround would be to drop the "system-out" data via a parameter or configuration setting. Stripping or replacing invalid characters would obviously be nicer if possible.

Let me know if I can be of any help in diagnosing or fixing the problem.

janisz commented 1 year ago

@TafThorne what version are you using? I thought special characters will be replaced with #140

sruehl commented 1 year ago

I would assume #140 was not sufficient. Had the same problem and needed to use the fork / the PR #162 to fix it.

TafThorne commented 1 year ago

> @TafThorne what version are you using? I thought special characters will be replaced with #140

I ran with the latest v2 tagged code I think:

go install github.com/jstemmer/go-junit-report/v2@latest && go install github.com/t-yuki/gocover-cobertura@latest
go: downloading github.com/jstemmer/go-junit-report/v2 v2.0.0
go: downloading github.com/jstemmer/go-junit-report v1.0.0
go: downloading github.com/t-yuki/gocover-cobertura v0.0.0-20180217150009-aaee18c8195c

So that would be v2.0.0 (https://github.com/jstemmer/go-junit-report/commit/7fde4641acef5b92f397a8baf8309d1a45d608cc), which is from Jul 1, 2022. I can try again with the latest code on master and see if that fixes it... one second...

TafThorne commented 1 year ago

It took more than a second, but I can confirm that with the latest changes on main things worked correctly for me.

go install github.com/jstemmer/go-junit-report/v2@7933520f1e28509e5ba6e95a6c3157d53c223561 && go install github.com/t-yuki/gocover-cobertura@latest
go: downloading github.com/jstemmer/go-junit-report/v2 v2.0.1-0.20230321231245-7933520f1e28
go: downloading github.com/t-yuki/gocover-cobertura v0.0.0-20180217150009-aaee18c8195c

The above are the details of the changeset that worked for me.

Is there a plan to release some kind of 2.0.1 set of bug fixes or similar in the near future or should I wait for the 2.1.0 release I can see pencilled in?