iver-wharf / wharf-api

Wharf backend written in Go
MIT License

Unescape TestResultDetail messages #133

Open Alexamakans opened 2 years ago

Alexamakans commented 2 years ago

Created as per discussion over at https://github.com/iver-wharf/wharf-web/pull/53#discussion_r794295841

The messages taken from .TRX files are currently escaped. This is slightly annoying as we have to unescape them when we want to display them in wharf-web.

My initial thought is that unescaping before storing in the DB seems okay; I don't know if we would ever want the escaped version. If we do end up needing it, it likely wouldn't be a big hassle to either re-escape the existing data on request or in a migration, or to let unescaped messages stay unescaped.
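As a rough sketch of the "unescape before storing" idea, assuming the message reaches the API as a plain string that still contains XML entities: `unescapeMessage` is a hypothetical helper, not an existing wharf-api function. It leans on the standard library's `html.UnescapeString`, which decodes the five predefined XML entities but also every HTML named entity, so it is broader than strictly needed.

```go
package main

import (
	"fmt"
	"html"
)

// unescapeMessage is a hypothetical helper (not part of wharf-api) that
// decodes entity references in a test result message before it is stored.
// html.UnescapeString covers the predefined XML entities (&lt; &gt; &amp;
// &apos; &quot;) as well as HTML-only ones such as &nbsp;.
func unescapeMessage(msg string) string {
	return html.UnescapeString(msg)
}

func main() {
	escaped := "1 &lt; &apos;0&apos;"
	fmt.Printf("%q\n", unescapeMessage(escaped))
}
```

Because the function decodes one level of escaping per call, running it on a message that is already unescaped leaves most text untouched, but a literal `&amp;lt;` would become `&lt;`, so it should only be applied once per message.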

Alternatives

applejag commented 2 years ago

The question is whether the escaped data is due to incorrect XML parsing or whether it's double-escaped in the test results. If the latter, is that part of the TRX file format, or just something that NUnit does?

Alexamakans commented 2 years ago

The files themselves have the messages escaped.

I believe it is because the XML standard always treats its special characters as special, regardless of the surrounding text, and TRX follows it, but I haven't fact-checked this very thoroughly.

applejag commented 2 years ago

XML thankfully has a much smaller entity set than HTML (https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Predefined_entities_in_XML), but the entities are meant to be parsed as if they were unescaped. For example, the following XML:

<MyValue>1 &lt; &apos;0&apos;</MyValue>

Should be parsed as the equivalent JSON:

{
  "MyValue": "1 < '0'"
}

Similarly, the following XML:

<MyValue text="1 &lt; &apos;0&apos;"/>

Should be parsed as the equivalent JSON:

{
  "MyValue": {
    "text": "1 < '0'"
  }
}

It does depend on the parser and how it's configured. I haven't looked into it deeply enough, but my guess is that encoding/xml has some Go struct tag that we either could use or should not be using to get this right.
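To illustrate how struct tags can change the outcome, here is a minimal sketch using only the standard encoding/xml package. The `myValue` type and `decode` helper are made up for this example and are not the actual wharf-api types; whether the real TRX structs use `,innerxml` is an assumption that would need checking. A field tagged `,innerxml` receives the raw nested XML verbatim, entities included, while `,chardata` and `attr` fields receive decoded text.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// myValue is a hypothetical type for illustration only.
type myValue struct {
	Raw     string `xml:",innerxml"` // raw inner XML, entities left as written
	Decoded string `xml:",chardata"` // character data, entities decoded
	Text    string `xml:"text,attr"` // attribute value, entities decoded
}

// decode unmarshals a single XML element into a myValue.
func decode(doc string) (myValue, error) {
	var v myValue
	err := xml.Unmarshal([]byte(doc), &v)
	return v, err
}

func main() {
	docs := []string{
		`<MyValue>1 &lt; &apos;0&apos;</MyValue>`,
		`<MyValue text="1 &lt; &apos;0&apos;"/>`,
	}
	for _, doc := range docs {
		v, err := decode(doc)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("innerxml=%q chardata=%q attr=%q\n", v.Raw, v.Decoded, v.Text)
	}
}
```

If the escaped messages turn out to come from an `,innerxml` field, switching that field to `,chardata` might be enough, with no separate unescape step.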