Closed: thp-canonical closed this 2 months ago
Won't this cause problems when we later read that file to put it into the JSON payload we send to the server? IIRC JSON (at least in Python) doesn't like having binary data in the fields. At the very least, we'll need to be careful about how we read the log and convert it when we read it in agent/testflinger_agent/job.py.
Even if you want to read the file later on, reading the file as a whole and treating it as UTF-8 (and then ignoring or escaping/replacing special characters) leaves fewer opportunities for something to go wrong. This has now been implemented in https://github.com/canonical/testflinger/pull/341/commits/efd8a18e3a8c43977bbae633ae44066b0ab06820 (edited; Black) as part of this PR.
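For reference, a minimal sketch of that read-back step. It assumes the "replace" error handler; the function name `read_log_for_json` and the `serial_log` key are hypothetical and not taken from the commit linked above.

```python
import json
import os
import tempfile


def read_log_for_json(path):
    """Read the whole log as bytes, then decode once as UTF-8.

    Invalid byte sequences become U+FFFD instead of raising or vanishing.
    (Hypothetical helper; the real code in job.py may differ.)
    """
    with open(path, "rb") as log_file:
        raw = log_file.read()
    return raw.decode("utf-8", errors="replace")


# Write a small binary log containing valid UTF-8 ("ü") and a stray 0xFF byte.
fd, log_path = tempfile.mkstemp()
os.close(fd)
try:
    with open(log_path, "wb") as log_file:
        log_file.write(b"boot \xc3\xbc ok \xff\n")
    text = read_log_for_json(log_path)
finally:
    os.remove(log_path)

# The decoded text is an ordinary str, so it serialises to JSON without errors.
payload = json.dumps({"serial_log": text})
```

Decoding the file in one go also means a multi-byte character can no longer be cut in half by a read boundary at this stage.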
@thp-canonical because of branch protections, you'll need to sign the commits before this can be merged.
Whoops, sorry. Done now (and squashed the changes into a single commit as part of it).
Description
By opening the serial log file in binary ("b") mode, we can write the binary data received from the TCP socket directly to the file without having to (try to) decode the content as UTF-8. This makes serial logging 8-bit clean.
This is especially important if a character in an otherwise valid UTF-8 string happens to cross the boundary of a 4096-byte read, resulting in lost data instead of a single, valid UTF-8 character in the file:
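A small, self-contained sketch of this case. The helper names are made up for illustration and the socket is simulated by slicing a byte string; only the 4096-byte read size and the "ignore" error handler come from the description.

```python
import io

CHUNK_SIZE = 4096  # read size mentioned in the description


def split_into_reads(data, size=CHUNK_SIZE):
    """Simulate successive socket reads of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def log_as_text(chunks):
    """Sketch of the old behaviour: decode each chunk on its own, ignoring errors."""
    return "".join(chunk.decode("utf-8", errors="ignore") for chunk in chunks)


def log_as_bytes(chunks):
    """Sketch of the new behaviour: append the raw bytes unchanged."""
    log = io.BytesIO()
    for chunk in chunks:
        log.write(chunk)
    return log.getvalue()


# "ü" is the two bytes 0xC3 0xBC in UTF-8; pad so they straddle the boundary.
payload = b"a" * (CHUNK_SIZE - 1) + "ü".encode("utf-8") + b"b"
chunks = split_into_reads(payload)

print("ü" in log_as_text(chunks))                   # False: both halves were dropped
print("ü" in log_as_bytes(chunks).decode("utf-8"))  # True: the character survives
```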
The same is true for binary data (basically bytes with values 128-255) that can be passed through as-is without being interpreted as UTF-8, e.g.:
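As an illustrative sketch (the byte values and helper names are arbitrary choices, not from the PR), compare a decode/encode round trip with a plain binary write:

```python
import os
import tempfile

# A few raw bytes, some of which can never start a valid UTF-8 sequence.
raw = bytes([0x00, 0x7F, 0x80, 0xFE, 0xFF]) + b"OK\n"


def roundtrip_text(data):
    """Decode as UTF-8 with "ignore", then encode again (lossy)."""
    return data.decode("utf-8", errors="ignore").encode("utf-8")


def roundtrip_binary(data):
    """Write and read the data through a file opened in binary mode."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as log_file:
            log_file.write(data)
        with open(path, "rb") as log_file:
            return log_file.read()
    finally:
        os.remove(path)


print(roundtrip_text(raw))    # 0x80, 0xFE and 0xFF are gone
print(roundtrip_binary(raw))  # identical to the input
```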
And - as a side-effect of the previous one - log output in a non-UTF-8 encoding will have its non-ASCII characters stripped:
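A short sketch of this effect, assuming a device that emits Latin-1 (the sample line is invented for illustration):

```python
# "°" is the single byte 0xB0 in Latin-1, which is not valid UTF-8 on its own.
line = "Temperatur: 25°C\n".encode("latin-1")

# Decoding as UTF-8 with "ignore" silently drops the degree sign.
lossy = line.decode("utf-8", errors="ignore")

# With the raw bytes kept in the log, the right codec can still be applied later.
recovered = line.decode("latin-1")

print(repr(lossy))
print(repr(recovered))
```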
In all 3 cases, since we use "ignore", bytes that are not a valid UTF-8 sequence will just be silently thrown away.
Resolved issues
This resolves issue with:
Documentation
No changes.
Web service API changes
No changes.
Tests
I have not tested this.