Closed cejkato2 closed 1 year ago
Base: 80.00% // Head: 80.00% // No change to project coverage :thumbsup:
Coverage data is based on head (d5bc3e0) compared to base (64f1e6d). Patch has no changes to coverable lines.
Creating a Python string fails with a decoding error when the field value contains non-UTF-8 bytes. This patch replaces the direct use of PyUnicode_FromStringAndSize() with PyUnicode_DecodeUTF8(), which can be configured to replace invalid bytes.
As a result, invalid byte sequences are replaced by the U+FFFD replacement character (0xFFFD, decimal 65533).
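The effect can be illustrated at the Python level, where `bytes.decode()` with `errors="replace"` behaves analogously to calling PyUnicode_DecodeUTF8() with the "replace" error handler. This is only a sketch of the behaviour, not the code from this patch; the `decode_field` helper and the sample bytes are hypothetical.

```python
def decode_field(raw: bytes) -> str:
    """Hypothetical helper: decode a field value, replacing invalid bytes.

    Mirrors the "replace" error handler; it is not the patch's C code.
    """
    return raw.decode("utf-8", errors="replace")


# Sample field value with one byte (0xFF) that is never valid in UTF-8.
raw_value = b"abc\xffdef"

# Strict decoding (the behaviour before the patch) raises an error.
try:
    raw_value.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Decoding with replacement succeeds and substitutes U+FFFD.
decoded = decode_field(raw_value)
print(strict_failed, repr(decoded), ord(decoded[3]))
```

Valid UTF-8 input passes through `decode_field` unchanged, so only fields that contain invalid bytes end up with the replacement character.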
Note that the replacement process affects performance quite significantly. Valid UTF-8 strings, on the other hand, perform about the same as before this PR.