-
1. `utf8code` should reject code points above U+10FFFF (RFC 3629).
2. `String#chr` should reject code points between `0xD800` and `0xDFFF` (RFC 3629).
3. `String#ord` should raise an error against ill-fo…
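
The first two checks can be sketched in Python as follows. This is a minimal illustration, not the project's code: the function names `is_valid_scalar` and `encode_utf8` are hypothetical, and the only rules assumed are the two RFC 3629 restrictions named above (nothing above U+10FFFF, nothing in the surrogate range `0xD800` to `0xDFFF`).

```python
def is_valid_scalar(code_point: int) -> bool:
    """Return True if code_point may be encoded as UTF-8 under RFC 3629."""
    if code_point < 0 or code_point > 0x10FFFF:
        return False  # outside the Unicode code space
    if 0xD800 <= code_point <= 0xDFFF:
        return False  # UTF-16 surrogate, not encodable in UTF-8
    return True


def encode_utf8(code_point: int) -> bytes:
    """Encode one code point by hand, rejecting invalid input (hypothetical helper)."""
    if not is_valid_scalar(code_point):
        raise ValueError(f"invalid code point: {code_point:#x}")
    if code_point < 0x80:
        return bytes([code_point])
    if code_point < 0x800:
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])
```

Validating before encoding keeps the range checks in one place, so a `chr`-style and an `ord`-style function could share them.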
-
[DigitalTrustCenter/sectxt](https://github.com/DigitalTrustCenter/sectxt) released 0.9.0, which has quite a few parser improvements, especially for PGP.
The only one I'm not sure about is the strippin…
-
### Proposal Details
I find myself in need of such a method to determine how many bytes are in a UTF-8 string when iterating over bytes. Following [RFC 3629](https://datatracker.ietf.org/doc/html/rfc36…
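
One way to read this request is a function that maps a leading byte to the length of the sequence it starts. The sketch below is in Python purely for illustration; the name `utf8_sequence_length` is hypothetical and is not the API being proposed. The byte ranges follow the well-formed sequence table in RFC 3629.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the length in bytes of the UTF-8 sequence that starts with
    `lead`, or 0 if `lead` cannot start a well-formed sequence."""
    if not 0 <= lead <= 0xFF:
        raise ValueError("lead must be a single byte value")
    if lead <= 0x7F:
        return 1  # US-ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    # 0x80-0xBF are continuation bytes, 0xC0 and 0xC1 would only start
    # overlong forms, and 0xF5-0xFF would exceed U+10FFFF.
    return 0
```

The leading byte alone does not prove the sequence is well formed; the continuation bytes still need checking (for example, `0xE0` must be followed by a byte in `0xA0` to `0xBF`).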
-
Per ISOBMFF (ISO/IEC 14496-12:2020) § 4.2.1, fields of type `utf8string` are
> UTF-8 string as defined in IETF RFC 3629, null-terminated.
Currently this requirement is not enforced in the parsing …
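
A minimal sketch of what enforcing this could look like, written in Python for illustration only. The function `read_utf8string` and its signature are hypothetical and do not reflect this parser's API. It relies on Python's strict UTF-8 decoder, which rejects overlong forms, surrogates, and values above U+10FFFF.

```python
from typing import Tuple


def read_utf8string(buf: bytes, offset: int = 0) -> Tuple[str, int]:
    """Read a null-terminated UTF-8 string starting at `offset`.

    Returns the decoded text and the offset just past the terminator.
    Raises ValueError if the terminator is missing or the bytes are not
    well-formed UTF-8 (UnicodeDecodeError is a subclass of ValueError).
    """
    end = buf.find(b"\x00", offset)
    if end == -1:
        raise ValueError("utf8string is not null-terminated")
    text = buf[offset:end].decode("utf-8", errors="strict")
    return text, end + 1
```

Whether a parser should fail hard or fall back to lossy decoding on invalid input is a policy decision that this sketch does not settle.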
-
# Deserializing Panic with UTF-8 BOM (Byte Order Mark) Content
I encountered an issue when attempting to deserialize a string encoded in UTF-8 with a Byte Order Mark (BOM). The deserializer throws th…
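
A common workaround for this class of problem is to strip the three-byte UTF-8 BOM (`EF BB BF`) before passing the data to the deserializer. The sketch below is in Python, with the standard `json` module standing in for the deserializer. Both helper names are hypothetical, and none of this is the affected library's API.

```python
import codecs
import json


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def loads_with_optional_bom(data: bytes):
    """Decode as UTF-8 after removing any BOM, then parse.

    json is only a stand-in for whichever deserializer is in use.
    """
    return json.loads(strip_utf8_bom(data).decode("utf-8"))
```

Python also offers the `utf-8-sig` codec, which skips a leading BOM during decoding and could replace the manual strip.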
-
The parser seems to keep starting over again and again.
Perhaps US-ASCII isn't detected or used correctly.
"US-ASCII is upwards-compatible with UTF-8 (an US-ASCII string is also a UTF-8 string, see [RFC 36…
-
## UTF-8
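**UTF-8 (8-bit Unicode Transformation Format)** is a variable-length character encoding for Unicode and also a prefix code. It can represent any character in the Unicode standard, and the first byte of its encoding remains compatible with ASCII, so software that originally handled ASCII characters can continue to be used with little or no modification.
UTF-8 uses one to six bytes to encode each character (nevertheless, 2…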
-
* I use a `read_file` block to print a file whose contents may contain non-ASCII characters encoded in UTF-8.
* I set `Max_characters` to 120.
* It looks like `print_file_contents()` doesn't attempt…
-
**Describe the bug**
Hi, I found behavior in HiveMQ that contradicts the protocol specification (a protocol violation or logic bug).
For tracking purposes, I will report all result…
-
`keyword_referencegroup` `type_defect` | by lbartholomew@amsl.com
___
A section-number citation for a reference entry that uses `<referencegroup>` yields this error:
rfc9000-referencegroup-issue.xml(…