Open UWN opened 2 years ago
Another example that currently occurs in library(crypto) and library(charsio) is:
byte_char
I use it to denote a character whose code is in 0..255. It is like char, except that it raises a domain error if the code of the character is greater than 255. This is useful when using strings to compactly represent octet sequences in memory. The internal predicate '$first_non_octet'/2 can be used to efficiently locate the first "non-octet" in strings. Maybe this could be a potential candidate for inclusion in library(error)? For example, as:
must_be(single_octet_chars, Cs)
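A check of this kind could be sketched in plain Prolog as follows. This is only an illustration: must_be_byte_char/1, must_be_single_octet_chars/1 and is_char/1 are hypothetical names, not existing predicates of library(error), and using byte_char as the domain name in the error term is an assumption.

```prolog
:- use_module(library(lists)).

% Hypothetical helpers for illustration; none of them is part of library(error).

% is_char(+C): C is an atom of length 1.
is_char(C) :-
    atom(C),
    atom_length(C, 1).

% must_be_byte_char(+C): succeed if C is a character with code in 0..255,
% otherwise throw an ISO-style error term.
must_be_byte_char(C) :-
    (   var(C) ->
        throw(error(instantiation_error, must_be_byte_char/1))
    ;   \+ is_char(C) ->
        throw(error(type_error(character, C), must_be_byte_char/1))
    ;   char_code(C, Code),
        Code > 255 ->
        throw(error(domain_error(byte_char, C), must_be_byte_char/1))
    ;   true
    ).

% must_be_single_octet_chars(+Cs): apply the check to every element.
% Assumes Cs is already a proper list; partial lists are not handled here.
must_be_single_octet_chars(Cs) :-
    maplist(must_be_byte_char, Cs).
```

With these definitions, must_be_byte_char(a) succeeds, while a character with code 256 raises a domain error. A real implementation could presumably use '$first_non_octet'/2 instead of traversing the list element by element.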
How are lists of single-octet characters represented in memory? If chars are stored as UTF-8, then any char with a code in 128..255 would be represented with two bytes. Is there a special-cased octet-list representation (u8 vec) akin to the char-list representation (a UTF-8 string, I assume)?
@infogulch: The internal representation is UTF-8, so indeed the characters with codes in 128-255 are represented by 2 bytes each!
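The size of such a string can be estimated from the character codes alone, using the standard UTF-8 length rules. The following is a small sketch; utf8_length/2 and chars_utf8_size/2 are hypothetical helper names, and the code only counts payload bytes, it does not inspect the actual in-memory representation.

```prolog
% utf8_length(+Code, -N): number of bytes UTF-8 needs for a code point.
utf8_length(Code, N) :-
    (   Code < 0x80    -> N = 1
    ;   Code < 0x800   -> N = 2
    ;   Code < 0x10000 -> N = 3
    ;   N = 4
    ).

% chars_utf8_size(+Cs, -N): total UTF-8 byte count for a list of characters.
chars_utf8_size([], 0).
chars_utf8_size([C|Cs], N) :-
    char_code(C, Code),
    utf8_length(Code, N0),
    chars_utf8_size(Cs, N1),
    N is N0 + N1.
```

For example, the query char_code(C, 255), chars_utf8_size([a, C], N) yields N = 3: one byte for a, two bytes for the character with code 255.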
Being 'slightly inefficient' (1.5 bytes per 'octet char' on average?) isn't much of an issue for general byte manipulation, especially compared to other representations (24+ bytes per element, oof). But for cryptography in particular, I'm concerned that using a nonlinear representation could expose the plaintext and intermediates to side channel attacks, maybe leaking one bit per octet (the high bit). Has this potential issue been considered already?
When encrypting binary data by using the encoding(octet) option of library(crypto), the characters are first transformed to actual bytes (u8), all in the range 0..255.
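Seen from the Prolog side, this transformation amounts to mapping each character to its code and rejecting anything above 255. The sketch below only illustrates that relation: octet_chars_bytes/2 and octet_char_byte/2 are hypothetical names, the error term is an assumption, and this is not the code library(crypto) actually uses.

```prolog
:- use_module(library(lists)).

% octet_chars_bytes(+Cs, -Bs): Bs are the byte values of the characters Cs.
% Hypothetical helper, shown for illustration only.
octet_chars_bytes(Cs, Bs) :-
    maplist(octet_char_byte, Cs, Bs).

% octet_char_byte(+C, -B): B is the code of C, which must not exceed 255.
octet_char_byte(C, B) :-
    char_code(C, B),
    (   B =< 255 -> true
    ;   throw(error(domain_error(byte_char, C), octet_chars_bytes/2))
    ).
```

For example, octet_chars_bytes([a, b, c], Bs) gives Bs = [97,98,99]. After this step each octet occupies exactly one byte, regardless of how the characters were stored as a string.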
It seems this issue has drifted onto a bit of a side track. Any other types?
not_less_than_zero is made available as part of #1593!
Currently must_be/2 supports some types of 7.12.2 b and some informal ones such as chars. Further candidates would be those in 8.1.2.1 and, in general, the domain_errors of 7.12.2 c. This would help to make errors more uniform, in particular the different reporting for list and character for chars and the like.
The following have occurred so far:
in_character (type error)
not_less_than_zero (type_error(integer, I) and domain_error)
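The two error cases listed for not_less_than_zero can be illustrated with a hand-written check. This is a minimal sketch, assuming ISO-style error terms; check_not_less_than_zero/1 is a hypothetical name, the instantiation error for unbound arguments is an added assumption, and this is not the code of library(error).

```prolog
% check_not_less_than_zero(+I): succeed if I is a non-negative integer.
% Hypothetical helper, shown for illustration only.
check_not_less_than_zero(I) :-
    (   var(I) ->
        throw(error(instantiation_error, check_not_less_than_zero/1))
    ;   \+ integer(I) ->
        % Not an integer at all: a type error.
        throw(error(type_error(integer, I), check_not_less_than_zero/1))
    ;   I < 0 ->
        % An integer, but outside the permitted domain.
        throw(error(domain_error(not_less_than_zero, I), check_not_less_than_zero/1))
    ;   true
    ).
```

Here check_not_less_than_zero(a) raises type_error(integer, a), check_not_less_than_zero(-1) raises domain_error(not_less_than_zero, -1), and check_not_less_than_zero(0) succeeds.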