Fixed-width string types for facilitating certain string workflows in Julia
The package is registered in the General registry and so can be installed with Pkg.add
.
julia> using Pkg; Pkg.add("InlineStrings")
The package is tested against the latest Julia 1.x
release, the long-term support release 1.6
, and nightly
on Linux, OS X, and Windows.
Contributions are very welcome, as are feature requests and suggestions. Please open an issue if you encounter any problems or would just like to ask a question.
InlineString
A set of custom string types of various fixed sizes. Each inline string is a custom primitive type and can benefit from being stack friendly by avoiding allocations/heap tracking in the GC. When used in an array, the elements are able to be stored inline since each one has a fixed size. Currently support inline strings from 1 byte up to 255 bytes.
The following types are supported: String1
, String3
, String7
, String15
,
String31
, String63
, String127
, String255
.
InlineStrings can be constructed by passing an AbstractString
, byte vector
(AbstractVector{UInt8}
) and position and length, or a pointer (Ptr{UInt8}
)
and length. See the docstring for any individual InlineString type for more details
(like ?String3
).
To generically turn any string into the smallest possible InlineString type,
call InlineString(x)
. To convert any iterator/array of strings to a single,
promoted InlineString
type, the inlinestrings(A)
utility function is exported.
An inline string is value equal (==
), but not identical equal (===
) to the corresponding String
:
julia> i = String15("abc")
"abc"
julia> s = "abc"
"abc"
julia> i == s
true
julia> i === s
false
The hash value of an inline string is the same as it's String
counterpart. Continuing the above example:
julia> Dict(s => 1, i => 2)
Dict{AbstractString, Int64} with 1 entry:
String15("abc") => 2
julia> y[i], y[s]
(2, 2)
julia> Dict(i => 1, i => 2)
Dict{String15, Int64} with 1 entry:
"abc" => 2
Note how the Dict
is passed two key-value pairs but the result contains only one.
This is because we've only given the dictionary a single key (Dict
s use the object's hash as it's underlying key), since i
and s
implement the same hash:
julia> hash(i)
0x7690a66f302b3f70
julia> hash(s)
0x7690a66f302b3f70
Use the REPL's help mode to see more details, such as those desribed here for String15
:
help?> String15
String15(str::AbstractString)
String15(bytes::AbstractVector{UInt8}, pos, len)
String15(ptr::Ptr{UInt8}, [len])
Custom fixed-size string with a fixed size of 16 bytes. 1 byte is used to store the length of the string. If an inline
string is shorter than 15 bytes, the entire string still occupies the full 16 bytes since they are, by definition,
fixed size. Otherwise, they can be treated just like normal String values. Note that sizeof(x) will return the # of
codeunits in an String15 like String, not the total fixed size. For the fixed size, call sizeof(String15). String15 can
be constructed from an existing String (String15(x::AbstractString)), from a byte buffer with position and length
(String15(buf, pos, len)), from a pointer with optional length (String15(ptr, len)) or built iteratively by starting
with x = String15() and calling x, overflowed = InlineStrings.addcodeunit(x, b::UInt8) which returns a new String15
with the new codeunit b appended and an overflowed Bool value indicating whether too many codeunits have been appended
for the fixed size. When constructed from a pointer, note that the ptr must point to valid memory or program data may
become corrupt. If the len argument is specified with the pointer, it must fit within the fixed size of String15; if no
length is provided, the C-string is assumed to be NUL-terminated. If the NUL-terminated string ends up longer than can
fit in String15, an ArgumentError will be thrown.