But sometimes we need the length of the string in order to output the string (most notably for Pascal-style, length-prefixed strings). I personally came across this issue while trying to write a Lua bytecode ruledef (yes, I know I'm a weirdo). Lua strings are prefixed with their length as a variable-length integer (the stored size is the length plus one), so a 126-character string (excessive, but entirely possible to have in your program) would be encoded as 0xff followed by 126 characters, while a 127-character string would be encoded as 0x01 0x80 followed by 127 characters.
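To make the encoding concrete, here is a minimal Python sketch of the size prefix. The layout is inferred only from the two examples above (stored size is the length plus one, split into 7-bit groups with the most significant group first, and the high bit set on the final byte). The names `encode_size` and `encode_string` are hypothetical helpers for illustration and are not part of Lua or customasm.

```python
def encode_size(n: int) -> bytes:
    """Encode n as 7-bit groups, most significant first.

    The high bit is set only on the final byte (layout inferred from the
    0xff and 0x01 0x80 examples in the text).
    """
    if n < 0:
        raise ValueError("size must be non-negative")
    groups = []
    while True:
        groups.append(n & 0x7F)  # collect the low 7 bits first
        n >>= 7
        if n == 0:
            break
    groups.reverse()             # most significant group goes first
    groups[-1] |= 0x80           # mark the final byte
    return bytes(groups)


def encode_string(data: bytes) -> bytes:
    """Prefix data with its size; the stored size is the byte length plus one."""
    return encode_size(len(data) + 1) + data


print(encode_string(b"=stdin").hex())  # 873d737464696e
```

Under these assumptions a 126-byte string gets the single prefix byte 0xff and a 127-byte string gets 0x01 0x80, which is why one fixed-width `size` rule cannot cover every length.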
This use case can't rely on "a bit of arithmetic", since by the time customasm has emitted the string it's already too late to do anything with its length. Using an asm block and the whole "you can refer to a variable before it exists" thing doesn't work either, since customasm just chokes on not being able to find the variable's value. You can do the "bit of arithmetic" outside of a ruledef, but then it looks sloppy and is more difficult to use (see below):
```asm
#ruledef {
    size {num} => {
        assert(num >= 0)
        assert(num <= 0x7e) ; num + 1 must still fit in 7 bits
        0b1 @ (num + 1)`7
    }
    ; presumably other definitions of size {num} for larger sizes
}

; you have to do this every time you want to emit a string
; (you'd have to do something similar to emit a string containing binary data,
; since strings are stored as String on the backend and not OsString, but that's for another time)
size len ; 87
old = $
#d "=stdin"
len = $ - old
```
The solution I came up with is to add a builtin `strlen` function, which simply returns the string's length (in bytes) as an integer. This solves the use case above, as I can now do the following (compare the code block above):
```asm
#ruledef
{
    size {num} => {
        assert(num>0)
        assert(num<=0x7e)
        0b1 @ (num+1)`7
    }
    ; presumably other definitions of size {num} for larger sizes
    str {x} => asm {
        size strlen({x})
    } @ x
}

str "=stdin" ; 87 "=stdin"
```
This looks good! In the future, I'd like to go even further and add functionality to get the bit-size of any kind of value (#95), or the data pointed to by a label (as in #167).
According to the docs: