brenhinkeller / StaticTools.jl

Enabling StaticCompiler.jl-based compilation of (some) Julia code to standalone native binaries by avoiding GC allocations and llvmcall-ing all the things!

Iteration issue with StaticString #39

Closed tshort closed 1 year ago

tshort commented 1 year ago

I'm trying to get the following code to work:

using CodeInfoTools
using MacroTools
using StaticTools

swap(e) = e

function swap(e::Expr)
    new = MacroTools.postwalk(e) do s
        @show s
        s isa String && return StaticTools.StaticString(tuple(codeunits(s)..., 0x00))
        return s
    end
    return new
end

function transform(src)
    b = CodeInfoTools.Builder(src)
    for (v, st) in b
        b[v] = swap(st)
    end
    return CodeInfoTools.finish(b)
end

function stringfun(s1, s2)
    return s1 * s2
end

function teststring()
    return stringfun("ab", "c") == "abc"
end

ci = code_info(teststring)
cit = transform(ci)

On StaticTools v0.8.5 (latest), it gets stuck in an infinite loop. On StaticTools v0.8.3, it fails with a complaint about iteration, but it works if I define the following method. Defining the same method on v0.8.5 doesn't help.

String(s::StaticString) = Base.unsafe_string(pointer(s))

Any ideas?

brenhinkeller commented 1 year ago

Oh interesting.. One thing that did change between v0.8.3 and now is that we added iteration support for StaticStrings (https://github.com/brenhinkeller/StaticTools.jl/pull/36), since that is necessary for things like startswith and endswith, but I'm not sure why that would be causing problems here.. the MacroTools stuff is approximately black magic to me though 😆

This part seems to be fine, in any case

julia> using StaticTools

julia> s = "ab"
"ab"

julia> StaticTools.StaticString(tuple(codeunits(s)..., 0x00))
c"ab"

tshort commented 1 year ago

This code works if I use StaticStrings.StaticString. That package also defines iteration and defines String.

The MacroTools stuff is also a bit deep for me.

brenhinkeller commented 1 year ago

Ah, so we did have to change the iteration interface a bit for StaticStrings vs. Base.Strings, because the default iteration interface for Base.String is (!?!) explicitly type-unstable (it returns nothing once you get to the end of the string), and therefore not statically compilable. I suppose that could cause some problems if something here is expecting them to follow the same iteration protocol as Base Strings, but it's also not so clear to me why they're getting iterated at all here?
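
For reference, here is a minimal sketch of the Base iteration protocol on an ordinary String, showing where the nothing comes from. The helper name collect_chars is made up for this illustration; the loop inside it is roughly what a for loop lowers to.

```julia
# Base iteration protocol on an ordinary String.
s = "ab"

step1 = iterate(s)            # first character and the next state
step2 = iterate(s, step1[2])  # second character and the next state
step3 = iterate(s, step2[2])  # past the end of the string: `nothing`

# Hypothetical helper: a `for` loop is lowered to roughly this pattern.
function collect_chars(str)
    out = Char[]
    next = iterate(str)
    while next !== nothing        # termination is signalled by `nothing`
        (c, state) = next
        push!(out, c)
        next = iterate(str, state)
    end
    return out
end

chars = collect_chars(s)
```

So each call to iterate returns either a (value, state) tuple or nothing, which is the type instability being discussed.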

No idea if it'll help, but you might try using the conversion methods here https://github.com/brenhinkeller/StaticTools.jl/commit/5fb68d71d03b700be61d27db94dd03eebf8984da on StaticTools#main (can do just StaticString(s) instead of StaticString(tuple(codeunits(s)..., 0x00)))

brenhinkeller commented 1 year ago

I guess if it was able to find some sort of working fallback before we defined iterate, we could possibly also just rename our iteration protocol stableiterate or something?

tshort commented 1 year ago

Here's a shorter reproducer:

String(c"abc")

The nonstandard iteration protocol is the problem. It's trying to run this code, and the loop never finishes.
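
A minimal sketch of how this kind of hang can arise, assuming a type whose iterate never returns nothing. The Endless type and count_steps helper are hypothetical and are not the actual StaticTools implementation; the step cap exists only so the example terminates.

```julia
# Hypothetical type, for illustration only: its `iterate` never returns
# `nothing`, so it does not follow the Base protocol.
struct Endless
    data::NTuple{3, UInt8}
end

# Type-stable but nonstandard: always returns a (value, state) tuple,
# yielding 0x00 once the data is exhausted.
Base.iterate(e::Endless, i::Int = 1) =
    i <= length(e.data) ? (e.data[i], i + 1) : (0x00, i + 1)

# Generic code that waits for `nothing` would spin forever on `Endless`,
# so this hypothetical helper caps the number of steps.
function count_steps(itr, cap)
    n = 0
    next = iterate(itr)
    while next !== nothing && n < cap
        n += 1
        next = iterate(itr, next[2])
    end
    return n
end

steps_endless = count_steps(Endless((0x61, 0x62, 0x63)), 1000)  # runs into the cap
steps_string = count_steps("abc", 1000)                          # stops at the end of the string
```

Any generic Base method that loops until iterate returns nothing would behave like the uncapped version of count_steps on such a type.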

So, I think we need to define String and remove this iteration protocol.

I also wonder if the standard iteration with nothing could be compiled statically. Can Julia's small-union features take care of the type stability?

brenhinkeller commented 1 year ago

Sadly no -- we only added this type-stable iterate after finding that the Base version could not be made to compile.

There might be some cases where if it's statically knowable / constprop'd that you never hit the end of the string, then you never get the instability -- but more generally things like the union splitting AFAIU only improve performance, and don't actually eliminate the instability
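
To see the instability in question, one can ask the compiler what it infers for a single iteration step on a Base String. This is only a sketch: Base.return_types is an internal reflection helper, and the exact printed type may differ between Julia versions.

```julia
# Inferred return type(s) for one step of String iteration.
rts = Base.return_types(iterate, Tuple{String, Int})
rt = first(rts)

println(rt)  # expected to be a small Union that includes Nothing

# `nothing` is a possible result, so the return type is not one concrete type.
includes_nothing = Nothing <: rt
```

Union splitting lets the compiler branch efficiently on such a result, but the return type is still a Union rather than a single concrete type.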

Ah, so it looks like all we actually need to do here is beat Base.String(s::AbstractString) in the dispatch hierarchy, so e.g.

julia> @inline Base.String(s::StaticString{N}) where {N} = Base.unsafe_string(pointer(s))

julia> String(c"abc")
"abc"

seems to work. But I'll rename the iteration procedure to be safe anyways.
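
The dispatch point can be sketched with toy definitions: a method written for a concrete subtype is more specific than one written for AbstractString, so it wins. TinyString and describe are hypothetical names used only for this illustration; they are not part of StaticTools.

```julia
# Hypothetical string type, for illustration only.
struct TinyString <: AbstractString
    bytes::NTuple{3, UInt8}
end

# Generic fallback, standing in for a method like String(s::AbstractString).
describe(s::AbstractString) = "generic fallback"

# More specific method: dispatch picks this one for a TinyString argument.
describe(s::TinyString) = "specialized method"

a = describe("abc")
b = describe(TinyString((0x61, 0x62, 0x63)))
```

With the specialized method in place, the generic fallback (and whatever iteration it performs) is never reached for that type.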

tshort commented 1 year ago

I tried statically compiling a function using iteration with StaticStrings, and it seems to work. It uses nothing-style iteration. This seems like an example where this iteration is statically compilable.

using StaticCompiler
using StaticStrings

function teststring(s)
    res = 0
    for c in s
        res += Int(c)
    end
    return res
end

_, path = compile(teststring, (StaticString{20},))
@show load_function(path)(static"jkjk"20)

brenhinkeller commented 1 year ago

Can you compile_executable or compile_shlib with it?

brenhinkeller commented 1 year ago

If you can, then perhaps it's worth adding that here, even if we couldn't get startswith and friends to statically compile with the type-unstable iterate

tshort commented 1 year ago

I got the following to work. I'm not sure if it covers all iteration scenarios.

using StaticCompiler
using StaticStrings
using Libdl

function stringfun(i)
    for c in static"qwerty"20
        i += Int(c)
    end
    return i
end

name = repr(stringfun)
filepath = compile_shlib(stringfun, (Int,), "./", name)

# Open dylib
ptr = Libdl.dlopen(filepath, Libdl.RTLD_LOCAL)
fptr = Libdl.dlsym(ptr, "julia_$name")
ccall(fptr, Int, (Int,), 10)

brenhinkeller commented 1 year ago

Hmm, ok that might be enough of a reason to support type-unstable iteration -- though unless we missed something in https://github.com/brenhinkeller/StaticTools.jl/issues/35 we weren't able to get it to compile for startswith