tshort / StaticCompiler.jl

Compiles Julia code to a standalone library (experimental)

Question on use on Windows; with or without Python #137

Closed PallHaraldsson closed 4 months ago

PallHaraldsson commented 11 months ago

I suppose you do not support Windows because you produce ELF (but could produce binaries easily for Windows; ELF, or whatever you do, works for macOS too?).

I'm thinking that since you document using compiled libraries with Python, and Python itself works on Windows, it could work, but I still doubt that, because the libraries would be Windows-specific. Are the libraries ELF? Is that the problem, and/or is it about the calling convention?

You make it sound as if compiled code should work because of the shared C layout, but that only holds at the lowest level. I think you would also have a problem with strings (which you don't support anyway, only a substitute, and that would have the same problem), until Python supports UTF-8 fully (in 3.15?). I know you can call back and forth with PythonCall.jl, and I guess it takes care of converting strings in both directions. But you can't use this with that, or reuse its string code? You would also have a problem with Dicts, which don't have the same memory layout in Python (besides being ordered by default in Python by now, I think conceptually similar to DefaultOrderedDict, though not the same layout). You would not be able to use Dicts (or Sets) anyway, since they rely on (automatic) memory allocation? Nor could you use Python's dicts? I'm wondering which of Julia's basic data structures you could use with Python, and it would basically be only arrays (only 1D?) and scalar types, except for Float16?

All the issues I list would also apply to WebAssembly, which I see you now document? At least some of the string issues?

brenhinkeller commented 11 months ago

The actual binaries are made using Clang_jll, so should be windows-format on windows. However, last time we tried things on windows CI things didn’t seem to work, and I’m not sure if any of the devs have a windows machine (I don’t). PRs welcome if you can get things working on Windows (or if it’s suddenly started working).

tshort commented 11 months ago

As @brenhinkeller said, getting Windows to work is mainly about the compiler/linker. compile_wasm does appear to work on Windows. compile_shlib and compile_executable get as far as creating an object file. They fail on linking to a dll or exe.

PackageCompiler solves linking on Windows by including the mingw compiler as an artifact (see here).

Thomas008 commented 7 months ago

We have a strong need to get StaticCompiler to work on Windows. We tried compile_executable:

    using StaticCompiler, StaticTools
    hello() = println(c"Hello, world!")
    compile_executable(hello, (), "./")

It created the object file, as mentioned. Clang then failed at the link step, which uses ld. When the clang.exe from the artifact is used, we get these error messages:

    "ld" -m i386pep -Bdynamic -o ./hello.exe crt2.o crtbegin.o "-LC:\Users\T460\.julia\artifacts\0d4e66a7641d78b6976a9b68ac8c96b98ef4d586\x86_64-w64-mingw32\lib" "-LC:\Users\T460\.julia\artifacts\0d4e66a7641d78b6976a9b68ac8c96b98ef4d586\x86_64-w64-mingw32\mingw/lib" "-LC:\Users\T460\.julia\artifacts\0d4e66a7641d78b6976a9b68ac8c96b98ef4d586\lib" "-LC:\Users\T460\.julia\artifacts\0d4e66a7641d78b6976a9b68ac8c96b98ef4d586\x86_64-w64-mingw32/sys-root/mingw/lib" "-LC:\Users\T460\.julia\artifacts\0d4e66a7641d78b6976a9b68ac8c96b98ef4d586\lib\clang\15.0.7\lib\windows" "C:\Users\T460\AppData\Local\Temp\wrapper-e6eb70.o" ./hello.o -lmingw32 -lgcc -lgcc_eh -lmoldname -lmingwex -lmsvcrt -ladvapi32 -lshell32 -luser32 -lkernel32 -lmingw32 -lgcc -lgcc_eh -lmoldname -lmingwex -lmsvcrt -lkernel32 crtend.o
    clang: error: unable to execute command: program not executable
    clang: error: linker command failed with exit code 1 (use -v to see invocation)

Then we tried clang as installed via Visual Studio. This produced an error message saying that the object file (hello.o) is invalid or corrupt and cannot be read at a certain address:

    clang -c wrapper.c
    clang -v -o tes.exe wrapper.o hello.o
    clang version 17.0.1
    Target: x86_64-pc-windows-msvc
    Thread model: posix
    InstalledDir: C:\Program Files\LLVM\bin
    "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\bin\Hostx64\x64\link.exe" -out:tes.exe -defaultlib:libcmt -defaultlib:oldnames "-libpath:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\lib\x64" "-libpath:C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\atlmfc\lib\x64" "-libpath:C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22000.0\ucrt\x64" "-libpath:C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22000.0\um\x64" "-libpath:C:\Program Files\LLVM\lib\clang\17\lib\windows" -nologo wrapper.o hello.o
    hello.o : fatal error LNK1107: Invalid or corrupt file: reading in 0x3F8 not possible.

We would be happy if you had any further suggestions.

tshort commented 7 months ago

You might want to try the compiler that PackageCompiler installs as an artifact. See https://github.com/JuliaLang/PackageCompiler.jl/blob/master/Artifacts.toml.

MasonProtter commented 7 months ago

And if you figure out how to get it working, please file a PR.

PallHaraldsson commented 7 months ago

> We have a strong need to get StaticCompiler work on Windows.

Then please file a PR, if you get it to work on Windows.

[I hope you know you can already compile (Windows) apps with PackageCompiler.jl, with all features available, and that while the binaries are large, they can optionally be made smaller, e.g. by dropping LLVM and OpenBLAS.]

However, note that you can also compile to a Linux app (or library) with this as is, and run it on Windows, i.e. under WSL2. Then it's strictly speaking a Linux app, running under the Linux kernel in Windows, but that may not matter.

So what are you doing, making an app, or a library? If a library, I think you can also start Python on Windows and use a Julia compiled library from here (or was it from another project?).

See also: https://github.com/tshort/StaticCompiler.jl/issues/119#issuecomment-1805586796

Another option is WebAssemblyCompiler.jl, which has fewer limitations and has GC, at least with Chrome v119. WebAssembly is not just for web browsers (its original goal); it can also be used for regular apps, e.g. on Windows.

Microsoft has some docs on it, so is supporting it in some way, but also doesn't need to, it works without their help:

https://developer.mozilla.org/en-US/docs/WebAssembly/C_to_Wasm

MasonProtter commented 7 months ago

> See also: #119 (comment)

They're using compile_executable, not compile.

Thomas008 commented 7 months ago

Thank you! We have now used the compiler that PackageCompiler installed as an artifact. (It seems to be gcc (not clang) from mingw 64.) It happily produced an executable file without error. Unfortunately no output ("hello world") is printed. The function is `hello() = println(c"Hello, world!")`. Do you have any experience or suggestions to get further?

Thomas008 commented 7 months ago

We used the compiler at https://github.com/JuliaLang/PackageCompiler.jl/releases/download/v1.0.0/x86_64-8.1.0-release-posix-seh-rt_v6-rev0.tar.gz, as listed in PackageCompiler's Artifacts.toml for x86_64, which you suggested: https://github.com/JuliaLang/PackageCompiler.jl/blob/master/Artifacts.toml

brenhinkeller commented 7 months ago

Have you tried flushing stdout? Sometimes things seem to get stuck in the queue

brenhinkeller commented 7 months ago

The other possibility may be that printing needs special handling on windows -- since that looks like a StaticTools.jl StaticString, println will dispatch to this implementation of puts under the hood: https://github.com/brenhinkeller/StaticTools.jl/blob/3b81edfc8bda58f6aa5e8ec2e4c099cd918b89a7/src/llvmio.jl#L576-L591

That implementation seems to work ok across Mac and Linux, but as I don't have any Windows machines I've not tried on Windows. If that seems to be the problem, PRs welcome on StaticTools as well!

Thomas008 commented 7 months ago

We tried flush(stdout) and puts as you suggested, but unfortunately we got no output. When looking at the error status with echo $? we get False. Nevertheless gcc in the artifact installed by PackageCompiler seems to be the right way: When using a primitive function that just returns a number (e.g. hello() = 2) we get the status True. And when compiling a time-consuming calculation, the generated executable program takes time as well. For example,

```julia
using StaticCompiler, StaticTools

function calc(argc::Int, argv::Ptr{Ptr{UInt8}})
    num = argparse(Int64, argv, 2)  # parse the first command-line argument
    b = 0
    for i in 1:num
        b = b + i*i + i^2 / i^3 + i^4
    end
    0
end

compile_executable(calc, (Int64, Ptr{Ptr{UInt8}}), "./")
```

When calling `calc <number>`, the executable takes more time the higher the given number.

So we conclude gcc from the PackageCompiler works. We only need the possibility to print, to get an output. Do you have any ideas how to do that?
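A side note on the status check above: in PowerShell, `$?` only reports success as True or False, while the numeric exit code is in `$LASTEXITCODE`. The exit code can also be read from Julia. The sketch below is only an illustration: it uses a child Julia process that exits with status 3 as a stand-in for the compiled executable, so the command itself is a placeholder.

```julia
# Launch a child process and read its numeric exit code.
# The child is Julia itself exiting with status 3; replace `child` with the
# path to the compiled executable (hypothetical here) to inspect that instead.
child = `$(Base.julia_cmd()) --startup-file=no -e "exit(3)"`

# ignorestatus keeps `run` from throwing when the exit code is nonzero.
proc = run(ignorestatus(child))

println("exit code: ", proc.exitcode)
```

A nonzero code would at least distinguish a crash from a program that ran to completion without printing anything.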

tshort commented 7 months ago

Do you have a repo to play with? Also, does printing StaticStrings work at the REPL on windows?

MasonProtter commented 7 months ago

Aren't strings on Windows different from Unix descended operating systems? Something about them not being null terminated if I recall correctly. So I'd think string operations would need some adjustment in StaticTools to accommodate windows.

PallHaraldsson commented 7 months ago

There are many types of Windows apps, and they are encoded in the .EXE header.

There are at least 2-3 to worry about. One is the typical GUI app (maybe that's what you're compiling), where I'm not sure where you can print to, or whether stdout is connected to anything at all.

Then there are console apps, likely what you want ("CUI", if I recall). There's also a POSIX CUI type, for the POSIX subsystem, which may not apply.

One other thing: Windows uses wide strings by default, i.e. UTF-16, not UTF-8, but UTF-8 is also fully supported through the older "ANSI" API, which then no longer means ANSI. It likely needs to be configured, and I'm not sure how. The ANSI API was for the different 8-bit code pages; now it serves those plus UTF-8, which is code page 65001 in Windows (also CP_UTF8, likely an enum). On Xbox, only UTF-8 is supported by the ANSI API, and that is one more type of Windows program (it would be interesting to know whether it could be supported...). I still think it would work for all types, but I'm not sure.

[If strings are not 0-terminated in Windows, is that for sure, then you need to have another way to state how long...]

[Then there is WSL2 on Windows, which would run typical Linux binaries, i.e. ELF, with no need for an .EXE ending. That's also POSIX, I think not to be confused with what I mentioned above, nor do I know when you would use that, but you could try. There are also other types, e.g. the OS/2 subsystem... You could try more than one to see if anything works... though it's very unlikely to support UTF-8.]

PallHaraldsson commented 7 months ago

https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

Win32 APIs often support both -A and -W variants.

The ANSI -A API should be "often" supported, i.e. for UTF-8, at least for most common operations, such as printing (and getting input). I think that's what they mean, i.e. -W should always be supported if you rather want to use that, then no setup needed, only conversion:

Because Windows operates natively in UTF-16 (WCHAR), you might need to convert UTF-8 data to UTF-16 (or vice versa) to interoperate with Windows APIs.

MultiByteToWideChar and WideCharToMultiByte let you convert between UTF-8 and UTF-16 (WCHAR) (and other code pages). This is particularly useful when a legacy Win32 API might only understand WCHAR.

https://learn.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-multibytetowidechar

[in] CodePage

Note For the code page 65001 (UTF-8) or the code page 54936 (GB18030, Windows Vista and later), dwFlags must be set to either 0 or WC_ERR_INVALID_CHARS. Otherwise, the function fails with ERROR_INVALID_FLAGS.

[in] cchWideChar

Size, in characters, of the string indicated by lpWideCharStr. Alternatively, this parameter can be set to -1 if the string is null-terminated. If cchWideChar is set to 0, the function fails.

If this parameter is -1, the function processes the entire input string, including the terminating null character. Therefore, the resulting character string has a terminating null character, and the length returned by the function includes this character.

This way isn't necessarily better, since you need:

[out, optional] lpMultiByteStr

Pointer to a buffer that receives the converted string.

I.e. it seems to me you must know the maximum size in advance (times 4 bytes, not times 2, because of possible surrogates) for UTF-16, so in effect you need to scan the string to get the length in bytes (or chars), malloc the buffer, and I suppose free it, or use C++ to do it for you.
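To make the sizing question concrete, here is a small sketch in plain Julia that compares the space a string needs in UTF-8 and in UTF-16. It uses only `transcode` from Base and calls no Windows API; the helper name `utf_sizes` is made up for this illustration. For well-formed text, the number of UTF-16 code units never exceeds the number of UTF-8 bytes, which gives one possible upper bound for a UTF-8 to UTF-16 buffer. The Win32 conversion functions can also report the required size when called with an output size of 0.

```julia
# Compare how much space a string needs in UTF-8 and in UTF-16.
# Returns (UTF-8 bytes, UTF-16 code units, UTF-16 bytes).
function utf_sizes(s::AbstractString)
    str = String(s)
    utf8_bytes = ncodeunits(str)       # UTF-8 code units are bytes
    utf16 = transcode(UInt16, str)     # Vector{UInt16}, no trailing NUL
    return (utf8_bytes, length(utf16), 2 * length(utf16))
end

for s in ("Hello, world!", "€", "😀")
    u8, u16, u16b = utf_sizes(s)
    println(repr(s), ": UTF-8 bytes = ", u8,
            ", UTF-16 units = ", u16, ", UTF-16 bytes = ", u16b)
end
```

Characters outside the Basic Multilingual Plane, such as the emoji, take a surrogate pair (two code units) in UTF-16 and four bytes in UTF-8.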

So it's possibly faster to use the -A ANSI API, since Windows may use UTF-8 directly without translating it, at least for some APIs, as that would be faster. But I'm not sure it does; it might just convert implicitly each time you call the API (and clean up after), even for all APIs.

In that case the conversion needs to happen anyway, and if you call e.g. print repeatedly, it might be faster to do the conversion yourself and use the -W API, to avoid redundant conversions. Then all APIs will also work. You might then also consider just using UTF-16, but note:

http://utf8everywhere.org/

and Microsoft agrees and now recommends UTF-8 use and: https://learn.microsoft.com/en-us/gaming/gdk/_content/gc/system/overviews/utf-8

"Windows [is] moving forward to support UTF-8 to remove this unique burden [resulting] in fewer internationalization issues in apps and games"

https://en.wikipedia.org/wiki/Unicode_in_Microsoft_Windows

[The Chinese GB18030 Unicode Transformation Format is actually intriguing, since it's ALWAYS more space-efficient than UTF16, for any language and character, not just Chinese, so UTF-16 was just a historical accident. It, like UTF-8, includes ASCII as a subset, and almost always matches UTF-8 in size, and is always better for East Asian languages.]

Thomas008 commented 7 months ago

@tshort: Printing StaticStrings at the REPL on Windows seems to work: `println(c"Hello, world!")` =>

Hello, world!
0

So far we have no repo, but it could be useful. Thank you for the suggestions; we will try to work through them.

Thomas008 commented 7 months ago

From what I have read, Windows does not seem to be restricted to prefixed length strings. At least e.g. the CRT routines "operate on null-terminated single-byte character, wide-character, and multibyte-character strings" https://learn.microsoft.com/en-us/cpp/c-runtime-library/string-manipulation-crt?view=msvc-170
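As a quick check of the null-termination point from the Julia side, the following sketch uses only Base. It assumes the C runtime's `strlen` is resolvable through `ccall` in the running Julia process, which is normally the case but has not been verified here on Windows.

```julia
# Julia `String`s are stored with a trailing NUL byte, so they can be passed
# to C functions that expect null-terminated strings via `Cstring`.
s = "Hello, world!"

# strlen from the C runtime counts bytes up to the first NUL.
len = ccall(:strlen, Csize_t, (Cstring,), s)
println("strlen: ", Int(len), ", ncodeunits: ", ncodeunits(s))

# The byte just past the string's contents is the terminator.
terminator = GC.@preserve s unsafe_load(pointer(s), ncodeunits(s) + 1)
println("byte after the last character: ", repr(terminator))
```

If both numbers agree and the trailing byte is zero, the string data itself is in the form the CRT routines expect, which would point the search toward how the output function is called rather than toward the string layout.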

When testing StaticTools on Windows, stdoutp(), stderrp(), and stdinp() do not work in the REPL, but putchar(), newline(), getchar(), and all the higher-level things do. When compiling hello() = putchar('h') or just newline(), the executable prints just a "smiley" character, probably indicating that some pointer, type, or something else went wrong. When compiling hello() = getchar(), the executable waits for input, so that worked. The question remains how to print a string to the screen.

As @brenhinkeller mentioned, the implementation of puts in llvmio.jl in StaticTools.jl could be adapted for Windows.

Accommodating string representation and output in StaticTools for Windows would be very beneficial. What would be necessary to do that?

tshort commented 7 months ago

If you publish a repo with what you have, it'll give others a starting point to try stuff.

For the executable you have, you might want to try (1) disassemble it to see what it's doing, (2) run it in a debugger to see what it's doing, and/or (3) disassemble a hello world program written in another language to see what it's doing. One thing to look for is if/how it's calling the C standard library for putchar or other function.

Thomas008 commented 6 months ago

@tshort: What should a repo contain so that it best serves other people as a starting point?

tshort commented 6 months ago

The easiest repo is a fork of StaticCompiler with whatever extras you added to make Windows work better.

Thomas008 commented 5 months ago

We are still working on adapting StaticCompiler to Windows. We hope to make such a repo ready for testing soon.

Thomas008 commented 5 months ago

We used llc to get a native object file (instead of using GPUCompiler), and now it works fine for Windows. I imagine there are some reasons why StaticCompiler does not use llc?

tshort commented 5 months ago

If llc works, I don't see a problem of switching to that. The existing code uses clang, but I thought that used llc behind the scenes, so I'm confused as to why straight llc works. Did you need special arguments to llc?

Also, where does the llc you use come from?

Thomas008 commented 5 months ago

Yes, of course you're right: llc has meanwhile been integrated into clang. clang can generate an executable directly from LLVM IR (an .ll file) without necessarily generating an .o file. So the command `clang -Wno-override-module $wrapper_path $ir_path -o $exec_path` (with `ir_path = joinpath(path, "$filename.ll")` in `generate_executable()`) suffices and generates an executable (without saving an .o file). On my Windows system clang gives a warning stating that the target triple was overridden, but the generated executable works fine.

When using llc, I had to replace the target triple: `llc -filetype=obj -mtriple=x86_64-w64-windows-gnu $ir_path -o $obj_path`. It replaced "x86_64-w64-mingw32", but I'm not sure whether this is the case on all Windows machines. After that, of course, I had to link with clang or gcc as you do in the code: run(`$cc $wrapper_path $cflags $obj_path -o $exec_path`)
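The two variants described above can be written down as Julia `Cmd` objects. This is only a sketch: it builds the commands without running them, assumes `clang` and `llc` are on the PATH, uses placeholder paths, and leaves out `$cflags`.

```julia
# Build (but do not run) the two command variants.
# All paths are placeholders chosen for illustration.
path = "./"
filename = "hello"
wrapper_path = joinpath(path, "wrapper.c")
ir_path      = joinpath(path, "$filename.ll")
obj_path     = joinpath(path, "$filename.o")
exec_path    = joinpath(path, filename)

# Variant 1: clang compiles the LLVM IR and the C wrapper in one step.
clang_direct = `clang -Wno-override-module $wrapper_path $ir_path -o $exec_path`

# Variant 2: llc produces the object file, then clang links it with the wrapper.
llc_step  = `llc -filetype=obj -mtriple=x86_64-w64-windows-gnu $ir_path -o $obj_path`
link_step = `clang $wrapper_path $obj_path -o $exec_path`

foreach(println, (clang_direct, llc_step, link_step))
# To execute for real: run(clang_direct), or run(llc_step) followed by run(link_step)
```

Printing the commands first makes it easy to compare them with what `clang -v` reports before anything is executed.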

PallHaraldsson commented 5 months ago

It seems Windows is solved?!

What I see is done:

cc = Sys.isapple() ? `cc` : clang()
    # Compile!
[..]
        print(f, """int $fn(int argc, char** argv);
        void* __stack_chk_guard = (void*) $(rand(UInt) >> 1);

        int main(int argc, char** argv)
        {
            $fn(argc, argv);
            return 0;
        }""")
        close(f)
        run(`$cc $wrapper_path $cflags $obj_path -o $exec_path`)

I.e. standard C program wrapper (also legal C++? though unimportant detail), and C compiler used (unclear if failing/exceptions still return 0, at least exit(non-zero_value) possible).
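On the return-value remark: the wrapper above always returns 0 from `main`, whatever the compiled function returns. A hypothetical variant that forwards the function's return value as the exit status could be generated as sketched below. The helper `wrapper_source` is made up for this illustration and is not part of StaticCompiler; it also omits the `__stack_chk_guard` line for brevity.

```julia
# Hypothetical variant of the wrapper text: forward the compiled function's
# return value as the process exit status instead of always returning 0.
function wrapper_source(fn::AbstractString)
    return """
    int $fn(int argc, char** argv);

    int main(int argc, char** argv)
    {
        return $fn(argc, argv);
    }
    """
end

print(wrapper_source("calc"))
```

With such a wrapper, a function ending in `0` would still exit with status 0, while any other return value would show up in the shell's exit code.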

On Windows (and Linux), clang() then gives you "clang" (on macOS "cc" is used instead; why, is it a synonym for clang?) and uses Clang_jll (I suppose; I didn't track down where clang() comes from).

I see Clang_jll was upgraded to 17.0.6, 3 days ago, from 16.05, in case it helps. I think one open loose end might be "int argc, char** argv" and some wide-char stuff needed for Windows?

It's great to see this solved. I'm not on Windows, so I can't easily test or make a PR, but is only a small "trivial" (or not) PR needed?

For clang I see:

The default C language standard is gnu17, except on PS4, where it is gnu99. [..] The default C++ language standard is gnu++14.

I believe that, as [clang] is invoked, you get the C compiler, i.e. C17 in the gnu variant. I don't believe it would help to invoke it with an option for C++ instead, or with the GNU (or non-GNU) variant of either, though it may be worth a try:

You can use Clang in C23 mode with the -std=c23 option (available in Clang 18 and later) or with the -std=c2x option (available in Clang 9 and later).

You can use Clang in C++2c mode with the -std=c++2c option.

[I don't see that 32-bit Windows is solved yet, and not a priority for anyone, it seems Clang_jll only supports 64-bit Windows.]

Thomas008 commented 5 months ago

The previous code generates a native object file with the help of GPUCompiler (and the LLVM package). Yes, after that, clang links this object file and wrapper.c. All I can say is that, at least on my Windows machine, writing strings to stdout or into a file does not work that way. When using clang to skip GPUCompiler and generate the executable file straight away, writing strings works.

Thomas008 commented 5 months ago

Here is a repo to play with. It contains the mentioned adaptations for Windows. Remark: it uses a locally installed clang, because the clang in the artifacts did not work in my installation.

Thomas008 commented 5 months ago

Just to be sure: Dicts do not yet work for StaticCompiler, right?

tshort commented 5 months ago

Correct. Dicts rely on allocation and GC. It may be possible to write your own variation of a Dict using the capabilities from StaticTools (that's an advanced job, I suspect).

Thomas008 commented 5 months ago

Another question (probably again belonging to StaticTools): StackArray or MallocArray is only possible on Int64 or Float64, but not possible on Symbols? E.g. a = StackArray{Symbol}(undef, 3) does not work, right?

brenhinkeller commented 4 months ago

Yes, StackArray and MallocArray only work on types for which Base.allocatedinline is true. You might be able to use a tuple though.
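The inline-allocation check can be tried directly in plain Julia. `Base.allocatedinline` is an internal, unexported function, so its availability and behavior may differ between Julia versions. Whether a tuple of Symbols then compiles under StaticCompiler is not verified here; the last lines only show the tuple itself.

```julia
# Query which element types are stored inline in arrays.
for T in (Int64, Float64, NTuple{3,Float64}, Symbol, String)
    println(rpad(string(T), 34), Base.allocatedinline(T))
end

# A tuple holds a fixed number of Symbols without needing an array.
labels = (:alpha, :beta, :gamma)
println(labels[2], " (", length(labels), " entries)")
```

Types reported as `false` are stored as pointers to separately allocated objects, which is why they do not fit into a `StackArray` or `MallocArray`.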