Open Jianqoq opened 10 months ago
In my ops folder, I use macros to implement operations for my struct across a bunch of different types, from bool to f64. If I remove most of the operations and leave only around 5, such as Add, Sub, Div, Shr, etc., it compiles. However, if I add one more operation, a link error occurs.
What did you compile to produce this error? What version of Rust did you use?
"rust-analyzer.linkedProjects": [
".\\utensor_main\\Cargo.toml",
],
"rust-analyzer.runnables.extraEnv": {
"RUSTFLAGS": "-C target-feature=+avx2 -C target-feature=+fma -C incremental=true -C opt-level=0"
},
"rust-analyzer.server.extraEnv": {
"RUSTFLAGS": "-C target-feature=+avx2 -C target-feature=+fma -C incremental=true -C opt-level=0"
},
I was using rust-analyzer to compile it with the arguments shown above. The version is stable 1.74, but I also tried 1.76.0 nightly; neither of them can compile my program in debug mode.
[package]
name = "Utensor"
version = "0.1.0"
edition = "2021"

[wordspace]
members = [
    "utensor-main",
    "utensor-macros",
]

[dependencies]
utensor_macros = { path = "utensor_macros" }
utensor_main = { path = "utensor_main" }

[[bin]]
name = "utensor"
path = "utensor_main/src/main.rs"
This is my workspace Cargo.toml. I can see that most of the linking errors are about iterators, but I am not sure why release mode compiles while debug mode does not.
Can you point us to the source code you're compiling? If one of us can compile it there's a much better chance of figuring out what's going on here.
https://github.com/Jianqoq/utensor-rs, this is the source code
I can reproduce the error on Windows. Trying Linux...
Thank you! Hopefully we can get it fixed :)
Hm, seems to manage to build on Linux which is unfortunate because I have almost no skill with Windows tooling. The build uses a pretty incredible amount of memory, so my first guess was that the object file is getting truncated or otherwise mangled because it's run over some limit in the linker or LLVM or in rustc.
So I think it's possible this is a limitation of the Windows linker, or the code that breaks rustc/LLVM is gated behind `cfg(windows)`, perhaps in a dependency.
I did get this error when I switched from nightly to stable which suggests something is wrong:
error[E0514]: found crate `utensor_main` compiled by an incompatible version of rustc
--> utensor_main/src/main.rs:4:5
|
4 | use utensor_main::rayon::iter::ParallelIterator;
| ^^^^^^^^^^^^
|
= note: the following crate versions were found:
crate `utensor_main` compiled by rustc 1.76.0-nightly (6a6287132 2023-12-17): /tmp/utensor-rs/target/release/deps/libutensor_main-c9123daf0cdb3b52.rlib
= help: please recompile that crate using this compiler (rustc 1.76.0-nightly (3f28fe133 2023-12-18)) (consider running `cargo clean` first)
That is also what I suspect: the memory problem. This build generates, I think, at least 1000 operation templates, for example bool + bool = bool, f32 + f64 = f64, and so on. Generally, each operation needs around 150 generated functions. Since there are not many operations (only add, sub, mul, div, rem, shl, shr, etc.), there shouldn't be that many. But since everything has to be implemented in the same crate, the amount of work might be huge. Do you think it is solvable?
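To make the scale concrete, here is a minimal sketch of the kind of macro described above. It is not the code from the repository: `Tensor`, `impl_add!`, and the chosen type triples are hypothetical stand-ins used only for illustration. Each (left, right, output) triple expands to a separate `Add` impl, so covering every pair of element types for every operator multiplies quickly.

```rust
use std::ops::Add;

// Hypothetical stand-in for the struct in the report; not the real type.
#[derive(Debug, Clone, PartialEq)]
struct Tensor<T> {
    data: Vec<T>,
}

// Expands to one `impl Add<Tensor<$rhs>> for Tensor<$lhs>` per
// (lhs, rhs, out) triple, with `Tensor<$out>` as the result type.
macro_rules! impl_add {
    ($(($lhs:ty, $rhs:ty, $out:ty)),* $(,)?) => {
        $(
            impl Add<Tensor<$rhs>> for Tensor<$lhs> {
                type Output = Tensor<$out>;

                fn add(self, rhs: Tensor<$rhs>) -> Self::Output {
                    // Cast both operands to the output type, then add
                    // element by element.
                    let data = self
                        .data
                        .into_iter()
                        .zip(rhs.data)
                        .map(|(a, b)| a as $out + b as $out)
                        .collect();
                    Tensor { data }
                }
            }
        )*
    };
}

// Only four triples here; a full type matrix would list every pair.
impl_add! {
    (f32, f32, f32),
    (f32, f64, f64),
    (f64, f32, f64),
    (i32, f64, f64),
}
```

With this in place, `Tensor { data: vec![1.0f32] } + Tensor { data: vec![0.5f64] }` type-checks and yields a `Tensor<f64>`. Every extra triple and every extra operator adds more monomorphized code and more symbols to the crate's rlib.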
I think this is a `link.exe` limitation or bug. LLVM's linker LLD works just fine. I really don't know how to use Windows; normally I'd configure the linker with an environment variable, but I can also tell that `cargo rustc -- -Clinker=lld-link.exe` will build just fine. Any other way of setting your linker to `lld-link.exe` should be able to build.
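One other way of setting the linker is through Cargo's configuration file. The fragment below is a sketch under two assumptions that are not stated in the thread: that the target triple is `x86_64-pc-windows-msvc`, and that `lld-link.exe` is on `PATH`.

```toml
# .cargo/config.toml at the workspace root
# Assumed target triple; adjust it if the toolchain targets something else.
[target.x86_64-pc-windows-msvc]
linker = "lld-link.exe"
```

This applies the linker choice to every `cargo build` in the workspace, without passing `-Clinker` on each invocation.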
Library files are a type of archive containing many object files. The first file in the archive is a directory of public symbols that's just called `/`. However, the library file in the OP starts with a file called `/SYM64/` which is not understood by MSVC. It is by LLVM which is why using lld-link works.
So the issue is that rustc is producing libraries in a format that MSVC does not understand and then passing them to the MSVC linker. Note that `/SYM64/` is an extension that allows for a larger symbol table which is likely the reason it's being used.
I'm pretty sure this is a compiler bug.
Oh, the problem with `/SYM64/` in msvc rlibs has come up before in #88351. cc @ehuss, do you happen to have any more context here?
Sorry, I have not tested 64-bit archive lookup tables with link.exe. The `/` entry fundamentally only supports 32-bit offsets, so when the file size exceeds 4GB, it can only use `/SYM64/`.
> which is not understood by MSVC
Can you say how you determined that is the issue?
If link.exe does not support 64-bit archive lookup tables, then I don't think there is anything rustc can do here?
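The 32-bit offset limit described above can be sketched as a simple check. This is a hypothetical helper for illustration, not the logic used by rustc or LLVM: `SymbolTableKind` and `symbol_table_kind` are made-up names, and a real archive writer may pick its threshold differently.

```rust
/// Which symbol-table member an archive would need (illustrative only).
#[derive(Debug, PartialEq)]
enum SymbolTableKind {
    /// The classic `/` member, whose offsets are 32 bits wide.
    Classic,
    /// The `/SYM64/` member, whose offsets are 64 bits wide.
    Sym64,
}

/// Hypothetical helper: if every member offset fits in a `u32`, the
/// classic table is enough; otherwise only the 64-bit table can
/// represent the offsets.
fn symbol_table_kind(member_offsets: &[u64]) -> SymbolTableKind {
    let all_fit = member_offsets
        .iter()
        .all(|&offset| u32::try_from(offset).is_ok());
    if all_fit {
        SymbolTableKind::Classic
    } else {
        SymbolTableKind::Sym64
    }
}
```

An offset of `1 << 32` (4 GiB) is the first value that no longer fits, which matches the 4GB boundary mentioned in the comment.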
Well it can, for example, use two archives instead of one, no? Unless there's really no way to arrange things to fit, in which case I suppose it would have to error.
> Can you say how you determined that is the issue?
By looking at the rlib that failed in a hex editor. See also Archive (Library) File Format.
> Well it can, for example, use two archives instead of one, no?
Cargo and basically every other build tool expect only a single file to be emitted, `--emit link=/path/to/rlib` only accepts a single output file and `--extern` only accepts a single input file per crate. And finally given a single object file with enough large symbols the symbol table itself can exceed 4GB and thus push the object file beyond 4GB. In other words it would be possible but require a large coordinated effort across all build tools that support cargo to support this, some of which only support a single output file produced by a build step.
That's only true for rlibs. What matters here is only what's being sent to the linker, not the format of rlibs in general (which iirc are completely undocumented).
Yes it's convenient if rlibs are directly compatible with the linker but if that's not possible a conversion step can happen completely invisibly when it's time to link.
I guess. Wouldn't work for staticlibs though.
> By looking at the rlib that failed in a hex editor. See also Archive (Library) File Format.
My question was more of how you determined that not having a `/` entry was the issue? The linker error only says `LNK4003: invalid library format; library ignored`. I'm wondering if there is some other issue, perhaps it just doesn't like some other aspect of the archive format. For example, maybe the format of the `/SYM64/` is missing something, or it is expecting a `//` entry, or something else.
As you can see from Archive (Library) File Format (linked earlier), in msvc a lib file must start with `!<arch>\n`. This is then directly followed by the header of what it calls the "first linker member". The "Name" field is the first field in the header and we're told that the "name of the first linker member is `/`". The name field is padded with spaces to 16 bytes, so that becomes `/` followed by 15 spaces. So putting that together, a valid msvc lib must start with:
Text | Hex |
---|---|
!<arch>\n/ | 21 3C 61 72 63 68 3E 0A 2F 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |
Opening up the "invalid" lib file in a hex editor I see
Text | Hex |
---|---|
!<arch>\n/SYM64/ | 21 3C 61 72 63 68 3E 0A 2F 53 59 4D 36 34 2F 20 20 20 20 20 20 20 20 20 |
The first member is not called `/` so it's not valid according to the spec.
Might there be other problems with the `/SYM64/` table? That's pretty irrelevant given the lack of support for it in the first place.
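The hex-editor check described above can also be done programmatically. The following is a rough sketch, assuming only the layout quoted in this thread (an 8-byte `!<arch>\n` signature followed by a 16-byte, space-padded name field); `first_member_name` and `starts_with_classic_symbol_table` are hypothetical helper names, not part of any existing tool.

```rust
/// Signature that every archive (.lib / .rlib) file starts with.
const MAGIC: &[u8; 8] = b"!<arch>\n";

/// Returns the name of the first archive member with its space padding
/// removed, or `None` if the data is too short or lacks the signature.
fn first_member_name(bytes: &[u8]) -> Option<String> {
    let rest = bytes.strip_prefix(&MAGIC[..])?;
    // The name is the first 16 bytes of the member header.
    let name_field = rest.get(..16)?;
    let name = String::from_utf8_lossy(name_field);
    Some(name.trim_end_matches(' ').to_string())
}

/// True if the archive begins with the classic `/` symbol table, which
/// is what the thread says link.exe expects.
fn starts_with_classic_symbol_table(bytes: &[u8]) -> bool {
    first_member_name(bytes).as_deref() == Some("/")
}
```

Passing the first 24 bytes of an rlib (for example, read with `std::fs::read`) to `first_member_name` shows whether the archive starts with `/` or with `/SYM64/`.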
Oh, I see! Thanks! Yea, searching around it seems that 64-bit archives simply aren't supported. 😦
My program compiles when I use --release, but without it, the build fails.
Platform: Windows 11