Open Quuxplusone opened 3 years ago
Bugzilla Link | PR51469 |
Status | NEW |
Importance | P enhancement |
Reported by | Ye Luo (xw111luoye@gmail.com) |
Reported on | 2021-08-12 19:54:07 -0700 |
Last modified on | 2021-08-16 05:58:21 -0700 |
Version | unspecified |
Hardware | PC Linux |
CC | huberjn@ornl.gov, jdoerfert@anl.gov, llvm-bugs@lists.llvm.org |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
The "clang-offload-bundler" tool will create an empty cubin file when "-allow-missing-bundles" is set. Unfortunately, nvlink doesn't accept empty files, so it returns an error. I'm not sure there's a reasonable solution to this besides stipulating that everything is always compiled with -fopenmp -fopenmp-targets to use offloading.
The offload-bundler tool is supposed to be mostly target-agnostic, so I don't think we could make it specially emit an empty but valid ELF file for the device. In the driver we set up a compilation using static information from the compiler invocation. This means we can't change the pipeline based on whether or not a certain binary was bundled. This would be a lot easier if nvlink accepted empty files as a no-op.
Thanks. Unfortunately, the behavior of nvlink is not something we control. Does the "clang-offload-bundler" tool have a way to signal later stages of the linking pipeline about missing bundles? For example, it could skip generating empty files, so that a missing file indicates a missing bundle. Later stages could then make decisions based on whether the files exist. Alternatively, the later stages could check the size of the cubin files before feeding them to nvlink.
The compilation pipeline is constructed before any of the steps are run, so there isn't a way to change it based on runtime information as far as I know. The only solution I can think of is to check for empty files before we run the executable in the toolchain, but I'm not sure that's a good solution because it might change the behaviour somewhere else.
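To make the "check for empty files before running the executable" idea concrete, here is a minimal sketch in Python. It is not how the clang driver is implemented; the helper names (`nonempty_files`, `build_device_link_command`) and the default `sm_70` architecture are assumptions made for illustration. The sketch only builds the nvlink command line and does not run it.

```python
import os


def nonempty_files(paths):
    """Return only the paths that exist and have a size greater than zero."""
    return [p for p in paths if os.path.isfile(p) and os.path.getsize(p) > 0]


def build_device_link_command(cubins, output, arch="sm_70"):
    """Build an nvlink command line that skips empty cubin files.

    Hypothetical helper: returns None when every input is empty, so the
    caller can treat the device link step as a no-op instead of handing
    nvlink a file it would reject.
    """
    inputs = nonempty_files(cubins)
    if not inputs:
        return None
    return ["nvlink", f"-arch={arch}", "-o", output, *inputs]
```

The drawback raised above still applies: this filter treats every zero-byte file as "no bundle", so a file that is empty because an earlier step failed would be skipped silently.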
I believe the handling of an empty file is generally undefined behavior. It may mean that a no-op is desired, or it may be a corrupted file that needs to be handled safely.
I'm not sure what the wider consensus on this is. Another solution would be to change the offload-bundler to have an option that reports whether a bundle is empty. Then we'd just execute it and wait for the result while building the pipeline. I think AMDGPU's pipeline uses a similar method to determine the correct architecture to use.
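The "query while building the pipeline" approach could look roughly like the sketch below. Everything here is hypothetical: `plan_device_steps` stands in for the driver's pipeline construction, and `is_empty` stands in for whatever query the bundler might offer (no such option is confirmed in this thread). The point is only that the decision is made at planning time, so the device link step is never scheduled for inputs that have no device bundle.

```python
def plan_device_steps(bundles, is_empty):
    """Plan device-side steps, skipping inputs whose device bundle is empty.

    `bundles` is a list of bundled object paths; `is_empty` is a callable
    (an assumed probe, e.g. a wrapper around a bundler query) that returns
    True when a file carries no device code.
    """
    steps = []
    linkable = []
    for bundle in bundles:
        if is_empty(bundle):
            continue  # nothing to unbundle or link for this input
        steps.append(("unbundle", bundle))
        linkable.append(bundle)
    # Only schedule the device link when there is something to link.
    if linkable:
        steps.append(("device-link", tuple(linkable)))
    return steps
```

For example, with inputs `a.o` and `b.o` where only `b.o` lacks a device bundle, the plan contains one unbundle step for `a.o` and one device link over `a.o`; if both lack bundles, the plan is empty.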