Open Quuxplusone opened 7 years ago
Bugzilla Link | PR31041 |
Status | NEW |
Importance | P normal |
Reported by | Justin Lebar (:jlebar) (justin.lebar@gmail.com) |
Reported on | 2016-11-16 19:55:52 -0800 |
Last modified on | 2016-11-17 13:11:57 -0800 |
Version | unspecified |
Hardware | PC Linux |
CC | hfinkel@anl.gov, llvm-bugs@lists.llvm.org, sfantao@us.ibm.com, tra@google.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
I think that depends on the behavior you want for CUDA. If you do:
$ echo | llvm-run clang++ -E -x cuda - -o -
a user would expect both host and device code to be generated, I think. However,
in CUDA that only happens when you reach the injection phase. Based on that,
and to keep the same behavior, I think one should do 1), i.e. emit a better
diagnostic and suggest the options --cuda-host-only and --cuda-device-only to the user.
For OpenMP, this is not much of a problem, given that human-readable files are
bundled and can be used seamlessly in separate compilation. You could leverage
that feature for CUDA too if that would be useful.
Thanks,
Samuel
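
To make the suggested workaround concrete, here is a minimal dry-run sketch that only prints one preprocessing command per side rather than running the compiler. The file names input.cu, input-host.ii and input-device.ii are hypothetical, and the sketch assumes a single GPU architecture so that each invocation has exactly one output.

```shell
# Dry run: build one -E command per side instead of a single ambiguous
# `-E -o` invocation. File names below are hypothetical placeholders.
cmds=""
for side in host device; do
  cmd="clang++ -E -x cuda --cuda-${side}-only input.cu -o input-${side}.ii"
  cmds="${cmds}${cmd}
"
  echo "$cmd"
done
```

Because each command performs only one compilation, an explicit -o names exactly one output and nothing gets overwritten.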
The compiler *does* produce multiple outputs for -E, which you can verify with -###
or by observing the preprocessor output itself.
You've correctly inferred that host-side preprocessing happened, but device-side
preprocessing happened as well. The reason you don't see a second XXX is that you've
given a pipe as the input, and everything you echoed got consumed by the host
compilation. That will cause trouble whenever the input can't be consumed more than once.
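
The pipe effect described above can be reproduced without clang. In this minimal sketch two `cat` invocations stand in for the host and device preprocessing passes; that substitution is an assumption made for illustration, since the real driver runs separate clang jobs.

```shell
# Two readers share one pipe on stdin; each `cat` stands in for one pass.
out=$(printf 'XXX\n' | {
  host=$(cat)     # first reader drains the pipe until EOF
  device=$(cat)   # second reader hits EOF immediately and gets nothing
  printf 'host=[%s] device=[%s]\n' "$host" "$device"
})
echo "$out"   # prints: host=[XXX] device=[]
```

Reading from a regular file instead of a pipe avoids this, because each pass can open the file again from the start.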
The reasoning behind the current behavior:
An explicitly specified -o FOO implies that the output will be stored in a file
named exactly FOO. In the case of CUDA, which does more than one compilation under
the hood, -o may be ambiguous, depending on where in the pipeline you specify
it. I.e. -o FFF for the final object is OK. -o FOO for assembler or
preprocessor output is not, because you will get different output on the host and
device sides and one would clobber the other.
If -o is not specified, the driver is free to generate whatever names it wants, and
thus we're not constrained to one explicitly named output.
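
The clobbering argument can be sketched with plain file writes. `fake_preprocess` is a hypothetical stand-in for one compilation job, the out-*.ii names are invented and are not the names the driver actually picks, and which side runs last is an assumption.

```shell
# Hypothetical stand-in for one preprocessing job: writes text tagged with its side.
fake_preprocess() {  # usage: fake_preprocess SIDE OUTFILE
  printf 'preprocessed for %s\n' "$1" > "$2"
}

tmp=$(mktemp -d)

# One explicit name: both jobs write to FOO, and the later job overwrites the earlier.
fake_preprocess host "$tmp/FOO"
fake_preprocess device "$tmp/FOO"
clobbered=$(cat "$tmp/FOO")

# Driver-chosen names: each job gets its own file, so both outputs survive.
fake_preprocess host "$tmp/out-host.ii"
fake_preprocess device "$tmp/out-device.ii"
kept=$(ls "$tmp"/out-* | wc -l | tr -d ' ')

echo "FOO contains: $clobbered"
echo "separately named outputs: $kept"
rm -rf "$tmp"
```

Only the output of the last job remains in FOO, while the separately named files keep both results.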
> The reason you don't see second XXX is that you've given the pipe as an input and everything you've echoed got consumed by host compilation.
Ah, okay. I've verified this is right.
> If -o is not specified, driver is free to generate whatever name it wants and thus we're not constrained by one-explicitly-named-output.
Okay, and with -S without -o, we do generate both host and device assembly files. I was seeing only one for the same reason we were getting only one with -E: I was piping the input.
I would still like to improve the error message here, because I've now had users ask me about this on two separate occasions. But I guess the behavior makes sense.