KhronosGroup / SPIRV-LLVM-Translator

A tool and a library for bi-directional translation between SPIR-V and LLVM IR
Other
469 stars 211 forks source link

Translation of BIFs from different frontends #206

Open AnastasiaStulova opened 5 years ago

AnastasiaStulova commented 5 years ago

Currently Translator uses IR metadata to detect the original source of translation module. Do we need any different solution?

AlexeySachkov commented 5 years ago

As @Naghasan pointed here we cannot rely on original source of translation module.

May be it is better to work on SPIR-V friendly LLVM IR to make it universal enough to handle OpenCL C, OpenCL C++, SYCL and different shader languages?

AnastasiaStulova commented 5 years ago

As @Naghasan pointed here we cannot rely on original source of translation module.

I actually mean the translation from LLVM IR to SPIR-V.

May be it is better to work on SPIR-V friendly LLVM IR to make it universal enough to handle OpenCL C, OpenCL C++, SYCL and different shader languages?

I think we need to see how it aligns with Clang CodeGen phase. Do you have concrete ideas how it could work?

Btw Clang IL was discussed on cfe-dev recently http://lists.llvm.org/pipermail/cfe-dev/2019-February/061450.html. I found it quite interesting and I was wondering if it can work for SPIR-V generation.

Naghasan commented 5 years ago

Btw Clang IL was discussed on cfe-dev recently http://lists.llvm.org/pipermail/cfe-dev/2019-February/061450.html. I found it quite interesting and I was wondering if it can work for SPIR-V generation.

I believe generating SPIR-V from Clang is what the DX Shader compiler is doing https://github.com/microsoft/DirectXShaderCompiler/tree/master/tools/clang/lib/SPIRV

I think we need to see how it aligns with Clang CodeGen phase.

I'm not sure if it is that interesting to be clang based rather than llvm based:

AnastasiaStulova commented 5 years ago

Input needed: understanding of current issues in the translator for translation of BIFs from various frontends.

AlexeySachkov commented 5 years ago

I actually mean the translation from LLVM IR to SPIR-V.

I agree that PR I linked might not be very relevant, but anyway we have several places checking LLVM IR metadata to understand source language and pre-process LLVM IR:

Input needed: understanding of current issues in the translator for translation of BIFs from various frontends.

Do you have concrete ideas how it could work?

I don't have any well-prepared design, but I could propose some ideas (which I hope could help):

  1. To simplify the translator we should have well-documented mapping from built-in function and type names to corresponding SPIR-V intrinsics.
    • Pretty simple translator, no more checks for source language
    • This will require actions from each front-end which wants to get its output translated to SPIR-V, but it can be implemented as thin adapter and shouldn't be impossible or very hard
    • I guess there are a lot of details in SPIR-V spec which need to be covered and have LLVM IR counterparts
  2. Add special @llvm.spirv.* intrinsics to LLVM IR to represent corresponding SPIR-V instructions
    • Pretty the same as previous
  3. Add extensible infrastructure to map from any vendor-/target-specific LLVM IR intrinsic to a SPIR-V functions:
    • Pseudo-code: OpCode OC = getKind(llvm::CallInst *CI); if (isValid(OC)) translateAs(OC, CI)
    • We could add some restrictions on getKind function that it cannot try to demangle something and must only handle intrinsics like @llvm.* to avoid OpenCL C built-in vs OpenCL C++ user's function mess
    • Every vendor could extend this function to tell translator how to handle it's own specific intrinsics
    • We could potentially generate such mapping semi-automatically from .td files from LLVM Targets
    • I think that this might be similar to translation between MLIR dialects, but I haven't seen MLIR tutorial from the latest EuroLLVM, cannot say specifically
AlexeySachkov commented 5 years ago

One more idea from @asavonic: we could use function attributes to specify how to translate particular calls, something like:

%0 = call i32 some_function(i32 %arg1) #0

attributes #0 = { "__spirv=OpExtInst(asin)" }
; calls that marked with such attributes should be translated to SPIR-V instructions which are encoded
; in the attribute

This solution allows us avoid modifying LLVM - each vendor can update it's front-end to emit additional attributes to enable translation BIFs from any language into SPIR-V

asavonic commented 5 years ago

One more idea from @asavonic: we could use function attributes to specify how to translate particular calls

It's better to add this attribute to a function declaration. Different calls of the same function should have the same SPIR-V translation, so putting an attribute for each call is redundant.

This solution allows us avoid modifying LLVM - each vendor can update it's front-end to emit additional attributes to enable translation BIFs from any language into SPIR-V

We probably need a new attribute in Clang to convey this information from a header file where a built-in is declared to LLVM IR.

AnastasiaStulova commented 4 years ago

We should separate 2 issues covered by this thread:

  1. Detecting builtin functions
  2. Recovering specific overload (i.e. demangling the name)

While (1) is needed for most of tool flows I believe, (2) might be limited to those that generate SPIR-V or need to recover source info for some other reasons.

Several solutions have been discussed and here are the summary of 2 that can solve both issue (1) and (2) at the frontend level allowing multiple frontends that generate LLVM IR target conversion to SPIR-V either through SPIRV-LLVM Translator or SPIR-V backend or even through clspv.

Solution 1.

Use of LLVM intrinsics instead of mangled functions. This will require a lot of builtin functions (~1000+) added to shared Clang/LLVM codebase and additionally results in:

A. Extensibility problem B. Require downstream changes for vendor builtins

Also it might not be very suitable for the flow that doesn't really target SPIR-V as a lot of builtins are implemented using regular functions. I assume we could workaround it by introducing the wrapper functions that would just call intrinsics from regular OpenCL builtins for SPIR-V but then we would have to maintain and link such separate wrapper libs in SPIR-V compilation flow. It could work but seems like large change overall and still has issues (A) and (B). Also we need to add SPIR-V target/backend first to be able to add instrinsics. So it will add some latency. Not clear how LLVM community will react on adding so many builtins either.

As for speed - extra builtins are going to appear indirectly in some switch/case statements that can result in extra time for checking the case statements. Mapping to intrinsics is trivial and shouldn't affect speed a lot. However linking to wrapper libraries is an extra overhead.

Solution 2.

Using attributes. We can introduce a source attribute (let's say we call it _openclbultin or even more generic _langbuiltin). This will then be processed by Clang CodeGen that can generate LLVM function attribute i.e. _openclbuiltin/_langbuiltin with original source function name emitted in a value of the attribute.

Example:

We could theoretically use metadata instead of attributes too if adding attribute would be an issue but it seems attributes are recommended way of adding meta information in LLVM - they incur less RT overhead and are guaranteed to be preserved by the transformation passes.

This is relatively small and easy change and requires less infrastructure change and maintenance compared to Solution 1. The attributes can be used directly by the translator or SPIR-V backend. Regarding the speed - this solution shouldn't affect compilation speed much because we only process functions that are being called in Clang's CodeGen. However it slightly increases the size of generated LLVM bc file due to more attributes emitted.

Here is the diff for the prototype of this solution in Clang:

diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index d9ca121b651..f8f289cb5bc 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -1518,6 +1518,12 @@ def Convergent : InheritableAttr {
 let Documentation = [ConvergentDocs];
 }

+def LangBuiltin : InheritableAttr {
+ let Spellings = [Clang<"lang_builtin">];
+ let Subjects = SubjectList<[Function]>;
+ let Documentation = [Undocumented];
+}
+
 def NoInline : InheritableAttr {
 let Spellings = [GCC<"noinline">, Declspec<"noinline">];
 let Subjects = SubjectList<[Function]>;
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index b49b194d611..fb896c5d00a 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -1893,6 +1893,8 @@ void CodeGenModule::ConstructAttributeList(
 llvm::for_each(NBA->builtinNames(), AddNoBuiltinAttr);
 }
 }
+ if (TargetDecl->hasAttr<LangBuiltinAttr>())
+ FuncAttrs.addAttribute("lang_builtin", Fn->getNameInfo().getAsString());
 }

 // 'const', 'pure' and 'noalias' attributed functions are also nounwind.
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 50951ad6022..2061cdd184b 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -7159,6 +7159,9 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D,
 case ParsedAttr::AT_Convergent:
 handleSimpleAttribute<ConvergentAttr>(S, D, AL);
 break;
+ case ParsedAttr::AT_LangBuiltin:
+ handleSimpleAttribute<LangBuiltinAttr>(S, D, AL);
+ break;
 case ParsedAttr::AT_NoInline:
 handleSimpleAttribute<NoInlineAttr>(S, D, AL);
 break;

After applying the diff it can be tested on the example:

1 void cos(float a) __attribute__((lang_builtin));
 2 void sin(float a) __attribute__((lang_builtin));
 3 
 4 kernel void k(){
 5 cos(1.2);
 6 }

where Clang will emit desired output:

define spir_kernel void @k() local_unnamed_addr #0 !kernel_arg_addr_space !3 !kernel_arg_access_qual !3 !kernel_arg_type !3 !kernel_arg_base_type !3 !kernel_arg_type_qual !3 {
entry:
 tail call void @_Z3cosf(float 0x3FF3333340000000) #2
 ret void
}
; Function Attrs: convergent
declare void @_Z3cosf(float) local_unnamed_addr #1

attributes #0 = \{ convergent nounwind "correctly-rounded-divide-sqrt-fp-math"="false" "denorms-are-zero"="false" "disable-tail-calls"="false" "frame-pointer"="none" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" "uniform-work-group-size"="true" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = \{ convergent "correctly-rounded-divide-sqrt-fp-math"="false" "denorms-are-zero"="false" "disable-tail-calls"="false" "frame-pointer"="none" "lang_builtin"="cos" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = \{ convergent nounwind "lang_builtin"="cos" }

Note that similar to convergent it seems to add the attribute not only to the declaration but also on a call. Not sure it's done to reduce the speed of looking up the attributes. Potentially we might not need it for this case.

Summary and comparison

Solution 1:

Pros:

Cons:

Solution 2:

Pros:

Cons:

mantognini commented 4 years ago

Solution 2 sounds the better option for Arm, as it requires a smaller patch and integrates well in our downstream codebase. The language agnostic lang_builtin sounds especially good for non-SPIR-V flows.

Has anyone concerns about this strategy?

AnastasiaStulova commented 4 years ago

I. It seems there is also signess information needed for the translation to SPIR-V from LLVM. This is needed in particular for

We could add a signess (from the first parameter?) into the generated LLVM attribute. However it seems too specific to the problem and therefore perhaps it shouldn't be proposed as a generic language attribute but rather OpenCL specific attribute.

II. Regarding the reverse conversion into LLVM, I am confused about whether we need to preserve the original mangled name of a builtin or we need to generate some neutral name that can be fixed up later depending on the language that is being used?

I now start to think that it might not be easy to get rid completely of a mangling problem but however I would at least like to solve the issue of detecting if something is a builtin function for C++ based frontends instead of replying on the fact that any mangled function is a builtin function.

StuartDBrady commented 4 years ago

There are several other OpenCL C built-in functions that would seem to require signedness information for their correct translation into SPIR-V ops. Note that some of these are translated into OpExtInst instructions using the OpenCL Extended Instruction Set ("OpenCL.std"). The following seem to me to differ in semantics depending on the signedness of their parameters:

In some cases, it seems fine for the round trip conversion from LLVM IR to SPIR-V to LLVM IR to change the mangling of OpenCL C builtin functions.

Going from LLVM IR -> SPIR-V, it seems some degree of flexibility will be required to support different frontends. Going from SPIR-V -> LLVM IR, I would agree that the original source language should not be a concern of the translator, so I would regard this as a namespacing problem. There is also a problem of mangling correctly for the target, but the two problems aren't entirely orthogonal.

Naghasan commented 4 years ago

[Rushed answer before the call]

LLVM intrinsic handles some kind of "overloading" (in quotes because it is not C++ overloading).

So for instance, for OpGroupAsyncCopy instrinsics we could have something like llvm.OpGroupAsyncCopy.p3i8.p1i8 and llvm.OpGroupAsyncCopy.p3i16.p1i16 that actually refers to the same intrinsics. But I'm unsure we can represent via the standard llvm "overloads" all the different types (i.e. what about signedness).

Also, llvm instrinsics are functions that start with "llvm.", they may have an enum attached to them if known to the llvm build, but that's optional (Function::isIntrinsic() can be true while Function::getIntrinsicID() == Intrinsic::not_intrinsic at the same time, https://llvm.org/doxygen/classllvm_1_1Function.html#a900a32da3983469187b1848189681705). This means, we don't need any modification of an upstream llvm, they do not even need to be described in a td file. I don't think this is a recommended way though (more on that below).

From the cons list

Large scale change

It is, but so is the solution 2 as it impacts the conversion, either way there is work on that side.

Also there is now tablegen file describing OpenCL buitins for this side of things (thanks ARM btw :+1: )

Maintenance burden

I don't see where the extra burden is.

Will require downstream changes for vendor builtins

I don't see why solution 1 or 2 would particularly protect downstream users.

Potential slows down compilation

if we add 1000+ sure, but that's not really the case.

Potentially need to wait for SPIR-V to become legal target in LLVM

I don't see why that would be the case

But overall, I don't think that using LLVM instrinsics is entirely adapted. But I have an hybrid solution.

This is interesting. I quite like the idea that basically any frontend just have to emit an extra information to id the builtin, this leave more freedom over how to express/implement this.

Although I will point out that this still heavily impact the translator, without solving the bidirectionality problem (may actually worsen it):

Let assume I have lib.spv and my_kernel.cpp which both uses a builtin A (A being the name in the spec).

By design, we are enforcing that this is never guarantee to work


llvm-spirv -r lib.spv -o lib.bc 

clang -emit-llvm my_kernel.cpp -o my_kernel.bc 

llvm-link my_kernel.bc  lib.bc -o my_kernel.bc 

builtin A now appears in 2 different forms in the same module.

My comment on the pros/cons

Small scale change

Scalable for new and vendor builtins

I don't what extra benefit this offer over the solution one in terms of scale/scalability, you still need lowering, translation, validation etc. This still heavily impact the translator.

Might be too specific with no use cases in LLVM upstream until we add SPIR-V backend.

Agreed-ish, front end want/need to emit kernel (C, C++, D, Julia, Rust, ) and they are not all clang based. I think this idea could help.

Regarding the reverse conversion into LLVM, I am confused about whether
we need to preserve the original mangled name of a builtin or we need to
generate some neutral name that can be fixed up later depending on
the language that is being used? 

See above for the why it is important to maintain names. I will also add this make approach like the libclc unusable/unscalable under such an environment. We have extended libclc to support SPIR-V builtins as outputed by the intel SYCL compiler (https://github.com/intel/llvm/tree/sycl/libclc/generic/libspirv) because the builtin translation is not stable, the use case above (basically linking a library) is unnecessarily hard.

that can be fixed up later depending on the language that is being used?

You cannot reliably infer what the source language is (if that's what you meant). But this could be trigger by a flag.


I will point out that the comment focus on OpenCL builtins exclusively and emitted from clang OpenCL derived approach. This is a SPIR-V translator... builtins should be about processing SPIR-V instruction represented as function calls. How an OpenCL builtin maps to a SPIR-V one is a totally different question (Note: I think the translator should support that, I'm only saying we should split the 2 questions).

It sounds to me that this projects more and more needs to strongly split the 2 jobs that it is doing (I'll be captain obvious here):

Naghasan commented 4 years ago

More on the builtin part:

Stable LLVM-SPIRV-canon representation

Revolving around the intrinsic idea, moving forward we could have the spir-v builtins represented as pseudo intrinsic: __spirv.<opname>.<ty1>.<ty2>, where <ty1> and <ty2> are types added to disambiguate overload. The advantage is this becomes more trivial:

Frontend SPIR-V builtin to LLVM-SPIRV-canon transformation

Now that we have what to convert to, we could reuse the attribute approach to drive a renaming of frontend function to the canonical one (I'm ignoring the sign problem for now):

define spir_kernel void @k() {
entry:
 tail call void @_Z3cosf(float 0x3FF3333340000000) #2
 tail call void @_Z7barrieriii(i32 2, i32 2, i32 0) #3
 ret void
}
; Function Attrs: convergent
declare void @_Z3cosf(float) local_unnamed_addr #1

declare void @_Z7barrieriii(i32, i32, i32) local_unnamed_addr #4

attributes #1 = { convergent "spv_builtin"="ocl_cos" }
attributes #2 = { convergent nounwind "spv_builtin"="ocl_cos" }
attributes #3 = { convergent nounwind "spv_builtin"="OpControlBarrier" }
attributes #4 = { convergent nounwind "spv_builtin"="OpControlBarrier" }

Basically, after the pass we would get something like:

define spir_kernel void @k() {
entry:
 ; could also be __spirv.ocl_cos.float to mach llvm types
 tail call void @"__spirv.ocl_cos.f32"(float 0x3FF3333340000000) #2
 ; no trailing types as there is no overloads
 tail call void @"__spirv.OpControlBarrier"(i32 2, i32 2, i32 0) #3
 ret void
}
; Function Attrs: convergent
declare void @"__spirv.ocl_cos.f32"(float) local_unnamed_addr #1

declare void @"__spirv.OpControlBarrier"(i32, i32, i32) local_unnamed_addr #4

attributes #1 = { convergent "spv_builtin"="ocl_cos" }
attributes #2 = { convergent nounwind "spv_builtin"="ocl_cos" }
attributes #3 = { convergent nounwind "spv_builtin"="OpControlBarrier" }
attributes #4 = { convergent nounwind "spv_builtin"="OpControlBarrier" }

Frontend OpenCL builtin to LLVM-SPIRV-canon transformation

Similar as before, but also perform the OpenCL to SPIR-V lowering. lang_builtin or opencl_builtin would be used to differentiate from the SPIR-V attribute.

bader commented 4 years ago

@Naghasan, I agree that LLVM-SPIRV-canon should avoid using Itanium ABI mangling and llvm intrisics is the community recommended way for extending the LLVM IR with SPIR-V operations.

I'm not convinced that "spv_builtin" attribute should be preferred to the clang built-ins, but this is not the major issue here.

I'd like to note that adding full set of SPIR-V instructions makes LLVM-SPIRV-canon ambiguous. Some SPIR-V opcodes mapped to LLVM instructions (e.g. add) or LLVM intrinsics (e.g. sqrt, see https://llvm.org/docs/LangRef.html#standard-c-library-intrinsics for more cases). We probably should document that LLVM standard instructions/intrinsics always win.

Naghasan commented 4 years ago

I'm not convinced that "spv_builtin" attribute should be preferred to the clang built-ins, but this is not the major issue here.

Yes, it was actually only to distinguish with the opencl builtins in the text above.

I'd like to note that adding full set of SPIR-V instructions makes LLVM-SPIRV-canon ambiguous. Some SPIR-V opcodes mapped to LLVM instructions (e.g. add) or LLVM intrinsics (e.g. sqrt, see https://llvm.org/docs/LangRef.html#standard-c-library-intrinsics for more cases). We probably should document that LLVM standard instructions/intrinsics always win.

Glad you point this out, I'm not advocating that we replace exisiting instruction/intrinsic. The translator should handle them naturally, this would be unusable otherwise.

AnastasiaStulova commented 4 years ago

@Naghasan I like the idea of SPIR-V canon representation. It is related to SPIR-V friendly format that we have started discussing some time ago. Can you please explain where would the transformation passes belong to in the tooling flow? I would particularly be interested in those that transform from OpenCL C to SPIR-V canon representation? Also while translating from SPIR-V I guess we will need to know the original source language information too i.e. assuming I translated to SPIR-V from SYCL source then if I want to convert back to LLVM IR for linking/optimizations etc I would need to get the mangled names of builtin functions right in order to be able to link them correctly in... or would we have to switch all the builtin functions to intrinsics or maintain two different flows with SPIR-V intrinsics and w/o?

I'm not convinced that "spv_builtin" attribute should be preferred to the clang built-ins, but this is not the major issue here.

Yes, it was actually only to distinguish with the opencl builtins in the text above.

@bader I would like us to take into consideration the overall tooling flow rather than individual components. Certainly, it wouldn't be good if by simplifying one tool we add a lot of extra complexity into another. Currently, we don't have a lot of Clang builtins for OpenCL. So if we are to add them it would be annoying to have them being duplicated just for one specific target. Clang bulitins have more complexity compared to just regular functions with attributes. They are also less friendly to the downstream implementations because they require in downstream modifications in the code base if new functionality is to be added. Functions with attributes can be easily used in the vendor headers should they decide to add their own extensions. No need to modify Clang code base.

Also I still don't understand entirely the flow - would we then have regular functions for non-SPIRV case and enable Clang builtins just for SPIR-V compilation? Would this mean that we have to maintain two flows in the frontend for adding OpenCL BIFs? Considering that SPIR-V is not known/valid target in LLVM project at the moment how do we go about proposing this alternative SPIR-V compilation flow?

AnastasiaStulova commented 4 years ago

Prototype solution for Clang frontend: https://github.com/intel/llvm/pull/1345 for feedback.

AnastasiaStulova commented 4 years ago

Prototype solution for Clang frontend: intel/llvm#1345 for feedback.

It seems a better way to add BIFs than using Clang builtins. However, I still have a number of concerns/unknowns.

  1. What is the overall flow of mapping BIFs from the source languages (e.g. OpenCL C) to SPIR-V operations?
  2. This still introduces an extra mechanism for handling OpenCL BIFs that is different to the conventional flow that exists now. This extra flow will have to be maintained and extended adding extra burden to the community. Downstream implementation will require intree modifications for adding vendor extensions.
  3. There will be an extra barrier of adding this upstream until we have SPIR-V target officially supported by the LLVM project.

Considering various concerns and goals may I suggest that for the time being the Translator will support two formats:

  1. Conventional OpenCL LLVM IR format generated from Clang for OpenCL C or C++ for OpenCL using mangling of BIFs. We can extend the Translator architecture so that it is extensible towards custom mangling schemes.
  2. SPIR-V cannon format that is to be generated by all new/other frontends.
Naghasan commented 4 years ago

It seems a better way to add BIFs than using Clang builtins. However, I still have a number of concerns/unknowns.

This PR is more a "distraction" IMO, I brought it for future ref to experiment with different bindings. That's one way to do it, I'm not advocating this should be the defacto way.

I like the idea of SPIR-V canon representation. It is related to SPIR-V friendly format that we have started discussing some time ago. Can you please explain where would the transformation passes belong to in the tooling flow?

We already have some kind of "canon" form, although not quite well documented. The current pipeline does that, but also interleaves with passes to lower OpenCL code. We may want to split things.

What would be the SPIR-V canon format ? I assume it would be a form for which a transformation LLVM -> SPIR-V -> LLVM yield the same LLVM module (with SPIR-V builtin mangled in the same way).

What is the overall flow of mapping BIFs from the source languages (e.g. OpenCL C) to SPIR-V operations?

If you emit OpenCL C builtins, then the passes for OpenCL should lower to SPIR-V canon form. An alternative would be the translator loading a library which implements OpenCL C builtins in terms of SPIR-V canon ones. But we kind of already have this already. The process could also be driven/assisted by lang_builtin attribute.

If the frontend emits SPIR-V builtins, then they should be emitted in canon format. We could also use the lang_builtin attribute to assist the process, the frontend emits the name it wants the only restriction would be that the types used in the interface should be the one requested by the canon form. If I update my previous example:

define spir_kernel void @k() {
entry:
 tail call void @_some_mangling_for_cos(float 0x3FF3333340000000) #2
 tail call void @_some_mangling_for_ControlBarrier(i32 2, i32 2, i32 0) #3
 ret void
}
; Function Attrs: convergent
declare void @_some_mangling_for_cos(float) local_unnamed_addr #1
declare void @_some_mangling_for_ControlBarrier(i32, i32, i32) local_unnamed_addr #4

attributes #1 = { convergent "lang_builtin"="ocl_cos" }
attributes #2 = { convergent nounwind "lang_builtin"="ocl_cos" }
attributes #3 = { convergent nounwind "lang_builtin"="OpControlBarrier" }
attributes #4 = { convergent nounwind "lang_builtin"="OpControlBarrier" }

A pass could rename into something like I suggested:

define spir_kernel void @k() {
entry:
 ; could also be __spirv.ocl_cos.float to mach llvm types
 tail call void @"__spirv.ocl_cos.f32"(float 0x3FF3333340000000) #2
 ; no trailing types as there is no overloads
 tail call void @"__spirv.OpControlBarrier"(i32 2, i32 2, i32 0) #3
 ret void
}
; Function Attrs: convergent
declare void @"__spirv.ocl_cos.f32"(float) local_unnamed_addr #1

declare void @"__spirv.OpControlBarrier"(i32, i32, i32) local_unnamed_addr #4

attributes #1 = { convergent "lang_builtin"="ocl_cos" }
attributes #2 = { convergent nounwind "lang_builtin"="ocl_cos" }
attributes #3 = { convergent nounwind "lang_builtin"="OpControlBarrier" }
attributes #4 = { convergent nounwind "lang_builtin"="OpControlBarrier" }

(Note: This is for illustration purposes I think it is better to avoid distraction on names)

This still introduces an extra mechanism for handling OpenCL BIFs that is different to the conventional flow that exists now. This extra flow will have to be maintained and extended adding extra burden to the community. Downstream implementation will require intree modifications for adding vendor extensions. There will be an extra barrier of adding this upstream until we have SPIR-V target officially supported by the LLVM project.

I'm not sure of what extra mechanism you are talking about. As for modification for vendor extensions, there is not necessarily a 1-1 mapping between OpenCL C builtins and SPIR-V ones.

I think it is important for the community to maintain that support (AFAIK, this is the only format you can get using an upstream clang ATM).

AnastasiaStulova commented 4 years ago

We already have some kind of "canon" form, although not quite well documented. The current pipeline does that, but also interleaves with passes to lower OpenCL code. We may want to split things.

Where is this pipeline?

If you emit OpenCL C builtins, then the passes for OpenCL should lower to SPIR-V canon form.

Where will these passes be added to?

An alternative would be the translator loading a library which implements OpenCL C builtins in terms of SPIR-V canon ones. But we kind of already have this already.

This could work too. Where do you think we should keep the libraries?

If the frontend emits SPIR-V builtins, then they should be emitted in canon format. We could also use the lang_builtin attribute to assist the process, the frontend emits the name it wants the only restriction would be that the types used in the interface should be the one requested by the canon form.

What about signness of types?

If I update my previous example:

define spir_kernel void @k() { entry: tail call void @_some_mangling_for_cos(float 0x3FF3333340000000) #2 tail call void @_some_mangling_for_ControlBarrier(i32 2, i32 2, i32 0) #3 ret void } ; Function Attrs: convergent declare void @_some_mangling_for_cos(float) local_unnamed_addr #1 declare void @_some_mangling_for_ControlBarrier(i32, i32, i32) local_unnamed_addr #4

attributes #1 = { convergent "lang_builtin"="ocl_cos" } attributes #2 = { convergent nounwind "lang_builtin"="ocl_cos" } attributes #3 = { convergent nounwind "lang_builtin"="OpControlBarrier" } attributes #4 = { convergent nounwind "lang_builtin"="OpControlBarrier" }

Is this proposal for upstream Clang or its fork? As for upstream it seems unnatural to me that the frontend like Clang that has no integration with SPIR-V would suddenly emit SPIR-V specific names. In fact, it feels unnatural to me that the frontend emits anything specific to an implementation of some external tool that isn't even an official target/backend of its compilation stack. If Clang starts emitting random things for every open-source tool that exists we will not go very far and it will be hard to work with its codebase.

LLVM project has its own IR. SPIR-V canon IR is not a canon IR for LLVM project. It is something that we need in order to use LLVM compiler with SPIR-V serialization format. For LLVM it should be absolutely transparent. If we add SPIR-V to LLVM as a target we can reconsider the situation, but I would still be against developing multiple mechanisms in the frontend dealing with language-specific details in different ways depending on the targets being compiled. It is just not a helpful flow for the reason I have stated earlier.

Naghasan commented 4 years ago

We already have some kind of "canon" form, although not quite well documented. The current pipeline does that, but also interleaves with passes to lower OpenCL code. We may want to split things.

Where is this pipeline?

Here https://github.com/KhronosGroup/SPIRV-LLVM-Translator/blob/master/lib/SPIRV/SPIRVWriter.cpp#L2507

If you emit OpenCL C builtins, then the passes for OpenCL should lower to SPIR-V canon form.

Where will these passes be added to?

To the pass manager, as done today.

An alternative would be the translator loading a library which implements OpenCL C builtins in terms of SPIR-V canon ones. But we kind of already have this already.

This could work too. Where do you think we should keep the libraries?

In this project ? It would make sense IMO as people use this to do build OpenCL -> SPIR-V tools.

But I forgot to mention, you loose the bi-directional for OpenCL kernel. It is ok if we want to stick to bi-dir for the SPIR-V canon format. But it is something important to mention.

If the frontend emits SPIR-V builtins, then they should be emitted in canon format. We could also use the lang_builtin attribute to assist the process, the frontend emits the name it wants the only restriction would be that the types used in the interface should be the one requested by the canon form.

What about signness of types?

It only matters for OpenCL C builtins, SPIR-V builtins (instructions) carries this information already, for instance the upsample in OpenCL C translates to s_upsample or u_upsample in SPIR-V depending on the signness.

For OpenCL C builtins, I'm unsure on what to do.

If I update my previous example: define spir_kernel void @ k() { entry: tail call void @_some_mangling_for_cos(float 0x3FF3333340000000) #2 tail call void @_some_mangling_for_ControlBarrier(i32 2, i32 2, i32 0) #3 ret void } ; Function Attrs: convergent declare void @_some_mangling_for_cos(float) local_unnamed_addr #1 declare void @_some_mangling_for_ControlBarrier(i32, i32, i32) local_unnamed_addr #4 attributes #1 = { convergent "lang_builtin"="ocl_cos" } attributes #2 = { convergent nounwind "lang_builtin"="ocl_cos" } attributes #3 = { convergent nounwind "lang_builtin"="OpControlBarrier" } attributes #4 = { convergent nounwind "lang_builtin"="OpControlBarrier" }

Is this proposal for upstream Clang or its fork? As for upstream it seems unnatural to me that the frontend like Clang that has no integration with SPIR-V would suddenly emit SPIR-V specific names. In fact, it feels unnatural to me that the frontend emits anything specific to an implementation of some external tool that isn't even an official target/backend of its compilation stack. If Clang starts emitting random things for every open-source tool that exists we will not go very far and it will be hard to work with its codebase.

I'm only using your proposal here (solution 2)

SPIR-V canon IR is not a canon IR for LLVM project. It is something that we need in order to use LLVM compiler with SPIR-V serialization format. For LLVM it should be absolutely transparent.

But it remains LLVM IR that does not requires extension. The canon form is just a form that the translator translate to/from SPIR-V and yield the same result.

I would still be against developing multiple mechanisms in the frontend dealing with language-specific details in different ways depending on the targets being compiled.

Again, I'm not advocating this should be the defacto binding. I mention it in the case it could help to develop/test something else. I agree with you, assuming a fork of clang for this project is not helpful, I believe it is actually really bad.

AnastasiaStulova commented 4 years ago
    If you emit OpenCL C builtins, then the passes for OpenCL should lower to SPIR-V canon form.

Where will these passes be added to?

To the pass manager, as done today.

Pass manager of upstream LLVM?

    An alternative would be the translator loading a library which implements OpenCL C builtins in terms of SPIR-V canon ones. But we kind of already have this already.

This could work too. Where do you think we should keep the libraries?

In this project ? It would make sense IMO as people use this to do build OpenCL -> SPIR-V tools.

Seems very sensible.

But I forgot to mention, you loose the bi-directional for OpenCL kernel. It is ok if we want to stick to bi-dir for the SPIR-V canon format. But it is something important to mention.

Do you mean that if we translate to SPIR-V canon and back we can't recover original IR emitted by Clang frontend? In the above comment I suggest to keep the current OpenCL flow as is until we find a flow that works well for every case.

    If I update my previous example:
    define spir_kernel void @ k() {
    entry:
    tail call void @_some_mangling_for_cos(float 0x3FF3333340000000) #2
    tail call void @_some_mangling_for_ControlBarrier(i32 2, i32 2, i32 0) #3
    ret void
    }
    ; Function Attrs: convergent
    declare void @_some_mangling_for_cos(float) local_unnamed_addr #1
    declare void @_some_mangling_for_ControlBarrier(i32, i32, i32) local_unnamed_addr #4
    attributes #1 = { convergent "lang_builtin"="ocl_cos" }
    attributes #2 = { convergent nounwind "lang_builtin"="ocl_cos" }
    attributes #3 = { convergent nounwind "lang_builtin"="OpControlBarrier" }
    attributes #4 = { convergent nounwind "lang_builtin"="OpControlBarrier" }

Is this proposal for upstream Clang or its fork? As for upstream it seems unnatural to me that the frontend like Clang that has no integration with SPIR-V would suddenly emit SPIR-V specific names. In fact, it feels unnatural to me that the frontend emits anything specific to an implementation of some external tool that isn't even an official target/backend of its compilation stack. If Clang starts emitting random things for every open-source tool that exists we will not go very far and it will be hard to work with its codebase.

I'm only using your proposal here (solution 2)

Not really. I don't suggest to add anything SPIR-V specific at all. I was suggesting to emit the original source name of the function w/o mangling. However, as it turns out name only is insufficient for the translation because we also need a sign information in some cases.

SPIR-V canon IR is not a canon IR for LLVM project. It is something that we need in order to use LLVM compiler with SPIR-V serialization format. For LLVM it should be absolutely transparent.

But it remains LLVM IR that does not requires extension. The canon form is just a form that the translator translate to/from SPIR-V and yield the same result.

Ok, but it has intrinsics that are not part of LLVM project and don't belong to any upstream targets.

Naghasan commented 4 years ago
    If you emit OpenCL C builtins, then the passes for OpenCL should lower to SPIR-V canon form.

Where will these passes be added to?

To the pass manager, as done today.

Pass manager of upstream LLVM?

Yes, as shown in the code linked above.

we can't recover original IR emitted by Clang frontend?

In the general case this is not achievable at all. Simple examples: ConstExpr needs to be transformed into instructions, some attributes have disappear (no mapping to SPIR-V) or some intrinsic don't mach what exist in SPIR-V. Unless we do hacks and start to hide behind debug/non semantic instructions to preserve all of it, we can't promise that.

SPIR-V was design so that SPIR-V -> LLVM -> SPIR-V could be done without any lose of information, not LLVM -> SPIR-V -> LLVM.

In the above comment I suggest to keep the current OpenCL flow as is until we find a flow that works well for every case.

For the specific of OpenCL C builtins, loading the library means we implement the builtins in terms of SPIR-V ones, for instance something like that for barrier:

void barrier(int flag) {
  __spirv_ControlBarrier(scope_flag, scope_flag, memsementic_flag);
}

But now you don't have the invert function, that's what I was pointing out.

Anyway, we have something that works for now for OpenCL C builtins, should keep it :)

Not really. I don't suggest to add anything SPIR-V specific at all.

I'm not either, I'm only using your attribute proposal:

Solution 2.

Using attributes. We can introduce a source attribute (let's say we call it opencl_bultin or even more generic lang_builtin). This will then be processed by Clang CodeGen that can generate LLVM function attribute i.e. opencl_builtin/lang_builtin with original source function name emitted in a value of the attribute.

I'm only using it in the context of SPIR-V. The translator would only process the only it recognizes.

I was suggesting to emit the original source name of the function w/o mangling. However, as it turns out name only is insufficient for the translation because we also need a sign information in some cases.

As you need to have overloading for most of the cases, that's impossible to achieve without some form of mangling.

Ok, but it has intrinsics that are not part of LLVM project and don't belong to any upstream targets.

I haven't used any llvm instrinsics

AnastasiaStulova commented 4 years ago
Not really. I don't suggest to add anything SPIR-V specific at all.

I'm not either, I'm only using your attribute proposal:

Solution 2.

Using attributes. We can introduce a source attribute (let's say we call it opencl_bultin or even more generic lang_builtin). This will then be processed by Clang CodeGen that can generate

LLVM function attribute i.e. opencl_builtin/lang_builtin with original source function name emitted in a value of the attribute.

I'm only using it in the context of SPIR-V. The translator would only process the only it recognizes.

So where do you suggest to produce the following output?

    attributes #1 = { convergent "lang_builtin"="ocl_cos" }
    attributes #2 = { convergent nounwind "lang_builtin"="ocl_cos" }
    attributes #3 = { convergent nounwind "lang_builtin"="OpControlBarrier" }
    attributes #4 = { convergent nounwind "lang_builtin"="OpControlBarrier" }
AnastasiaStulova commented 4 years ago

Discussed on the call Jun 18:

Note that we also need type or at least sign-ness of function parameters, that can be added by extending the above attribute or adding an extra attribute. More discussion is needed on this topic.

Naghasan commented 3 years ago

@AnastasiaStulova I'm answering here for you question https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/746#issuecomment-702130887 (to move the discussion outside the PR)

Using OpSource would make the translator behave differently depending on the value which goes against the spec. And in this case, a non-semantic op code would dictate if a module can be consumed or not.

Can you elaborate on this, please? I never understood why this ended up this way in the spec. It seems something that could help with tooling. I don't entirely understand the point of adding something that should not be used.

Because it is an IR, only the SPIR-V spec and OpenCL SPIR-V environment matters here. The language used to produce it doesn't help outside specific cases like debuggers (that was the intent I believe). But having a non-semantic instruction affecting the internal representation of a compiler would be rather odd as the semantic does not change.

Bottom line is, SPIR-V is agnostic w.r.t. the source language. Only the SPIR-V core and SPIR-V environment specification matters. It can produced using from SYCL, OpenCL C, OpenCL C++, C++ for OpenCL, Rust, D, , but this will never affect how to interpret the SPIR-V opcodes.

This is important especially for this point:

if I want to convert back to LLVM IR for linking/optimizations etc

I should be able to do this if one SPIR-V module is produce from SYCL and another from OpenCL C. For the same reason I can write half a program in C and the other in FORTRAN. Having a stable internal representation of the builtin regardless of the source is critical for this. That the translator offers different way to represent them is somehow orthogonal.

For instance, I may choose to build my tool chain around SPIR-V friendly functions for my builtins. In this case, I will probably want to use a preprocessing pass that lowers OpenCL builtins to the SPIR-V friendly ones so I get the same view. But I could choose to do the opposite well.

AnastasiaStulova commented 3 years ago

Having a stable internal representation of the builtin regardless of the source is critical for this.

But there is a stable representation of builtins in SPIR-V isn't it?

What we are discussing here is not a SPIR-V format itself but a conversion between two different IR formats. LLVM IR doesn't and won't have OpenCL builtin functions natively. We have to accept this as a fact. And therefore there will be different ways to represent the builtins in the LLVM IR. I understand that the translation could be simplified if all frontends used a unified IR format. But frontends might not be able to produce the unified format natively so the conversion from their LLVM IR into a translator LLVM IR format would have to happen somewhere anyway.

It seems that the original architecture of the translator was made to translate between LLVM IR produced from OpenCL C into SPIR-V and back but this architecture evolved organically and seems to be no longer suitable for SYCL or potentially other languages/frontends? However, I think we haven't had an alternative proposal that would align fully with the rest of the open-source tooling components in the stack. I feel this just need a volunteer to drive things forward.

AlexeySotkin commented 3 years ago

That the translator offers different way to represent them is somehow orthogonal.

I don't fully understand it, could you elaborate?

It seems that the original architecture of the translator was made to translate between LLVM IR produced from OpenCL C into SPIR-V and back but this architecture evolved organically and seems to be no longer suitable for SYCL or potentially other languages/frontends?

Right.

However, I think we haven't had an alternative proposal that would align fully with the rest of the open-source tooling components in the stack.

I'm not sure it is possible to have a proposal which will be fully aligned with the rest of the stack, probably some other components will have to adjust too.

AnastasiaStulova commented 3 years ago

I'm not sure it is possible to have a proposal which will be fully aligned with the rest of the stack, probably some other components will have to adjust too.

Sure, adjustments are expected as soon as they are reasonable!

Naghasan commented 3 years ago

Having a stable internal representation of the builtin regardless of the source is critical for this.

But there is a stable representation of builtins in SPIR-V isn't it?

Yes, my comment is in relation to the usage of OpSource to tell which representation to use. That's how I understood the comment in the PR, so my comment is in relation to this. Either is fine, and there is workarounds that can be put in place to keep using the OpenCL based builtins to avoid the issue in PR 746.

I'm not sure it is possible to have a proposal which will be fully aligned with the rest of the stack, probably some other components will have to adjust too.

Sure, adjustments are expected as soon as they are reasonable!

Yes, no magic bullets.