tensorflow / mlir

"Multi-Level Intermediate Representation" Compiler Infrastructure
1.74k stars 259 forks source link

[spirv] Add pass to decorate types with layout info. #156

Closed denis0x0D closed 5 years ago

denis0x0D commented 5 years ago

This PR adds a pass to decorate SPIR-V types with layout info https://github.com/tensorflow/mlir/issues/128 Current implementaton decorates type of spirv::GlobalVariableOp if type is !spv.ptr<!spv.struct<...>> in strorage class StorageBuffer/Uniform/PushConstants and does not have layout info. Also it changes the type for the direct user of spirv::GlobalVariableOp such as spv.AddressOfOp. It does not change the type for the indirect users such as Load, Store, FunctionCall and so on.

@antiagainst @MaheshRavishankar can you please take a look? Thanks.

joker-eph commented 5 years ago

It seemed that in the bug @antiagainst was in the opinion that layout should always be present and it is the responsibility of the lowering to make sure it happens? (and so there would be a verifier for these to be there)

MaheshRavishankar commented 5 years ago

It seemed that in the bug @antiagainst was in the opinion that layout should always be present and it is the responsibility of the lowering to make sure it happens? (and so there would be a verifier for these to be there)

Reading the bug, the statement is that instead of having the burden on every lowering, there could be a pass that could assign layout. AFAICS, when the layout assignment happens is not directly related to verification. The verification can still check that the layout is specified. Lowering infrastructures just need to make sure this pass is run before the verification is run. IMO it is probably not worth having this in the verification itself, but rather as a check in the serialization. Though, having it in the verification is also OK.

joker-eph commented 5 years ago

I'm am still confused: the comment in the bug I'm referring to is:

During the lowering, we also need to generate properly laid out spv.struct types for global resources. This issue focuses on the later. (Although I think the layout calculation rule can be shared. :)

Is this what you're referring to? It seems to indicate that the layout will happen during the lowering, but this is different from having a separate pass. I understand this as the need to primitives/utility functions that can be involved during lowering to apply the layout rules.

We fundamentally can't have the lowering emitting invalid IR and a pass that operates on invalid IR.

MaheshRavishankar commented 5 years ago

I'm am still confused: the comment in the bug I'm referring to is:

During the lowering, we also need to generate properly laid out spv.struct types for global resources. This issue focuses on the later. (Although I think the layout calculation rule can be shared. :)

Is this what you're referring to? It seems to indicate that the layout will happen during the lowering, but this is different from having a separate pass. I understand this as the need to primitives/utility functions that can be involved during lowering to apply the layout rules.

We fundamentally can't have the lowering emitting invalid IR and a pass that operates on invalid IR.

First of all, I dont think I would classify it as illegal IR if the struct type is not laid out. It has to be laid before serialization of the SPIR-V module, but I dont think we should treat the IR where struct types are not laid out after lowering (i.e. some dialect -> SPIR-V dialect) as illegal. It just has to happen before the serialization. So the pass in this CL could be run "just before" the serialization step. If we do want to consider IR that doesnt have the struct laid out as illegal IR, the lowering (i.e. some dialect -> SPIR-V lowering) need not explicitly worry about the layout. Then before the lowering returns, it can invoke this pass to lay out the struct. So the lowering would create "legal" IR after return. The verifier can be enhanced to check that the structs are indeed laid out.

I am not sure I follow what the issue being raised here is. The pass is still useful to have. You can call it a "legalization pass" and the decision of when to invoke it is a separate concern

joker-eph commented 5 years ago

First of all, I dont think I would classify it as illegal IR if the struct type is not laid out

Well OK, but then that's a different story now. You originally wrote in the bug "I agree, it should be part of verifier too", were you using "verifier" with a different meaning than the MLIR verifier?

I am not sure I follow what the issue being raised here is.

The issue is one of invariant and IR design and make sure things are properly considered documented (which they may be, but it isn't clear to me from reading the bug, and I may have missed the documentation in the repo?). From what I read in the bug (and your explanation here: " It just has to happen before the serialization"), it is not valid in the SPIR-V spec to not have the layout. If so, we may be saying that in MLIR it is legal to not have it, but that means that no transformation or analysis that operate on SPIR-V will be able to assume that a layout is present and everywhere where it matter you will have to check if it is present. This may or may not be intended, but it probably deserves to be considered and spelled out. Deviation from the spec like this may not be intuitive for folks writing passes on SPIR-V dialect, so we should just be careful (in the doc and the API design) to make is hard to mis-use.

If we do want to consider IR that doesnt have the struct laid out as illegal IR, the lowering [...] can invoke this pass to lay out the struct. So the lowering would create "legal" IR after return. The verifier can be enhanced to check that the structs are indeed laid out.

That may be an OK workflow, but this should not be exposed as a pass: it is never appropriate for a pass to take as input illegal IR by definition.

I need to dig a bit more into the layout question here, I only had a superficial look on this so far.

MaheshRavishankar commented 5 years ago

@joker-eph : Fair point, I am not being very clear here. Let me restate

1) The SPIR-V dialect is meant to represent the SPIR-V spec, but its not a perfect mirror. There already many deviations to make it an amenable to optimization passes. You are right in saying we should have a clear spec of what is legal in SPIR-V dialect, and the verifier should check those. 2) At serialization time we have obviously make sure the serialized binary matches the SPIR-V spec. The serialization should catch any deviations, but the SPIR-V dialect spec should be such that a module with valid SPIR-V spec should always be serializable.

Coming to the layout question here, one way to view this is that all spirv::StructTypes will get some default layout. I don't think we have explicitly stated this anywhere, but it seems like using a C-style layout would be the way forward, and thats what is being done in the pass. So by default lowering passes don't need to explicitly specify the layout if they want to use the default (but they are free to specify a custom layout for a struct). Before serialization though, we need to materialize the default layout. This pass is doing that. Requiring that all lowering always specify the layout will probably result in a lot of redundancy since most common case would be what is being done here.

Given the above, I was wrong in saying that we should check that the layout is specified in the verifier. If the approach above is OK, then we dont need to. The serializer can report an error if the layout isn't specified for a struct.

Disclaimer: This is my understanding. @antiagainst might have more insight.

joker-eph commented 5 years ago

but the SPIR-V dialect spec should be such that a module with valid SPIR-V spec should always be serializable.

My understanding was that there was a clear goal for the SPIR-V dialect to have a "straightforward" translation to/from SPIR-V binary format. While I agree with "deviations to make it an amenable to optimization passes" (I argued in this direction in the past), this is different from having a valid SPIR-V IR that can't be directly serialized.

antiagainst commented 5 years ago

Sorry my previous reply on the bug wasn't clear enough. Let me expand it with more details. :)

A SPIR-V binary module contains not only the logic of the program (called shader/kernel in graphics/copute term), but also instructions for aiding the "interface" between the program and the application (or runtime in ML term). Examples for the former include OpFunction, OpIAdd, etc. Examples for the later include capabilities/extensions needed by this module (specified via OpCapability, OpExtension), layout information for resources (specified as OpDecorate on struct/array types), entry points (specified via OpEntryPoint, OpExecutionMode). Instructions for the later are at the very beginning of a binary module so that when a Vulkan driver trying to compile this module into its own ISA, it can reject early if the interface is incorrect instead of going through all the logic. Let's call the former as logic instructions and the later as interface instructions for the purpose of the discussion here. (There are also debug instructions not falling into either category, but let's ignore them for now.)

For logic instructions the verification is clear and unconditional. An OpIAdd can only be used adding two scalar or vector of integer values; anything other than that is wrong. But for interface instructions, it depends on the execution environment: Vulkan may require one way, and OpenCL may require another. Note that the requirement of "Composite objects in the StorageBuffer, PhysicalStorageBuffer, Uniform, and PushConstant Storage Classes must be explicitly laid out" is in SPIR-V spec "2.16.2. Validation Rules for Shader Capabilities". So it should only be enforced when a SPIR-V module requires Shader capability, which is the case for Vulkan. It's valid to have a module without those annotations if we don't know the execution environment or it's one that does not support Shader capability. Also, these interface instructions are typically "duplicated" information from the main logic (like required extension or capability, which we can deduce by going over all instructions in a module) or do not have a unique way (for example any void(void) function in the module can potentially be an entry point for Vulkan).

So for a pass, we should have the assumption that all ops corresponding to logic instructions are valid. But for information corresponding to interface instructions, I think most passes don't really care about them. (General passes like const folding, CSE, or SPIR-V specific ones like freezing spec constant won't behave differently w.r.t. weather we target Vulkan 1.0 or Vulkan 1.1 or OpenCL.) To make a rough analogy with LLVM, the interface instructions are more like metadata from IR's perspective but are required for the interface with the application. I don't think we want to burden all of these passes to update the required capabilities or annotating layouts when generating new ops, like we have no guarantee regarding metadata in LLVM passes. It's better to add them with dedicated transformations, which will require external input regarding the application (e.g., what execution environment we are targeting and what layout rules we want to use). We don't have the support for different target execution environments right now; but we surely need and will add that for supporting like Vulkan 1.0 vs Vulkan 1.1.

joker-eph commented 5 years ago

Thanks Lei for the great explanation!

To make a rough analogy with LLVM, the interface instructions are more like metadata from IR's perspective but are required for the interface with the application.

LLVM metadata can always be dropped, is this the case here? LLVM has module flag which can be semantically meaningful and are likely closer to "interface" as here.

I don't think we want to burden all of these passes to update the required capabilities or annotating layouts when generating new ops, like we have no guarantee regarding metadata in LLVM passes.

If you present the capabilities and/or the layout are purely redundant with the IR, I'd argue to strip them entirely on import and add them on export. If you can't strip them, why? Are they meaningful then? If they are then the corollary seems like passes would need to maintain these informations no?

MaheshRavishankar commented 5 years ago

To unblock this work, can we reach a consensus. If having a pass is problematic, then maybe convert all the layout assignment logic here into a utility function. For now we can invoke this in the serializer to make sure the generated binary is spec compliant?

MaheshRavishankar commented 5 years ago

@denis0x0D : Looked through your changes. Seem OK to me. Will wait for response about whether having it as a pass or utility function from @antiagainst , @joker-eph to approve.

antiagainst commented 5 years ago

LLVM metadata can always be dropped, is this the case here?

It cannot. So I said this is a "rough analogy" to LLVM. :)

LLVM has module flag which can be semantically meaningful and are likely closer to "interface" as here.

That is good to know! Thanks for pointing this out!

If you present the capabilities and/or the layout are purely redundant with the IR, I'd argue to strip them entirely on import and add them on export. If you can't strip them, why? Are they meaningful then? If they are then the corollary seems like passes would need to maintain these informations no?

I think it depends on what we are talking about.

The layout information is decorated onto the type (both in the spec and in the SPIR-V dialect). If we have two structs with identical members but different layout, they are two different types inside SPIR-V. If we strip the information during import, we are losing this difference. Besides, there can exist multiple layout schemes and for different storage classes we can have different rules. If we strip them out during import and then re-export, we can generate modules that do not match with the original layout.

For layout I think it does not matter for passes since the layout (and other decorations on types) is baked into the type itself. So passes can operate on spv.structs with or without layout info similarly. I think we just need a dedicated pass to turn "layoutless" types into ones with layout (it will need CLI options for which layout scheme to use from the developer) at the final stage when we know about the execution environment so that we generate valid modules according to that execution environment.

For required extensions and capabilities, I was actually debating over it but settled down on the current approach: allow setting the required extensions/capabilities as attributes on spv.module. The reasons behind are:

Regarding the interaction with passes, I think most passes don't care either. We just need a dedicated pass at the final stage to go over all opcodes and operands to collect the needed extensions/capabilities and attach to the module.

antiagainst commented 5 years ago

Will wait for response about whether having it as a pass or utility function

I think making this as a pass makes sense for me. It's not that we are having wrong IRs in the middle; I think it as we are going from generic to specific and attaching more information along the way. (You can also view this as "lowering" I think.) Validations at different levels kicks in at different stages. :)

denis0x0D commented 5 years ago

IMHO, it's better to have a pass then hook in some other places, because it would be easy for me to assign my own memory layout (in case I GPU vendor and know which layout gives the best performance) and therefore easy to maintain, I'll not have all that pain while syncing my copy with trunk version. Thanks.

joker-eph commented 5 years ago

To unblock this work, can we reach a consensus. If having a pass is problematic,

I only saw this problematic if this was meant to be a verifier failure, which it isn't apparently, so this voids any concerns I had with having it as a pass.

joker-eph commented 5 years ago

If you present the capabilities and/or the layout are purely redundant with the IR, I'd argue to strip them entirely on import and add them on export. If you can't strip them, why? Are they meaningful then? If they are then the corollary seems like passes would need to maintain these informations no?

I think it depends on what we are talking about.

The layout information is decorated onto the type (both in the spec and in the SPIR-V dialect). If we have two structs with identical members but different layout, they are two different types inside SPIR-V.

I already see a problem with the wording here, you wrote that "the layout information is decorated onto the type", from which I take that that layout is not part of the type ; but then added that "two structs with identical members but different layout [..] are two different types".

If we strip them out during import and then re-export, we can generate modules that do not match with the original layout.

If you add layout on "layout less" types, then you are also not preserving the ability to generate modules that match the original layout.

Besides, there can exist multiple layout schemes and for different storage classes we can have different rules. If we strip them out during import and then re-export, we can generate modules that do not match with the original layout.

This link is interesting, it indicates that the frontend has to pick the layout when lowering to SPIR-V if I understand correctly. How can this be compatible with this pass then?

For layout I think it does not matter for passes since the layout (and other decorations on types) is baked into the type itself. So passes can operate on spv.structs with or without layout info similarly.

This is not clear to me. Canonicalization would need to account for the layout and properly propagate it / take it into account when rewriting operation / changing type (mem2reg kind of transformations for instance).

I think we just need a dedicated pass to turn "layoutless" types into ones with layout

You haven't made it clear why is "layout less" even a thing at this point? I think this is probably one critical piece I am missing right now.

(it will need CLI options for which layout scheme to use from the developer) at the final stage when we know about the execution environment so that we generate valid modules according to that execution environment.

It isn't clear to me that the layout isn't a contract with the host at some point, and that you can just do this...

denis0x0D commented 5 years ago

@antiagainst @MaheshRavishankar thanks a lot for review, updated according to the comments.

antiagainst commented 5 years ago

@joker-eph:

I already see a problem with the wording here, you wrote that "the layout information is decorated onto the type", from which I take that that layout is not part of the type ; but then added that "two structs with identical members but different layout [..] are two different types".

I think the terminology might be confusing here. Sorry about that!

Decoration is a SPIR-V spec term. Using struct type as an example, in the binary format, the main type is defined using OpTypeStruct. The layout information is decorated to the result id generated from OpTypeStruct with OpDecorate/OpMemberDecorate:

OpDecorate %0 Block
OpMemberDecorate %0 0 0
OpMemberDecorate %0 1 4
...
%int = OpTypeInt 32 0
%0 = OpTypeStruct %int %int
...

These decorations are semantic impacting. Although represented using different instructions, they should be considered as part of the type. We cannot load a value from a pointer to the above type as a struct without layout decorations. Structs with different layout decorations are different types.

In the SPIR-V dialect, such decorations are baked into the spirv::StructType: we have fields for them in the StructTypeStorage. The above is represented as

!spv.struct<i32 [0], i32 [4]>

You haven't made it clear why is "layout less" even a thing at this point? I think this is probably one critical piece I am missing right now.

Let me rephrase and provide more context to make it clear. :)

SPIR-V is intended to serve multiple APIs; Vulkan and OpenCL are two of them.

For Vulkan, the Shader capability is required. The Shader capability requires "Composite objects in the StorageBuffer, PhysicalStorageBuffer, Uniform, and PushConstant Storage Classes must be explicitly laid out." (Spec 2.16.2). For OpenCL, the Kernel capability is required. The Kernel capability does not have the above requirement, i.e., composite types in those storage classes are not required to be laid out.

So, "layoutless" is the way for OpenCL. Passes for OpenCL-flavored SPIR-V operate on struct types without layout information. For Vulkan, the purpose of these layout information decoration (SPIR-V spec term again) is to match with the data layout in memory, which is set up and populated using C++ API. These information might be useful for driver compilers given that they will further compile SPIR-V into architecture-specific ISA with physical pointers, but they should not affect how transformations inside MLIR work, which is on logical addressing model. They are just additional information put in the SPIR-V module for further consumers (driver compilers) to consider when working together with the application code (written in C++). In a sense, it's like saying one function in the SPIR-V module is an entry point with name foo so the driver compiler can properly expose it when compiling the shader for compute pipeline. We just need to add such information at the end of the conversion flow inside MLIR (going from a high-level dialect). Normal passes should not care about it; they can just operate on a "layoutless" version.

Does the above make sense?

This is not clear to me. Canonicalization would need to account for the layout and properly propagate it / take it into account when rewriting operation / changing type (mem2reg kind of transformations for instance).

Is this resolved with the explanation in the above?

This link is interesting, it indicates that the frontend has to pick the layout when lowering to SPIR-V if I understand correctly. How can this be compatible with this pass then?

From a compiler's perspective, I don't think it's the frontend that should pick the layout; it's the SPIR-V backend. (The link I posted previously is for the SPIR-V backend inside DXC.)

How data is laid out in memory is external knowledge to the compiler. The compiler won't know at all if without input from the developer: the developer may want to use Vulkan rules to lay out the data, or may want to use DirectX rules (for interop with DirectX), or OpenGL rules (for interop with OpenGL), or whatever. These different rule set calculation and decoration should be provided as libraries so the developer can choose and use to match the layout they want to use on API/runtime side. This pass can be such a library for this purpose. Yes right now we only support Vulkan rules because that's what we are interested currently, but in the future I think this pass will be extended to support more rule sets.

It isn't clear to me that the layout isn't a contract with the host at some point, and that you can just do this...

Hopefully the above provides more context. Yes it's a contract with the host, so we need to get input from the developer and put the external information inside the module so further SPIR-V consumers have access to those information. Am I answering your question here?

If you add layout on "layout less" types, then you are also not preserving the ability to generate modules that match the original layout.

I'm not sure I get your question here fully. To target Vulkan, "layoutless" is more of an intermediate state. We need to add layout information at the end of the SPIR-V lowering. If we are importing a real-world SPIR-V blob, then those layout should already be inside the module and we should preserve them.

antiagainst commented 5 years ago

Sorry, clicked the wrong button...

denis0x0D commented 5 years ago

rebased on master to avoid "conflicting file" issue.

antiagainst commented 5 years ago

One final nit, could you update and elaborate the commit message a bit? Thanks!

denis0x0D commented 5 years ago

@antiagainst updated, thanks!

denis0x0D commented 5 years ago

@antiagainst rebased on master to follow up the commit https://github.com/tensorflow/mlir/commit/cea968877b68b0bca7d7e40cbae2e6cac741aec4

joker-eph commented 5 years ago

You haven't made it clear why is "layout less" even a thing at this point? I think this is probably one critical piece I am missing right now.

Let me rephrase and provide more context to make it clear. :)

SPIR-V is intended to serve multiple APIs; Vulkan and OpenCL are two of them.

For Vulkan, the Shader capability is required. The Shader capability requires "Composite objects in the StorageBuffer, PhysicalStorageBuffer, Uniform, and PushConstant Storage Classes must be explicitly laid out." (Spec 2.16.2). For OpenCL, the Kernel capability is required. The Kernel capability does not have the above requirement, i.e., composite types in those storage classes are not required to be laid out.

OK, but at this point the world seems fairly static: if you have the kernel capability you don't need layouts, otherwise you do. We're not yet to a point where we need to add a "default" one generically in the SPIR-V dialect itself.

I'm also curious if this is only valid in OpenCL for structure that don't communicate with the host and are only used on the device? Otherwise it isn't clear to me how do you get the data back on the host.

but they should not affect how transformations inside MLIR work, which is on logical addressing model.

It isn't clear to me that this is entirely true: are you saying that I can't emit SPIR-V code that includes some sort of casting that would rely on the layout being preserved? In C/C++, when struct are layout by clang into LLVM, the code is definitely not operating symbolically (in the simpler case: you can cast a pointer to a struct into an i8* and index into any member using the layout).

They are just additional information put in the SPIR-V module for further consumers (driver compilers) to consider when working together with the application code (written in C++).

Aren't they contract with the application code?

This is not clear to me. Canonicalization would need to account for the layout and properly propagate it / take it into account when rewriting operation / changing type (mem2reg kind of transformations for instance).

Is this resolved with the explanation in the above?

Not really: in LLVM the layout is used and taken into account by the passes (using the DataLayout). Years ago LLVM was trying to be agnostic and have passes be able to operate without datalayout, but in practice it didn't work out well (when you get into details of assumptions analysis have to make).

This link is interesting, it indicates that the frontend has to pick the layout when lowering to SPIR-V if I understand correctly. How can this be compatible with this pass then?

From a compiler's perspective, I don't think it's the frontend that should pick the layout; it's the SPIR-V backend. (The link I posted previously is for the SPIR-V backend inside DXC.)

This does not make sense to me, so I probably misunderstand what you wrote above. If the layout is a contract with the application code, then this contract has to be expressed by the application and cannot be picked a GPU driver independently of the application, otherwise in case of mismatch the host couldn't read the data. That is unless struct aren't exposed to the application code directly and

How data is laid out in memory is external knowledge to the compiler. The compiler won't know at all if without input from the developer: the developer may want to use Vulkan rules to lay out the data, or may want to use DirectX rules (for interop with DirectX), or OpenGL rules (for interop with OpenGL), or whatever.

This also seems to me to contradict the "I don't think it's the frontend that should pick the layout; it's the SPIR-V backend". Unless a SPIR-V kernel is supposed to be portable across Vulkan/OpenGL/OpenCL and only at runtime when you load SPIR-V into the driver you pick your model? (I doubt it considering how much specificity for each use-case is baked into the IR).

It isn't clear to me that the layout isn't a contract with the host at some point, and that you can just do this...

Hopefully the above provides more context. Yes it's a contract with the host, so we need to get input from the developer and put the external information inside the module so further SPIR-V consumers have access to those information. Am I answering your question here?

You're partially answering, but this is leaving me very confused about the whole story: it just does not line up for me at the moment.

If you add layout on "layout less" types, then you are also not preserving the ability to generate modules that match the original layout.

I'm not sure I get your question here fully. To target Vulkan, "layoutless" is more of an intermediate state. We need to add layout information at the end of the SPIR-V lowering. If we are importing a real-world SPIR-V blob, then those layout should already be inside the module and we should preserve them.

What I was saying is that if you import a SPIR-V module that is "layoutless" and you always add layout as part of your flow, then your not preserving the "layoutless" aspect of it.

antiagainst commented 5 years ago

OK, but at this point the world seems fairly static: if you have the kernel capability you don't need layouts, otherwise you do. We're not yet to a point where we need to add a "default" one generically in the SPIR-V dialect itself.

I'm not sure I get your idea here. Having layout information absent does not mean we force a "default" layout; it's simply we don't have such information.

I'm also curious if this is only valid in OpenCL for structure that don't communicate with the host and are only used on the device? Otherwise it isn't clear to me how do you get the data back on the host.

I'm less familiar with the OpenCL-flavored SPIR-V. My understanding is that OpenCL has relatively well defined layout rule (following C/C++) and that can be represented with a single decoration, e.g. CPacked, on the struct type. So there is no need to explicitly specify the offset of each struct member. On graphics side because of historical reasons and interop between different APIs, we have multiple rule sets and lots of flexibility here so need to explicitly specify everything.

It isn't clear to me that this is entirely true: are you saying that I can't emit SPIR-V code that includes some sort of casting that would rely on the layout being preserved? In C/C++, when struct are layout by clang into LLVM, the code is definitely not operating symbolically (in the simpler case: you can cast a pointer to a struct into an i8* and index into any member using the layout).

You can do pointer arithmetics to index into a struct/array member (which requires layout information) for OpenCL-flavored SPIR-V. But Vulkan supports logical addressing model; it has much restricted pointer usage. You cannot cast a struct/array pointer to an integer and then do pointer arithmetics to get its member. (You can find the list of casting ops in 3.32.11. Conversion Instructions. Note that those related to pointer <-> integer casting requires Kernel or Addresses capability, which is for OpenCL.) The way to index into a struct is use OpAccessChain, which is still logical addressing that layout does not matter (at SPIR-V level).

Aren't they contract with the application code?

Yes, they are contract with the application. So they are useful for further SPIR-V consumers (the driver compiler) when trying to "link" with the application. :) I don't quite see what's the issue with my original wording?

Not really: in LLVM the layout is used and taken into account by the passes (using the DataLayout). Years ago LLVM was trying to be agnostic and have passes be able to operate without datalayout, but in practice it didn't work out well (when you get into details of assumptions analysis have to make).

Oh, I think I understand your concerns better now by knowing the LLVM history on this. Yeah, C/C++ compilers cannot avoid the layout issue because it's rooted in the language. I guess this will be more of an issue if we want to fully support OpenCL, which has a language more akin to C/C++. For Vulkan, shader languages are fairly restricted and as said before, you cannot be very creative on pointers so it's less a concern.

However, I still see the benefits of this functionality being a pass. While we are focusing on Vulkan right now, it can serve both Vulkan and OpenCL. Being a pass means it can be chosen to apply at a suitable time. For OpenCL maybe we can apply it at the very beginning so following OpenCL specific passes can have concrete layout to operate on. For Vulkan, we just need it at the very last to make sure these "contract information for application" from the developer is carried into the final module.

This does not make sense to me, so I probably misunderstand what you wrote above. If the layout is a contract with the application code, then this contract has to be expressed by the application and cannot be picked a GPU driver independently of the application, otherwise in case of mismatch the host couldn't read the data. That is unless struct aren't exposed to the application code directly and

Yes it's a contract between the shader program and the application and yes they need to match. (Actually lots of issues/crashes are because they don't match. ;-P) But we are not letting the GPU driver to pick a layout independently of the application. Instead, it's the developer, who writes both the program and the application, provides the information to the shader compiler to get the information embedded in the final module and then the driver pick this information up. From the shader program and its compiler's perspective, how data is put in memory on the application side is completely unknown; only the developer knows that information. So we are requiring the developer to specify that information via a CLI option (or a pass creation parameter or whatever equivalent).

This also seems to me to contradict the "I don't think it's the frontend that should pick the layout; it's the SPIR-V backend". Unless a SPIR-V kernel is supposed to be portable across Vulkan/OpenGL/OpenCL and only at runtime when you load SPIR-V into the driver you pick your model? (I doubt it considering how much specificity for each use-case is baked into the IR).

Maybe I'm using some terminology wrongly and results in confusion... My previous comment was regarding DXC implementation specifically. The layout selection happens in the SPIR-V CodeGen there. But I guess yes you can also specify the layout directly in the source code so that qualifies as selecting the layout in a "frontend". (Although if talking about frontend I think more about compiler frontend that turns a source code into an AST, instead of the source code itself.)

You're partially answering, but this is leaving me very confused about the whole story: it just does not line up for me at the moment.

I'm sorry for the confusion! I'm actually also a bit confused about what's your deepest concerns after we've clarified that the input to this pass should not be considered as invalid. We should have some time to chat over this. :)

What I was saying is that if you import a SPIR-V module that is "layoutless" and you always add layout as part of your flow, then your not preserving the "layoutless" aspect of it.

Existing SPIR-V blobs for Vulkan must be laid out, so such case (importing a "layoutless" module) should not exist.

joker-eph commented 5 years ago

OK, but at this point the world seems fairly static: if you have the kernel capability you don't need layouts, otherwise you do. We're not yet to a point where we need to add a "default" one generically in the SPIR-V dialect itself.

I'm not sure I get your idea here. Having layout information absent does not mean we force a "default" layout; it's simply we don't have such information.

I am saying that if the capability requires a layout, then you should already have this information in the IR, and you shouldn't need a pass that adds it after. If your capability does not require a layout, then you don't need a pass to add it.

Aren't they contract with the application code?

Yes, they are contract with the application. So they are useful for further SPIR-V consumers (the driver compiler) when trying to "link" with the application. :) I don't quite see what's the issue with my original wording?

I see a contradiction between saying that we can add a layout after the fact and the claim that it is a contract. The fact that it is a contract with the host application implies to me that the lowering from the GPU dialect to SPIR-V dialect must provide a layout.

This does not make sense to me, so I probably misunderstand what you wrote above. If the layout is a contract with the application code, then this contract has to be expressed by the application and cannot be picked a GPU driver independently of the application, otherwise in case of mismatch the host couldn't read the data. That is unless struct aren't exposed to the application code directly and

Yes it's a contract between the shader program and the application and yes they need to match.

I sniped most of what you wrote, but I'm still reading a contradiction in all the parts I omitted and this last sentence.

(Actually lots of issues/crashes are because they don't match. ;-P) But we are not letting the GPU driver to pick a layout independently of the application. Instead, it's the developer, who writes both the program and the application, provides the information to the shader compiler to get the information embedded in the final module and then the driver pick this information up. From the shader program and its compiler's perspective, how data is put in memory on the application side is completely unknown; only the developer knows that information. So we are requiring the developer to specify that information via a CLI option (or a pass creation parameter or whatever equivalent).

Right, so again: this informations comes from the higher level frontend/framework during lowering, and it shouldn't be added after the fact, since it is a contract with "something else" (the application code) which isn't visible from the SPIR-V module. I still don't see the use-case for this pass right now.

This also seems to me to contradict the "I don't think it's the frontend that should pick the layout; it's the SPIR-V backend". Unless a SPIR-V kernel is supposed to be portable across Vulkan/OpenGL/OpenCL and only at runtime when you load SPIR-V into the driver you pick your model? (I doubt it considering how much specificity for each use-case is baked into the IR).

Maybe I'm using some terminology wrongly and results in confusion... My previous comment was regarding DXC implementation specifically. The layout selection happens in the SPIR-V CodeGen there. But I guess yes you can also specify the layout directly in the source code so that qualifies as selecting the layout in a "frontend". (Although if talking about frontend I think more about compiler frontend that turns a source code into an AST, instead of the source code itself.)

By frontend here, I mean anything that generates the SPIR-V dialect (just like when you come from a higher-level, targeting something like SPIR-V is a backend): multi-layer is hard :)

What I was saying is that if you import a SPIR-V module that is "layoutless" and you always add layout as part of your flow, then your not preserving the "layoutless" aspect of it.

Existing SPIR-V blobs for Vulkan must be laid out, so such case (importing a "layoutless" module) should not exist.

OK, so I think at least one thing is clear: this excludes that this pass has any use when the input is a SPIR-V binary.

antiagainst commented 5 years ago

More on this. :) After the above comments and an initial offline chat, I think the questions still remaining are:

1) Where does this layout information come from. 2) What's the use case for this pass.

Here are my thoughts on these two questions:

For 1), the layout information comes from an external source. By external, I mean external to the SPIR-V module itself. We are not inferring the layout information by using the SPIR-V module alone; it is given from an external source.

Because of progressive implementation, the current implementation of this pass does not reveal that, which I guess is causing part of the confusion. It should have an CLI option to let developers specify which layout rule to use in the future. Or as discussed and suggested offline by @joker-eph, the information can also be an attribute attached to some high-level dialect coming as input to SPIR-V lowering, assuming there is an "driver" (like IREE) coordinating both the kernel and runtime side. Then the driver can decide which layout it want to use on runtime side and put that information onto the function to be compiled into SPIR-V module.

Still, no matter which approach to take here (CLI options or function attributes), they are both external inputs to this pass.

For 2), we need to think about a few workflows.

If importing an existing SPIR-V blob, as said previously, it should already be laid out. So we don't need this pass here.

If going from some high-level dialect to SPIR-V for ML, then it falls into the case I said for 1). If we have a function attribute or something from the high-level dialect, then great we can use that. Here we still have a use case for this pass: it can consume that attribute and decorate with explicit layout. The benefits I see is we have better separated concerns and more clear boundaries. The lowering pass can just focus on turning a function into a spv.module and let this pass worry about all the layout calculation details. This pass is just another step in the lowering procedure and it's also reusable for other purposes like the following.

If we are using the SPIR-V dialect inside a shading language (like HLSL) compiler, then the input shading language may not contain layout rule. (This is the case as of today in HLSL. It is a language for DirectX but because it's widely used we also want to adopt it for Vulkan. So unsurprisingly, Vulkan specific annotations are missing in the source code sometimes. Even say in the future we add it, vendors are not always free to use because those annotations will be new and not accepted by existing HLSL compilers like fxc.exe so causing trouble for targeting multiple APIs.) Then we have no where to get such information; we can only require the developer to specify it via a CLI option. Then a pass like this is needed to calculate and attach the explicit layout according to the rule.

joker-eph commented 5 years ago

Since I still don't understand the flow here, I think we should raise this to the mailing-list and clarify. I'll start an email thread there.

Thanks for all the information though.

joker-eph commented 5 years ago

If we are using the SPIR-V dialect inside a shading language (like HLSL) compiler, then the input shading language may not contain layout rule.

To me this is the role of the HLSL compiler itself when in the IR it create the SPIR-V type to provide the layout in a way that is compatible with the execution environment (just like Clang layout C struct based on the target ABI, LLVM does not do it later)