Closed nmostafa closed 4 years ago
There is also quite a lot of missing documentation on what this type represents, e.g. in the langref.
Thanks for the quick feedback. @River707 , yes, I will add documentation once there is consensus on the change.
@nicolasvasilache I agree but cannot think of a good way to trigger the conversion. We could possibly unify the ABI for all external calls (if the FuncOP has no region, it is an external call) to receive unranked descriptor, but that seems restrictive. Users may want to control what type the external call receives.
Unranked MemRefs have no practical use for std internal calls at the moment, since there is no std instruction to extract the rank, but that can be added later so it can be possible to write type-generic std functions.
Other thoughts are welcome
We could possibly unify the ABI for all external calls (if the FuncOP has no region, it is an external call) to receive unranked descriptor, but that seems restrictive. Users may want to control what type the external call receives.
We had this discussion internally and the implicitness wouldn't work for at least 2 reasons:
Could we please make it super explicit both in LangRef.md where memref are defined and in ConversionToLLVMDialect.md that this is a type that is only used for lowering and avoiding the need for multi-versioning external library calls.
Rest looks good to me once review comments are addressed.
Thanks for doing this!
Sorry one more rebase, there are conflicts now.. Let's land this first and do leave naming for bikeshedding.
Currently OOO, but unless there is a strong commitment to fix the naming immediately after submitting I am strongly against landing this as is.
@River707 sorry this catches you while on PTO .. Did you see @bondhugula's rationale above and name proposition + my comment?
Currently OOO, but unless there is a strong commitment to fix the naming immediately after submitting I am strongly against landing this as is.
I am happy to fix the naming after, and for this PR I can use
BaseMemRefType
,MemRefType
andUnrankedMemRefType
I’d rather not have this submitted until there is consensus on where we are going.
I’d rather not have this submitted until there is consensus on where we are going.
Noted, let's table this until Monday when everyone is hands on deck.
Thanks!
We had an internal discussion with @River707. We agreed it is fine to proceed with this CL as is provided more extensive documentation is added. The NFC renaming can be left for a future time if the following rationale changes.
In particular the following rationale points should be made and distributed in proper locations in the doc.
Codegen is concerned with generating loop nests and specialized instructions for high-performance, unranked memref is concerned with hiding the rank and thus, the number of enclosing loops required to iterate over the data. One possible path performing codegen with unranked memrefs is to cast into a ranked type (point 3 below). Another possible path is to emit a single while loop conditioned on a linear index and perform delinearization of the linear index to a dynamic array containing the (unranked) indices. While this is possible, it is expected to not be a good idea to perform this during codegen: the cost of the translations is expected to be prohibitive and optimizations at this level are not expected to be worthwhile. If expressiveness is the main concern, irrespective of performance, passing unranked memrefs to an external C++ library and implementing rank-agnostic logic there, is expected to be significantly simpler.
This is the main objective of this PR, comments and suggestions have been made already, they should be extracted in proper docs.
Unranked memrefs may provide expressiveness gains in the future and help bridge the gap with unranked tensors. Unranked memrefs will still not expected to be exposed to codegen but one may query the rank of an unranked memref (a special op will be needed for this purpose) and perform a switch + cast to a ranked memref as a prerequisite to codegen.
@nmostafa please let use know if this makes sense and whether you could beef up the LangRef and doc in relevant places so we can push this forward.
Thank you!
Thanks @nicolasvasilache . I will incorporate those points in the doc. Couple of questions:
Another possible path is to emit a single while loop conditioned on a linear index and perform delinearization of the linear index to a dynamic array containing the (unranked) indices. Are you assuming the rank and dims can be extracted from the unranked memref ? (i.e. ability to compute the dynamic size)
What's the consensus on the class hierarchy ? Currently we have:
ShapedType --- BaseMemRefType --- MemRefType
|----------- UnRankedMemRefType
Thanks @nicolasvasilache . I will incorporate those points in the doc. Couple of questions:
Another possible path is to emit a single while loop conditioned on a linear index and perform delinearization of the linear index to a dynamic array containing the (unranked) indices. Are you assuming the rank and dims can be extracted from the unranked memref ? (i.e. ability to compute the dynamic size)
What's the consensus on the class hierarchy ? Currently we have:
ShapedType --- BaseMemRefType --- MemRefType |----------- UnRankedMemRefType
That is fine for now.
Are you assuming the rank and dims can be extracted from the unranked memref ? (i.e. ability to compute the dynamic size)
Yes, it would require this type of support. I have spelled out what would need to be done in this scenario so that the tradeoff is discussed. This is by no means an endorsement of going for this approach in MLIR :).
Yes, it would require this type of support. I have spelled out what would need to be done in this scenario so that the tradeoff is discussed. This is by no means an endorsement of going for this approach in MLIR :).
Given that this code-gen path is not feasible (yet), do you think it should be mentioned in the docs ?
I'll let River weigh in on this.
I was seeing this as a part of a "rationale / design alternative considered" section. This is more a justification that it makes sense to not have a correspondence between tensor types and memref types.
On Mon, Dec 2, 2019 at 5:48 PM Nagy Mostafa notifications@github.com wrote:
Yes, it would require this type of support. I have spelled out what would need to be done in this scenario so that the tradeoff is discussed. This is by no means an endorsement of going for this approach in MLIR :).
Given that this code-gen path is not feasible (yet), do you think it should be mentioned in the docs ?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tensorflow/mlir/pull/261?email_source=notifications&email_token=ACNNU5GZBE6G4ZYZDHAMULDQWWGDXA5CNFSM4JQW7OC2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFWS7UY#issuecomment-560803795, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNNU5AQIGWJBE7YXTCI5ZDQWWGDXANCNFSM4JQW7OCQ .
-- N
I have update the docs. Please let me know if anything else is needed.
We can still mention it in the docs as it is part of the rationale(or atleast intended rationale) of the thing being added.
Since River asked for more changes, please also run clang-format after you're done with those.
I see one fail, but have no access to the Details. Please let me know if anything is needed on my side. Thanks.
clang-format still complains...
There are also breakages on integrating this.
Thanks, Alex. I don't any see any further clang-format changes on my end, using clang-format 10.0. What sort of breakages ?
They are internal things ;) , giving a shot at addressing now.
This change adds a new unranked MemRef type, and runtime descriptor, to represent MemRefs with unknown compile-time rank. This enables APIs to accept a MemRef of any rank. The element type and memory space are still static in the type. Unranked MemRefs can be cast from/to using memref_cast op:
The LLVM descriptor consists of a runtime dynamic rank and a pointer to ranked MemRef
llvm<"{ i64, i8* }
. When passing an unranked descriptor, the callee can query the rank and parse the MemRef descriptor pointer.When creating an unranked descriptor from a ranked one, we alloca and copy the ranked src descriptor, then create an unranked descripto and set the pointer to the allocated location. When casting ranked to unranked, we extract the ranked descriptor pointer and load it.
More context on the mailing list.
Caveats/ToDo: