tensorflow / mlir

"Multi-Level Intermediate Representation" Compiler Infrastructure
1.73k stars 257 forks source link

Unranked MemRef Type #261

Closed nmostafa closed 4 years ago

nmostafa commented 4 years ago

This change adds a new unranked MemRef type, and runtime descriptor, to represent MemRefs with unknown compile-time rank. This enables APIs to accept a MemRef of any rank. The element type and memory space are still static in the type. Unranked MemRefs can be cast from/to using memref_cast op:

// unranked to ranked
%0 = memref_cast %arg : memref<*xf32> to memref<?x?x10x2xf32>
// ranked to unranked. 
%1 = memref_cast %arg : memref<42x2x?xf32> to memref<*xf32>

The LLVM descriptor consists of a runtime dynamic rank and a pointer to ranked MemRef llvm<"{ i64, i8* }. When passing an unranked descriptor, the callee can query the rank and parse the MemRef descriptor pointer.

When creating an unranked descriptor from a ranked one, we alloca and copy the ranked src descriptor, then create an unranked descripto and set the pointer to the allocated location. When casting ranked to unranked, we extract the ranked descriptor pointer and load it.

More context on the mailing list.

Caveats/ToDo:

  1. Currently unranked to unranked casting and returning unranked memrefs is not supported. Will require a memcpy, especially for the latter where the ranked descriptor will be on the callee's stack.
  2. Undefined behavior when casting unranked to ranked. In theory it is possible to do a runtime check on the rank and shape.
  3. Documentation update.
River707 commented 4 years ago

There is also quite a lot of missing documentation on what this type represents, e.g. in the langref.

nmostafa commented 4 years ago

Thanks for the quick feedback. @River707 , yes, I will add documentation once there is consensus on the change.

@nicolasvasilache I agree but cannot think of a good way to trigger the conversion. We could possibly unify the ABI for all external calls (if the FuncOP has no region, it is an external call) to receive unranked descriptor, but that seems restrictive. Users may want to control what type the external call receives.

Unranked MemRefs have no practical use for std internal calls at the moment, since there is no std instruction to extract the rank, but that can be added later so it can be possible to write type-generic std functions.

Other thoughts are welcome

nicolasvasilache commented 4 years ago

We could possibly unify the ABI for all external calls (if the FuncOP has no region, it is an external call) to receive unranked descriptor, but that seems restrictive. Users may want to control what type the external call receives.

We had this discussion internally and the implicitness wouldn't work for at least 2 reasons:

  1. the user control reason which you mention
  2. funcop with no body may also refer to another MLIR function in another file. This would force unranked memref everywhere which would be too intrusive.

Could we please make it super explicit both in LangRef.md where memref are defined and in ConversionToLLVMDialect.md that this is a type that is only used for lowering and avoiding the need for multi-versioning external library calls.

Rest looks good to me once review comments are addressed.

Thanks for doing this!

nicolasvasilache commented 4 years ago

Sorry one more rebase, there are conflicts now.. Let's land this first and do leave naming for bikeshedding.

River707 commented 4 years ago

Currently OOO, but unless there is a strong commitment to fix the naming immediately after submitting I am strongly against landing this as is.

nicolasvasilache commented 4 years ago

@River707 sorry this catches you while on PTO .. Did you see @bondhugula's rationale above and name proposition + my comment?

River707 commented 4 years ago

Currently OOO, but unless there is a strong commitment to fix the naming immediately after submitting I am strongly against landing this as is.

I am happy to fix the naming after, and for this PR I can use BaseMemRefType, MemRefType and UnrankedMemRefType

I’d rather not have this submitted until there is consensus on where we are going.

nicolasvasilache commented 4 years ago

I’d rather not have this submitted until there is consensus on where we are going.

Noted, let's table this until Monday when everyone is hands on deck.

Thanks!

nicolasvasilache commented 4 years ago

We had an internal discussion with @River707. We agreed it is fine to proceed with this CL as is provided more extensive documentation is added. The NFC renaming can be left for a future time if the following rationale changes.

In particular the following rationale points should be made and distributed in proper locations in the doc.

Point 1: Codegen is not expected to manipulate unranked memrefs in the future.

Codegen is concerned with generating loop nests and specialized instructions for high-performance, unranked memref is concerned with hiding the rank and thus, the number of enclosing loops required to iterate over the data. One possible path performing codegen with unranked memrefs is to cast into a ranked type (point 3 below). Another possible path is to emit a single while loop conditioned on a linear index and perform delinearization of the linear index to a dynamic array containing the (unranked) indices. While this is possible, it is expected to not be a good idea to perform this during codegen: the cost of the translations is expected to be prohibitive and optimizations at this level are not expected to be worthwhile. If expressiveness is the main concern, irrespective of performance, passing unranked memrefs to an external C++ library and implementing rank-agnostic logic there, is expected to be significantly simpler.

Point 2: Unranked memrefs usage limited to library interop.

This is the main objective of this PR, comments and suggestions have been made already, they should be extracted in proper docs.

Point 3: Unranked memrefs may be used as an interface in MLIR itself in the future.

Unranked memrefs may provide expressiveness gains in the future and help bridge the gap with unranked tensors. Unranked memrefs will still not expected to be exposed to codegen but one may query the rank of an unranked memref (a special op will be needed for this purpose) and perform a switch + cast to a ranked memref as a prerequisite to codegen.

@nmostafa please let use know if this makes sense and whether you could beef up the LangRef and doc in relevant places so we can push this forward.

Thank you!

nmostafa commented 4 years ago

Thanks @nicolasvasilache . I will incorporate those points in the doc. Couple of questions:

Another possible path is to emit a single while loop conditioned on a linear index and perform delinearization of the linear index to a dynamic array containing the (unranked) indices. Are you assuming the rank and dims can be extracted from the unranked memref ? (i.e. ability to compute the dynamic size)

What's the consensus on the class hierarchy ? Currently we have:

ShapedType --- BaseMemRefType --- MemRefType
                           |----------- UnRankedMemRefType
River707 commented 4 years ago

Thanks @nicolasvasilache . I will incorporate those points in the doc. Couple of questions:

Another possible path is to emit a single while loop conditioned on a linear index and perform delinearization of the linear index to a dynamic array containing the (unranked) indices. Are you assuming the rank and dims can be extracted from the unranked memref ? (i.e. ability to compute the dynamic size)

What's the consensus on the class hierarchy ? Currently we have:

ShapedType --- BaseMemRefType --- MemRefType
                          |----------- UnRankedMemRefType

That is fine for now.

nicolasvasilache commented 4 years ago

Are you assuming the rank and dims can be extracted from the unranked memref ? (i.e. ability to compute the dynamic size)

Yes, it would require this type of support. I have spelled out what would need to be done in this scenario so that the tradeoff is discussed. This is by no means an endorsement of going for this approach in MLIR :).

nmostafa commented 4 years ago

Yes, it would require this type of support. I have spelled out what would need to be done in this scenario so that the tradeoff is discussed. This is by no means an endorsement of going for this approach in MLIR :).

Given that this code-gen path is not feasible (yet), do you think it should be mentioned in the docs ?

nicolasvasilache commented 4 years ago

I'll let River weigh in on this.

I was seeing this as a part of a "rationale / design alternative considered" section. This is more a justification that it makes sense to not have a correspondence between tensor types and memref types.

On Mon, Dec 2, 2019 at 5:48 PM Nagy Mostafa notifications@github.com wrote:

Yes, it would require this type of support. I have spelled out what would need to be done in this scenario so that the tradeoff is discussed. This is by no means an endorsement of going for this approach in MLIR :).

Given that this code-gen path is not feasible (yet), do you think it should be mentioned in the docs ?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tensorflow/mlir/pull/261?email_source=notifications&email_token=ACNNU5GZBE6G4ZYZDHAMULDQWWGDXA5CNFSM4JQW7OC2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFWS7UY#issuecomment-560803795, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNNU5AQIGWJBE7YXTCI5ZDQWWGDXANCNFSM4JQW7OCQ .

-- N

nmostafa commented 4 years ago

I have update the docs. Please let me know if anything else is needed.

River707 commented 4 years ago

We can still mention it in the docs as it is part of the rationale(or atleast intended rationale) of the thing being added.

ftynse commented 4 years ago

Since River asked for more changes, please also run clang-format after you're done with those.

nmostafa commented 4 years ago

I see one fail, but have no access to the Details. Please let me know if anything is needed on my side. Thanks.

ftynse commented 4 years ago

clang-format still complains...

There are also breakages on integrating this.

nmostafa commented 4 years ago

Thanks, Alex. I don't any see any further clang-format changes on my end, using clang-format 10.0. What sort of breakages ?

nicolasvasilache commented 4 years ago

They are internal things ;) , giving a shot at addressing now.