tensorflow / mlir

"Multi-Level Intermediate Representation" Compiler Infrastructure
1.74k stars 260 forks source link

Configurable call-back method name for AllocOp and DeallocOp #55

Open nmostafa opened 5 years ago

nmostafa commented 5 years ago

Currently Alloc/Dealloc in standard dialect are hard-coded to call malloc/free. This change adds a string attribute to the operations to hold the callback name allowing use of custom allocators. Added overloaded builders to set the attribute, and getters.

If the attribute is missing, malloc/free are used by default. The call signature remains unchanged void* malloc(IndexType) and void free(void*)

ftynse commented 5 years ago

IMO, it feels like the name of the callback is the property of the Standard->LLVM lowering and we should not pollute the operation signature with it. We could parameterize the conversion pattern with the name of the function to call instead.

nmostafa commented 5 years ago

@ftynse, that means we have to define our own Std->LLVM lowering pass to configure the pattern, no ? Also, the same callback will have be used for all AllocOps in the code. With making it part of the Op, we can use different memory managers for different parts of the code. In addition, to specify the call-back in MLIR textual format.

What do you think ?

joker-eph commented 5 years ago

Aren't we getting in the realm of needing your own AllocOp to model your memory manager instead?

ftynse commented 5 years ago

Yes, having a different pattern likely means you need a different lowering pass. Or rather an option (potentially defaulted to malloc) on the existing pass that says which function to call / pattern to select. I would prefer a pass option to cluttering every single alloc op in every module with attributes.

Following up on Mehdi's point, I would consider implementing the conversion logic from "standard" allocations to "custom" allocations as a conversion in an MLIR pass anyway, as opposed to adding attributes and using them in the conversion-to-llvm pass. If an op sounds like too much trouble, consider declaring a function and calling it to get the memory.

nmostafa commented 5 years ago

Declaring our own Allocation Op and the needed lowering pass to LLVM just seemed like an overkill for the simple need to rename the callback function (I would agree if the change in semantics is more involved). Besides always linking to "malloc" in the existing pass seems too strict to me (what if your target is not CPU or doesn't have libc). I am happy to add a flag to the existing pass instead, if that sounds good to everyone.

Regarding having our own call-backs that will currently not work unless we are allowed to instantiate a MemRef from a raw pointer (otherwise we will need multiple callbacks return types). I am aware that the RFC for MemRef views might solve this. We will likely use that approach for Dynamic MemRefs allocations with different memory managers ..etc.

joker-eph commented 5 years ago

I'm not against making it configurable, but this level of configuration in the standard lowering pass seems tricky to plumb through the API in a clean way (You'd need to start having some sort of extensible LoweringPolicy object of some sort).

bondhugula commented 5 years ago

Aren't we getting in the realm of needing your own AllocOp to model your memory manager instead?

Adding new versions of alloc op's would appear to make the logic at passes/utilities/patterns messy. For eg., SimplifyDeadAlloc and SimplifyDeadDealloc patterns won't work out of the box. Similarly, other logic that tracks provenance or escape's will also have to be able to treat new AllocOp's in the same way as the standard AllocOp. Instead, adding an attribute to the AllocOp appears appealing at first sight. I had similar thoughts on Issue #65 , but that of adding a "stack" attribute. This could be looked at during conversion to the LLVM dialect to generate llvm.alloca instead of malloc without having to worry about lifetimes. In addition, a pass/utility could convert from heap to stack (by attaching the attribute to AllocOp) using additional analysis/checks.

joker-eph commented 5 years ago

Adding new versions of alloc op's would appear to make the logic at passes/utilities/patterns messy. For eg., SimplifyDeadAlloc and SimplifyDeadDealloc patterns won't work out of the box.

This is just because we haven't rolled-out operation interfaces at the moment: we would want MLIR dialects to be able to have their own allocator and expose interfaces that these passes can operate on instead of having passes hardcoded on knowing the specific operation implementation.

joker-eph commented 5 years ago

but that of adding a "stack" attribute. This could be looked at during conversion to the LLVM dialect to generate llvm.alloca instead of malloc without having to worry about lifetimes.

It seems that this is offloading the lifetime question to any transformation that would manipulate the alloc/dealloc pair in the original function. Would such an alloc op with the "stack" attribute still have a dealloc by the way? At this point the semantic of the alloc is so different that I don't see why you wouldn't just introduce a new operation?

nmostafa commented 5 years ago

The way I see it, there are two levels of semantics for an AllocOp. For std dialect, it just says get me some memory for a MemRef. The lower level interpretation of the semantics is up to the lowering pass(es) (whether to call malloc, alloca, or any other scheme )

For StdToLLVM lowering I don't see any harm in making the semantics a bit configurable in terms of what call-back to use (with same calling interface). This still can be achieved with Op attribute that has special meaning to the lowering pass. It shouldn't be part of the op construction API (i.e. no builders take an attribute).

Having my own alloc op or lowering pattern for this case will just be unneeded code duplication (emit code to compute memref byte size + call-back generation). For totally different semantics, having a different pattern matcher or different op makes sense.

bondhugula commented 5 years ago

but that of adding a "stack" attribute. This could be looked at during conversion to the LLVM dialect to generate llvm.alloca instead of malloc without having to worry about lifetimes.

It seems that this is offloading the lifetime question to any transformation that would manipulate the alloc/dealloc pair in the original function. Would such an alloc op with the "stack" attribute still have a dealloc by the way?

Yes, such an alloc op will have a dealloc, to mark the end of all its uses; it's just that the lowering will swallow it.

At this point the semantic of the alloc is so different that I don't see why you wouldn't just introduce a new operation?

I don't have a good answer here. A new op is one way to do it (the issue with analyses/transforms is well-handled with the operation interfaces proposal), but an attribute would lead to even less additional code. It all depends on where and how the attribute gets looked.

joker-eph commented 5 years ago

but that of adding a "stack" attribute. This could be looked at during conversion to the LLVM dialect to generate llvm.alloca instead of malloc without having to worry about lifetimes.

It seems that this is offloading the lifetime question to any transformation that would manipulate the alloc/dealloc pair in the original function. Would such an alloc op with the "stack" attribute still have a dealloc by the way?

Yes, such an alloc op will have a dealloc, to mark the end of all its uses; it's just that the lowering will swallow it.

My problem with this is that this is breaking the contract (or adding a new contract) between alloc and dealloc: outlining the block with the alloc but not the dealloc would not be legal anymore for instance.

bondhugula commented 5 years ago

but that of adding a "stack" attribute. This could be looked at during conversion to the LLVM dialect to generate llvm.alloca instead of malloc without having to worry about lifetimes.

It seems that this is offloading the lifetime question to any transformation that would manipulate the alloc/dealloc pair in the original function. Would such an alloc op with the "stack" attribute still have a dealloc by the way?

Yes, such an alloc op will have a dealloc, to mark the end of all its uses; it's just that the lowering will swallow it.

My problem with this is that this is breaking the contract (or adding a new contract) between alloc and dealloc: outlining the block with the alloc but not the dealloc would not be legal anymore for instance.

Such an outlining would have to look at the alloc attribute among the checks it does. But even with a new alloc op, you'll have to anyway perform checks like these: for eg. if you are outlining a block with an alloca op that has some of its result's uses outside of the block. And by 'block' if you meant a region's block, the dealloc would also be in the same block as the alloc, and the property that the former dominates the latter would always be maintained (for alloc's with the stack attribute).

nmostafa commented 5 years ago

Hi Mehdi, What is the call on this one ?

I still believe that having some level of configuration for AllocOp is the way to go, and to serve as a alloca (per Uday's comments) , it would have to be at the instruction level.

joker-eph commented 5 years ago

Hi Nagy,

I rather have a community based discussion than having someone (or a single group/company) calling the shots. We about bringing it up for discussion with a wider audience than this pull-request? We can put it on the agenda for the open-meeting on Thursday, but it would even be nice to summarize the problem and the possible options forward in a RFC email on the mailing-list. Would you be up for writing this?

Thanks.

nmostafa commented 5 years ago

Yes, sure. I will put together a summary of the problem and proposed solutions. Thanks for your input.

joker-eph commented 5 years ago

Did you send an RFC to the mailing list?

nmostafa commented 5 years ago

Just did. Sorry for the delay. https://groups.google.com/a/tensorflow.org/forum/?nomobile=true#!topic/mlir/MO0CNmMOa0M