inducer / pytato

Lazily evaluated arrays in Python

Tagging ImplStored to a component of DictOfNamedArrays leads to redundant allocation #415

Closed kaushikcfd closed 1 year ago

kaushikcfd commented 1 year ago

Consider the following snippet:

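import pytato as pt

# Placeholder input array of shape (10, 4).
x = pt.make_placeholder("x", (10, 4), "float64")
y = 2*x
# Ask pytato to materialize y in memory.
y = y.tagged(pt.tags.ImplStored())
# Print the generated loopy kernel.
print(pt.generate_loopy(y).program)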

which generates the code:

---------------------------------------------------------------------------
TEMPORARIES:
_pt_temp: type: np:dtype('float64'), shape: (10, 4), dim_tags: (N1:stride:4, N0:stride:1) aspace: global
---------------------------------------------------------------------------
INSTRUCTIONS:
  for _pt_temp_dim1, _pt_temp_dim0
↱     _pt_temp[_pt_temp_dim0, _pt_temp_dim1] = 2*x[_pt_temp_dim0, _pt_temp_dim1]  {id=_pt_temp_store}
│ end _pt_temp_dim1, _pt_temp_dim0
│ for _pt_out_dim0, _pt_out_dim1
└     _pt_out[_pt_out_dim0, _pt_out_dim1] = _pt_temp[_pt_out_dim0, _pt_out_dim1]  {id=_pt_out_store}
  end _pt_out_dim0, _pt_out_dim1
---------------------------------------------------------------------------

Notice that "_pt_temp" serves no purpose: it is written once and then copied element by element into "_pt_out".

Is this the user's fault or pytato's?

inducer commented 1 year ago

I'd say this is pytato's fault. In a sense, there are valid reasons for memory to be allocated for both of these arrays: one is tagged ImplStored, and the other is the result, wrapped in a NamedArray. So (I think) I understand what pytato's guts are doing here. At the same time, it's silly to allocate multiple buffers for exactly the same data and to expend effort copying that data over. I think trying to avoid these needless copies would be a good thing.
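
To make the idea of avoiding the copy concrete, here is a minimal sketch in plain Python of a buffer planner that skips the temporary when a stored array is itself an output. Everything in it is hypothetical: `Node`, `plan_buffers`, the `stored` flag, and the `_pt_temp_` naming only stand in for the `ImplStored` tag and pytato's code generator. It is not pytato's API, and not necessarily how the issue was resolved. It also assumes each stored node appears under at most one output name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical expression node (not a pytato class)."""
    name: str
    stored: bool = False  # stands in for the ImplStored tag


def plan_buffers(outputs, elide_copies):
    """Plan allocations for a dict mapping output name -> Node.

    Returns (buffers, copies): the buffer names to allocate and the
    (source, destination) copies needed to fill the outputs.
    """
    buffers = []
    copies = []
    for out_name, node in outputs.items():
        if node.stored and not elide_copies:
            # Naive plan: one buffer for the stored node, one for the
            # output, plus a copy between them (the behavior reported above).
            temp_name = f"_pt_temp_{node.name}"
            buffers.extend([temp_name, out_name])
            copies.append((temp_name, out_name))
        else:
            # Either the node is not stored, or the output buffer already
            # satisfies the "stored" request, so no temporary is needed.
            buffers.append(out_name)
    return buffers, copies


y = Node("y", stored=True)
naive_plan = plan_buffers({"_pt_out": y}, elide_copies=False)
elided_plan = plan_buffers({"_pt_out": y}, elide_copies=True)
print(naive_plan)
print(elided_plan)
```

In the naive plan the stored node and the output each get a buffer and one copy connects them; with elision enabled the output buffer doubles as the stored array, so only a single allocation remains.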