mcabbott / Tullio.jl

MIT License

Add Finch.jl backend #171

Open MasonProtter opened 1 year ago

MasonProtter commented 1 year ago

cc @willow-ahrens, Finch.jl is a tensor compiler for sparse multidimensional arrays, so it'd be cool if we could bring it under the Tullio.jl umbrella!

willow-ahrens commented 1 year ago

Yes, absolutely! A few questions/comments.

  1. From my initial read, the Tullio macro expresses a (possibly compound) pointwise expression, to be eventually surrounded by nested loops. Is that correct?
  2. Is there an intermediate representation I can use after parsing but before lowering? Do most Tullio backends just use the input syntax directly?
  3. Some of the modifications to indices (e.g. mod) don't currently have Finch analogues, so that will have to wait till we implement those in Finch.
  4. We may want some sparsity-specific preprocessing to make it easier to use. For example, not all sparse formats support efficient random access, so we may want to automatically transpose and reformat some inputs before running the kernel; an example of this is described here.

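For context on point 3, here is a plain-Julia sketch of what a `mod`-wrapped index means (no Tullio required; the offset conventions here are illustrative, not necessarily Tullio's exact ones):

```julia
# Roughly the semantics of a mod-wrapped index such as x[mod(i+j)]:
# the out-of-range index is wrapped back into axes(x, 1), which turns
# the reduction into a circular correlation.
function circ(x, k)
    n = length(x)
    y = zeros(promote_type(eltype(x), eltype(k)), n)
    for i in 1:n, j in eachindex(k)
        y[i] += x[mod1(i + j - 1, n)] * k[j]  # 1-based modular wrap
    end
    return y
end
```

With `k = [1.0, 0.0]` this returns `x` unchanged; with `k = [0.0, 1.0]` it returns `x` rotated by one position.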
mcabbott commented 11 months ago

Sorry, I thought I replied to this, but I must not have pasted what I wrote. Maybe that indicates how active this project is: I aim to keep it working, but don't have time for much re-thinking.

> From my initial read, the Tullio macro expresses a (possibly compound) pointwise expression, to be eventually surrounded by nested loops. Is that correct?

Yes, that's exactly right. In particular, this means it deviates from Einstein notation: something like `z[i] := x[i,j]*x[i,j] + y[i]` has just one loop nest, and `y` is inside the sum.
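A minimal plain-Julia sketch of that lowering (no Tullio required; the function name is made up for illustration):

```julia
# Roughly what `z[i] := x[i,j]*x[i,j] + y[i]` lowers to: one loop nest,
# with `y[i]` added on every j-iteration, i.e. inside the reduction.
function z_one_nest(x, y)
    z = zeros(eltype(x), size(x, 1))
    for i in axes(x, 1)
        acc = zero(eltype(x))
        for j in axes(x, 2)
            acc += x[i, j] * x[i, j] + y[i]  # y is inside the sum over j
        end
        z[i] = acc
    end
    return z
end
```

So each `z[i]` accumulates `size(x, 2)` copies of `y[i]`, which is exactly the deviation from strict Einstein notation described above.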

> Is there an intermediate representation I can use after parsing but before lowering? Do most Tullio backends just use the input syntax directly?

Unfortunately this is not very modular. The three backends are Base, LoopVectorization, and KernelAbstractions; different methods of the `act!` function are generated for each, all at once. Everything that is parsed/resolved ends up in `store::DotDict`, which is then used to write the code, and `@tullio verbose=2` prints it out. Code for the gradients is generated at the same time, from separate source code.

Adding a 4th backend which similarly builds loops with `@finch` in front would probably not be very hard. `storage_type` would have to recognise Finch's sparse array types, in order to dispatch to them.
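A rough sketch of the kind of dispatch involved (simplified for illustration; this is not Tullio's actual internals, and `pick_backend` is a hypothetical name):

```julia
# Reduce the argument array types to one joint type, then dispatch the
# backend choice on that type. A Finch backend would add a method
# recognising Finch's sparse tensor types.
storage_type(A::AbstractArray) = typeof(A)
storage_type(A::AbstractArray, Bs::AbstractArray...) =
    promote_type(typeof(A), storage_type(Bs...))

pick_backend(::Type{<:Array}) = :base
pick_backend(::Type) = :fallback
# pick_backend(::Type{<:Finch.Tensor}) = :finch  # what a new backend would add
```

Here `promote_type` falls back to `typejoin`, so mixed `Vector`/`Matrix` arguments still reduce to a joint `Array` type that selects the `:base` path.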

willow-ahrens commented 11 months ago

> Adding a 4th backend which similarly builds loops with `@finch` in front would probably not be very hard. `storage_type` would have to recognise Finch's sparse array types, to dispatch to them.

I agree, especially for the simpler expressions. Finch has features similar to `storage_type` in https://github.com/willow-ahrens/Finch.jl/blob/1b0539cc04556d7a92871d5af9c2221cecc099e1/src/base/broadcast.jl#L43 and features similar to https://github.com/mcabbott/Tullio.jl#notation in https://github.com/willow-ahrens/Finch.jl/blob/1b0539cc04556d7a92871d5af9c2221cecc099e1/src/transforms/wrapperize.jl#L10. The challenge will be implementing features where the semantics of the two systems disagree. I won't have time to work on this till January, but perhaps we can start with a simple prototype that implements most of the features, and add the remaining ones later.

willow-ahrens commented 5 months ago

We're slowly approaching the point in Finch's development where this issue can begin to move forward. We currently implement an `@einsum` macro (with opportunities for fused execution with other array operations):

https://github.com/willow-ahrens/Finch.jl/pull/454
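
For the curious, a hedged sketch of what that `@einsum` syntax looks like (based on the linked PR; I have not run this here, and the exact constructor and macro forms may differ between Finch versions):

```julia
using Finch

# Two random sparse 5x5 tensors in a dense-of-sparse-list format.
A = Tensor(Dense(SparseList(Element(0.0))), fsprand(5, 5, 0.3))
B = Tensor(Dense(SparseList(Element(0.0))), fsprand(5, 5, 0.3))

# Einstein-summation-style matrix multiply; `+=` declares the reduction
# over the repeated index k.
@einsum C[i, j] += A[i, k] * B[k, j]
```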