NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

Plumb automatic schedulers into Python interface #1145

Open jacobhinkle opened 1 year ago

jacobhinkle commented 1 year ago

For autotuning, and for developing heuristics, it would be convenient to be able to schedule complicated fusions from the Python side. Currently, doing so requires us to write the whole schedule by hand from primitives such as split and parallelize. However, it would be great to be able to create in Python a surrogate of, say, a PointwiseParams or MatmulParams object and call schedulePointwise or scheduleMatmul with it. The key is that this would let us manipulate the heuristic parameters directly in order to explore different regimes of our automatic schedulers, without needing to reproduce the complicated logic in the schedule* functions.

Each param object inherits from HeuristicParams but adds a number of custom attributes, some of which are compound types. For example, MatmulParams contains, among other things, a MatMulTileOptions struct that contains multiple GemmTile structs, each holding a triple of ints. We could represent this with a dict in Python, but we would need to maintain a translation function for each type of heuristic parameter object.
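
To make that maintenance cost concrete, here is a rough sketch of what hand-written translation might look like for the tile options alone. The structs are simplified stand-ins for the real ones, and `TileDict`, `TileOptionsDict`, `gemmTileFromDict`, and `tileOptionsFromDict` are hypothetical names, not existing nvFuser API. It assumes C++17.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Simplified stand-ins for the structs named above; the real definitions
// carry more members and methods.
struct GemmTile {
  int m;
  int n;
  int k;
};

struct MatMulTileOptions {
  GemmTile cta_tile{128, 128, 32};
  GemmTile warp_tile{64, 64, 32};
  GemmTile instruction_tile{16, 8, 16};
};

// Hypothetical dictionary shapes that a Python binding might hand over.
using TileDict = std::map<std::string, int>;
using TileOptionsDict = std::map<std::string, TileDict>;

// One translation function per struct: this is the part that has to be
// written and kept in sync by hand.
GemmTile gemmTileFromDict(const TileDict& d) {
  // std::map::at throws std::out_of_range if a key is missing.
  return GemmTile{d.at("m"), d.at("n"), d.at("k")};
}

MatMulTileOptions tileOptionsFromDict(const TileOptionsDict& d) {
  MatMulTileOptions opts;  // starts from the defaults above
  for (const auto& [name, tile] : d) {
    if (name == "cta_tile") {
      opts.cta_tile = gemmTileFromDict(tile);
    } else if (name == "warp_tile") {
      opts.warp_tile = gemmTileFromDict(tile);
    } else if (name == "instruction_tile") {
      opts.instruction_tile = gemmTileFromDict(tile);
    } else {
      throw std::invalid_argument("unknown tile option: " + name);
    }
  }
  return opts;
}

// Usage: override one tile and keep the defaults for the others.
void exampleHandWritten() {
  TileOptionsDict d;
  d["cta_tile"] = TileDict{{"m", 256}, {"n", 128}, {"k", 64}};
  MatMulTileOptions opts = tileOptionsFromDict(d);
  assert(opts.cta_tile.m == 256);  // overridden by the dict
  assert(opts.warp_tile.m == 64);  // default kept
}
```

Every new member or new params type means another branch or another function of this kind.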

Instead we could try to devise a system where we could automatically convert from a nested dictionary of simple types. For example, each of the structs would declare names and types for their members using a macro that would generate the conversion code.

struct GemmTile : public FromDict {
  FROM_DICT_INT_ATTR(m);
  FROM_DICT_INT_ATTR(n);
  FROM_DICT_INT_ATTR(k);
  // rest of definition unmodified
};

struct MatMulTileOptions : public FromDict {
  FROM_DICT_CLASS_ATTR(GemmTile, cta_tile, 128, 128, 32);
  FROM_DICT_CLASS_ATTR(GemmTile, warp_tile, 64, 64, 32);
  FROM_DICT_CLASS_ATTR(GemmTile, instruction_tile, 16, 8, 16);
  // ...
};

// HeuristicParams would inherit from FromDict
struct MatmulParams : public HeuristicParams {
  // ...
  FROM_DICT_BOOL_ATTR(async_gmem_load_operands, false);
  FROM_DICT_CLASS_ATTR(MatMulTileOptions, tile_sizes);
  // ...
};

// Then we'd automatically generate a static method
// MatmulParams MatmulParams::fromDict(const std::unordered_map<std::string, std::variant<...>>& d);
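
As a partial answer to whether this is feasible, below is a minimal C++17 sketch of one way the macros could be made to work for flat bool and int attributes. Everything in it is an assumption for illustration: `Value`, `Dict`, `registerAttr`, `updateFromDict`, and the two-argument macro forms (name plus default) are hypothetical, and `MatmulParamsSketch` is a stand-in rather than the real MatmulParams. Nested attributes (the FROM_DICT_CLASS_ATTR case) and a static fromDict are left out.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical flat dictionary type; a real binding would likely need more
// alternatives (and nesting) than bool and int.
using Value = std::variant<bool, int>;
using Dict = std::map<std::string, Value>;

// Hypothetical base class: each attribute registers a setter keyed by name.
class FromDict {
 public:
  FromDict() = default;
  // Setters capture pointers into this object, so copying is disabled here.
  FromDict(const FromDict&) = delete;
  FromDict& operator=(const FromDict&) = delete;

  // Overwrite registered attributes with the entries found in d.
  void updateFromDict(const Dict& d) {
    for (const auto& [key, value] : d) {
      auto it = setters_.find(key);
      if (it == setters_.end()) {
        throw std::invalid_argument("unknown attribute: " + key);
      }
      it->second(value);
    }
  }

 protected:
  // Records how to set *member and returns the default used to initialize it.
  template <typename T>
  T registerAttr(const std::string& name, T* member, T default_value) {
    setters_[name] = [member](const Value& v) {
      *member = std::get<T>(v);  // throws std::bad_variant_access on wrong type
    };
    return default_value;
  }

 private:
  std::map<std::string, std::function<void(const Value&)>> setters_;
};

// Each macro declares the member and registers it under its own name.
#define FROM_DICT_INT_ATTR(name, default_value) \
  int name = registerAttr<int>(#name, &name, default_value)
#define FROM_DICT_BOOL_ATTR(name, default_value) \
  bool name = registerAttr<bool>(#name, &name, default_value)

struct GemmTile : public FromDict {
  FROM_DICT_INT_ATTR(m, 128);
  FROM_DICT_INT_ATTR(n, 128);
  FROM_DICT_INT_ATTR(k, 32);
};

struct MatmulParamsSketch : public FromDict {
  FROM_DICT_BOOL_ATTR(async_gmem_load_operands, false);
};

// Usage: keys present in the dict override defaults; others are untouched.
void exampleFromDict() {
  GemmTile tile;
  tile.updateFromDict(Dict{{"m", 64}, {"k", 16}});
  assert(tile.m == 64 && tile.n == 128 && tile.k == 16);
}
```

The main wart in this version is that the setters hold raw pointers to members, which is why copying is deleted; a version that returns params by value from a static fromDict would need a different registration scheme, for example pointers-to-member stored once per class.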

Plumbing these into the Python frontend would then be straightforward, and presumably this would also simplify serialization and deserialization (serde) of the heuristic info.

Is it possible to create such a class FromDict and associated macros? Is this pattern already implemented somewhere?

jacobhinkle commented 1 year ago

cc @kevinstephano

jacobhinkle commented 2 months ago

See also the more recent issue #2418

jacobhinkle commented 1 month ago

Fixed by #3106