nod-ai / SHARK-Turbine

Unified compiler/runtime for interfacing with PyTorch Dynamo.
Apache License 2.0

[model] Write and shard SDXL into 8 partitions #700

Open antiagainst opened 1 month ago

sogartar commented 1 week ago

I have added ResNet block sharding.

Multi-device support in IREE is still in progress, so it may make more sense to keep sharding the rest of the model in sharktank instead of trying to run the sharded ResNet block in IREE.

One important piece of functionality is still missing: the ability to select a sharding algorithm per op. This is analogous to how model parameters can be injected by name when constructing the model; we need a similar mechanism here, with nested naming for operations and layers/blocks. Without it, when an op receives incompatible argument shardings that require resharding, we may choose a bad algorithm. That choice influences the sharding of the op's output and can have downstream performance effects, and we have no way to control it now.
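
To make the idea concrete, here is a minimal sketch of what such a nested-name lookup could look like. All names in it (`ShardingConfig`, `OpShardingConfig`, `resharding_algorithm`, and the example op path) are hypothetical illustrations, not existing sharktank APIs:

```python
# Hypothetical sketch: select a resharding algorithm per op via nested
# names, mirroring how named parameters are injected at construction.
from dataclasses import dataclass, field

@dataclass
class OpShardingConfig:
    # Which algorithm to use when an op's argument shardings conflict
    # and a reshard is required (names are placeholders).
    resharding_algorithm: str = "all-gather"

@dataclass
class ShardingConfig:
    # Maps nested op/layer names to per-op overrides.
    overrides: dict[str, OpShardingConfig] = field(default_factory=dict)

    def for_op(self, name: str) -> OpShardingConfig:
        # Fall back to a default when no override is registered.
        return self.overrides.get(name, OpShardingConfig())

# Usage: force an all-to-all reshard for one conv inside a ResNet block
# so its output sharding matches what downstream ops expect.
config = ShardingConfig(
    overrides={
        "unet.down_blocks.0.resnets.1.conv1": OpShardingConfig(
            resharding_algorithm="all-to-all"
        )
    }
)
assert config.for_op(
    "unet.down_blocks.0.resnets.1.conv1"
).resharding_algorithm == "all-to-all"
```

The point of the nested names is that an override can target exactly one op deep inside a block, the same way a dotted parameter name targets one tensor, so the choice of resharding algorithm stops being implicit.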